Dataware

Simply locking away all our personal data, and so removing our ability to socialise it, is of little use. Dataware is the infrastructure we are building to support an ecosystem of data sources and processors that sell services which process your data without sacrificing your privacy.

One of the great drivers behind social networking has been the ecosystem of data processors that aggregate data and provide services such as recommendations, location searches, or messaging. The big drawback is that users have to divulge more of their personal data to the third party than is necessary, because it is difficult to distinguish what is actually needed.

Dataware approaches this problem from another angle: user data is treated as immovable by default, and the third-party applications are instead granted capabilities to run against the data wherever it is stored. Decisions are taken by the user, aided by software, about which applications are acceptable based on what data they will process, what results they will produce and where those results will be sent.
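The acceptance decision described above can be pictured as a check over three things an application declares: what data it will process, what results it will produce, and where those results will go. The following is a minimal sketch under stated assumptions, not Dataware's actual interface: `ProcessingRequest`, `UserPolicy`, `is_acceptable` and the category names are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingRequest:
    """What an application declares before it may run (hypothetical shape)."""
    application: str
    inputs: frozenset       # categories of data the code will process
    outputs: frozenset      # kinds of result it will produce
    destination: str        # where those results will be sent


@dataclass(frozen=True)
class UserPolicy:
    """The user's preferences, as the aiding software might record them (assumed)."""
    allowed_inputs: frozenset
    allowed_outputs: frozenset
    trusted_destinations: frozenset


def is_acceptable(request: ProcessingRequest, policy: UserPolicy) -> bool:
    """Accept only if every declared input, output and the destination are permitted."""
    return (
        request.inputs <= policy.allowed_inputs
        and request.outputs <= policy.allowed_outputs
        and request.destination in policy.trusted_destinations
    )


policy = UserPolicy(
    allowed_inputs=frozenset({"purchases", "location"}),
    allowed_outputs=frozenset({"recommendations"}),
    trusted_destinations=frozenset({"user-device"}),
)

recommender = ProcessingRequest(
    application="recommender",
    inputs=frozenset({"purchases"}),
    outputs=frozenset({"recommendations"}),
    destination="user-device",
)

exporter = ProcessingRequest(
    application="exporter",
    inputs=frozenset({"purchases"}),
    outputs=frozenset({"raw-records"}),
    destination="third-party-server",
)
```

In this sketch the data itself never appears in the request: the application only describes what it intends to do, and the decision is made before any code runs against the data.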

Technology

Creating this infrastructure while maintaining the ease of use that centralised services currently provide is a challenge. To meet it, we are building on existing standards such as XMPP and emerging technology such as Apache Wave.

We identify three principals in the Dataware architecture: sources, processors and the user as represented by their catalog. In combination, a user’s sources and catalog can be considered one instantiation of their personal container.

Sources are organizations that hold personal data, ranging from banks to retailers to health professionals to your own computers and smartphones.

Processors are organizations that wish to process your personal data to some end, whether via web services, downloadable applications, or some combination.

Your catalog maintains metadata about your data sources and is responsible for applying policies on your behalf concerning who can access what data, to what end, and at what time. The catalog is also responsible for generating and managing the capabilities that delegate access to your personal data.
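To make the catalog's role concrete, here is a small sketch of a catalog that records source metadata, applies rules about who may access which source, for what purpose and for how long, and issues time-limited capabilities. Everything in it is an assumption made for illustration: the class names (`Catalog`, `Rule`, `Capability`), the token format and the expiry mechanism are hypothetical and are not taken from the Dataware design.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Capability:
    """An unguessable token delegating limited access (hypothetical shape)."""
    token: str
    processor: str
    source: str
    purpose: str
    expires_at: float   # seconds, on the same clock as the `now` arguments below


@dataclass(frozen=True)
class Rule:
    """One policy entry: who, what data, to what end, and for how long."""
    processor: str
    source: str
    purpose: str
    max_lifetime: float  # seconds


@dataclass
class Catalog:
    sources: dict = field(default_factory=dict)  # source name -> metadata
    rules: list = field(default_factory=list)
    issued: dict = field(default_factory=dict)   # token -> Capability

    def register_source(self, name: str, description: str) -> None:
        """Record metadata about a source; the data itself stays at the source."""
        self.sources[name] = description

    def grant(self, processor: str, source: str, purpose: str,
              now: float) -> Optional[Capability]:
        """Issue a capability if a rule permits the request, otherwise None."""
        if source not in self.sources:
            return None
        for rule in self.rules:
            if (rule.processor, rule.source, rule.purpose) == (processor, source, purpose):
                cap = Capability(secrets.token_urlsafe(16), processor, source,
                                 purpose, now + rule.max_lifetime)
                self.issued[cap.token] = cap
                return cap
        return None

    def check(self, token: str, source: str, now: float) -> bool:
        """Would a source honour this token right now?"""
        cap = self.issued.get(token)
        return cap is not None and cap.source == source and now < cap.expires_at

    def revoke(self, token: str) -> None:
        """Managing capabilities includes withdrawing them."""
        self.issued.pop(token, None)
```

A processor would ask the catalog for a capability and present the token to the source; in this sketch the source then calls `check` before letting the processor's code run. Passing `now` explicitly keeps the example deterministic; a real deployment would presumably use a trusted clock.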