Personal data stores are the new grey?

If I had some money for every person who sent me details of Tim Berners-Lee’s Solid, I would have enough to buy a cheap flight to somewhere in Europe on a budget airline.

Solid is meant to change “the current model where users have to hand over personal data to digital giants in exchange for perceived value. As we’ve all discovered, this hasn’t been in our best interests. Solid is how we evolve the web in order to restore balance – by giving every one of us complete control over data, personal or not, in a revolutionary way.”

Solid isn’t a radical new program. Instead, “Solid is a set of modular specifications, which build on, and extend the founding technology of the world wide web (HTTP, REST, HTML). They are 100% backwards compatible with the existing web.”

The main reason people seem to be sending it my way is another open source project I’m involved in, called Databox.

For me, Solid is a personal data store; it’s like a secure vault for your data. This is good but, like two-factor authentication over SMS, not as secure as other ways. Put all your personal data in one place and it becomes a single target for those who want everything at once. Think about how many times you have seen leaks of databases which contain credit cards, numbers, emails, names, etc. It’s the eggs (data) in one basket problem.

This came up at MyData 2018; there was quite a lot of discussion about it throughout the conference, and it was touched on in Mikko Hypponen’s talk.

Keeping the data in one place is just one aspect; others are more about the value proposition to people and, technically, how verified claims work, as expressed in “How solid is Tim’s plan to redecentralize the web”.

Many people have asked how Solid and Databox compare, and I would certainly say Databox (regardless of its name) isn’t a place to hold all your personal data. You could use it like that, but it’s more of a privacy-aware data processing platform/unit. I remember the first time I heard about Vendor Relationship Management (VRM); it was clear to me how powerful this could be for many things. But then again, I also identified data portability as something essential while most people just didn’t see the point.

Everything will live or die not just by developer support, privacy controls, security and cleverness, but by user demand… and it feels like personal data stores are still a while off in most people’s imagination.

Maybe once enough people personally experience the rough side of personal data breaches, that may change?

For example, today I received an email from Have I Been Pwned saying…

You’re one of 125,929,660 people pwned in the Apollo data breach.

In July 2018, the sales engagement startup Apollo left a database containing billions of data points publicly exposed without a password. The data was discovered by security researcher Vinny Troia who subsequently sent a subset of the data containing 126 million unique email addresses to Have I Been Pwned. The data left exposed by Apollo was used in their “revenue acceleration platform” and included personal information such as names and email addresses as well as professional information including places of employment, the roles people hold and where they’re located. Apollo stressed that the exposed data did not include sensitive information such as passwords, social security numbers or financial data.

Until this is an everyday occurrence, most people will just carry on and not care? Maybe there’s even a point where it becomes part of the furniture of the web, like the new grey?

Demand your data from Google and Facebook

Tim Dobson sent me this over Twitter for my consideration:

Tim Berners-Lee says demand your data from Google and Facebook

World wide web inventor says personal data held online could be used to usher in new era of personalised services


It seems people have forgotten the work which took place during the late 00s. I say this as one of the founders of the Data Portability group (which still exists, by the way). The group was made up of quite a few people from all over the world, and we successfully convinced the likes of Yahoo, Plaxo, Myspace, Google, Facebook, etc. to take data portability seriously.

The turning point was when Robert Scoble tried to take his contacts out of Facebook and into Plaxo. Interesting to see Tim Berners-Lee finally getting the point.

Although, to be fair, he goes much further, thinking about a standard way to export data.

Right now both Google and Facebook have export features, and each one is very different in structure. I personally export my data from them every month, along with my WordPress and others. I find Google’s Data Liberation centre the best because it gives you control across the board, but then again Google does have more data on me. But right now it’s all just for backup purposes.

The next step, which Tim hints at, is the ability to transform and import the data in a standardised way. To be honest, it’s something we (the Data Portability group) talked and thought about, but we were maybe a little too early. Now seems about right to think about the interchange of data more than ever.
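As a rough sketch of what that transform step could look like, here is a small Python example that maps contact records from two differently shaped exports into one common structure and merges them. Everything in it is an assumption for illustration: the two input shapes, the field names and the `Contact` type are hypothetical and are not the real Google or Facebook export formats.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Contact:
    """Hypothetical common schema that every export is mapped into."""
    name: str
    email: str
    source: str


def from_service_a(record: dict) -> Contact:
    # Assumed shape: {"full_name": "...", "emails": ["..."]}
    emails = record.get("emails") or [""]
    return Contact(name=record["full_name"], email=emails[0].lower(), source="service_a")


def from_service_b(record: dict) -> Contact:
    # Assumed shape: {"first": "...", "last": "...", "contact": {"email": "..."}}
    name = f'{record["first"]} {record["last"]}'.strip()
    return Contact(name=name, email=record["contact"]["email"].lower(), source="service_b")


def merge(contacts: list) -> list:
    # De-duplicate on the lower-cased email address; the first record seen wins.
    merged = {}
    for contact in contacts:
        merged.setdefault(contact.email, contact)
    return list(merged.values())


# Made-up sample exports, one per hypothetical service.
export_a = [{"full_name": "Ada Example", "emails": ["Ada@example.org"]}]
export_b = [
    {"first": "Ada", "last": "Example", "contact": {"email": "ada@example.org"}},
    {"first": "Bob", "last": "Sample", "contact": {"email": "bob@example.org"}},
]

contacts = merge(
    [from_service_a(r) for r in export_a] + [from_service_b(r) for r in export_b]
)
print(json.dumps([asdict(c) for c in contacts], indent=2))
```

The point is that each service needs its own small adapter, while everything after the adapters works on a single shape. A real standard would replace the made-up `Contact` schema here.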

There has always been space for startups to be brokers and transformers of the data. Someone like could make a killing in this space, especially if they start charging for use of their pipes (something I suggested while doing the XML pipeline stuff). It could make a nice sustainable business.