Can explicit big data replace implicit chemistry?

20130101 Experimenting with a Lott's Chemistry Set c.1956

I’m happy to say the BBC News business section has a piece, which I found via Zoe: Is big data dating the key to long-lasting romance? Of course I have many thoughts about this whole piece, including the reminder that I still need to read Love in the Time of Algorithms.

Dating agencies like OKCupid, Match.com – which acquired OKCupid in 2011 for $50m (£30m) – eHarmony and many others, amass this data by making users answer questions about themselves when they sign up.

Some agencies ask as many as 400 questions, and the answers are fed in to large data repositories. Match.com estimates that it has more than 70 terabytes (70,000 gigabytes) of data about its customers.

Applying big data analytics to these treasure troves of information is helping the agencies provide better matches for their customers. And more satisfied customers mean bigger profits.

US internet dating revenues top $2bn (£1.2bn) annually, according to research company IBISWorld. Just under one in 10 of all American adults have tried it.

Just look at those numbers! £1.2bn a year and 70 terabytes of data, plus it’s growing all the time! You can just imagine the shareholders hoovering up the profits… However, this is all explicit data, stuff you’ve got to type in. Stuff that people tell porkies about, especially when having to fill in 400 questions…!

Dr Zhao’s algorithm can then suggest potential partners in the same way websites like Amazon or Netflix recommend products or movies, based on the behaviour of other customers who have bought the same products, or enjoyed the same films.

The Facebook angle is good and recognised by the likes of Tinder and Grindr. Collaborative filtering of people’s implicit actions is good, but it’s still not the missing element, aka chemistry.
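For what it’s worth, the Amazon/Netflix-style recommendation described in the quote can be sketched in a few lines. This is only a minimal illustration of user-based collaborative filtering, not Dr Zhao’s actual algorithm (the article doesn’t give it). The `likes` data, the profile names and the `similarity` and `recommend` functions are all made up for the example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical implicit signals: which profiles each user has "liked".
likes = {
    "ann": {"p1", "p2", "p3"},
    "bob": {"p1", "p2", "p4"},
    "cat": {"p5"},
}


def similarity(a, b):
    """Cosine similarity between two sets of liked profiles."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, likes, top_n=3):
    """Rank unseen profiles by how similar the users who liked them are."""
    seen = likes[user]
    scores = defaultdict(float)
    for other, items in likes.items():
        if other == user:
            continue
        sim = similarity(seen, items)
        if sim == 0.0:
            continue  # nothing in common, so no evidence either way
        for item in items - seen:
            scores[item] += sim
    # Highest score first; ties broken by name so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]
```

Here `recommend("ann", likes)` suggests `p4`, because Bob liked two of the same profiles as Ann and also liked `p4`. Notice it only knows what people did and nothing about why, which is rather the point.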

We already know there is something to the theory that opposites attract. How does this work when your algorithm is based on matching? You almost need an inverse of that, but you need to understand human needs and wants, and that’s not as simple as copying what we do. It’s the whole “don’t do what I say and don’t do what I do” problem. Imagine somewhere someone is looking at this thing in a totally different way, via a different lens. Because frankly I think all the explicit and implicit data in the world won’t describe why people get together. It looks to be unquantifiable, and that’s quite surprising coming from someone like me.

Decentralised networking is hard, no really?

Sydney, January 2009

Straight out of the “No Sh*t Sherlock…” book….

Although I think it’s amazing what developers do, I can imagine how hard it must be to write decent decentralised software. The Diaspora guys spell out how difficult it is… which Adwale likes to make sure I and others fully understand.

  • If you build a decentralized application, you actually need to ship software. You need to package, test, create installers, test on a variety of platforms, write defensive code to work around misconfigurations your customers are likely to create, etc. For a centralized website, you can often edit files in place on the production server.
    Result: decentralized is 10x harder at least.
  • Somebody somewhere will run every single version of your app that you ever shipped. It will be badly out of date, full of security holes (you fixed years ago), outmoded graphics etc. It will cost you additional support, and your brand will suffer. Almost nobody upgrades to the latest and greatest within a life time it seems.
    Result: decentralized is less functional, less pretty, and less secure.
  • Decentralized software is much harder to monetize. You can’t run ads on somebody else’s installation. You can’t data mine your users (because most of them aren’t in a place that you have access to, it’s somebody else’s installation). You can’t do cross-promotions and referrals etc. You can charge those people who install your software, but there’s a reason most websites are free: much better business.
    Result: decentralized produces less money for you, so you have less investment dollars at your disposal.
  • Database migrations and the like for decentralized apps have to be fully productized, because they will be run by somebody else who does not know what to do when something fails 15 minutes into an ALTER TABLE command.
    Result: decentralized is 10x harder at least.
  • Same thing for performance optimizations and the like: it’s much easier to optimize your code for your own server farm than trying to help Joe remotely whose installation and servers you don’t have access to.
    Result: decentralized is slower, more expensive, and harder.
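To make the migration point in the list a bit more concrete, here is a rough sketch of what “fully productized” might mean in practice: each migration runs in its own transaction, so a failure halfway through leaves the database as it was. It assumes SQLite (where schema changes can be rolled back) and uses a made-up `MIGRATIONS` table and `migrate` function; it is not how Diaspora or anyone else actually does it.

```python
import sqlite3

# Hypothetical migrations, keyed by the schema version they produce.
MIGRATIONS = {
    1: ["CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)"],
    2: ["ALTER TABLE posts ADD COLUMN author TEXT"],
}


def schema_version(conn):
    """Read the version number SQLite stores in the database header."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, migrations=MIGRATIONS):
    """Apply pending migrations in order; stop and roll back on failure."""
    conn.isolation_level = None  # we issue BEGIN/COMMIT ourselves
    for version in sorted(migrations):
        if version <= schema_version(conn):
            continue  # already applied on this installation
        try:
            conn.execute("BEGIN")
            for statement in migrations[version]:
                conn.execute(statement)
            # PRAGMA takes no bound parameters, so format the integer in.
            conn.execute(f"PRAGMA user_version = {version:d}")
            conn.execute("COMMIT")
        except sqlite3.Error:
            if conn.in_transaction:
                conn.execute("ROLLBACK")
            break  # leave the database at the last good version
    return schema_version(conn)
```

Calling `migrate(sqlite3.connect("app.db"))` (the filename is just a placeholder) returns the version the database ended up at, so whoever is running the installation at least gets a clear answer about where things stopped.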

Frankly, although I take the points, if you want to stand out in a clearly overcrowded field, one which has a major elephant using up all the space, you need to think differently (to quote someone we all know too well).

This means doing the difficult things which no one understands and owning the platform!

Your business model should/could be charging other developers to build and be creative on top of your platform. App.net have got the right idea: charge the developers, who then create the experiences. Your focus should be on managing the platform and supporting their creativity. Anything else is greed and/or lack of focus.

What do I mean by creativity? Think about Tweetdeck.

Tweetdeck innovated on top of the Twitter platform, and in the end the platform, Twitter, bought them (stupid move). For a lot of people, Tweetdeck made Twitter usable at long last. The number of newsrooms I’ve been to and seen Tweetdeck with a million panels open is untrue. The same isn’t true now… The Tweetdeck guys innovated on top of Twitter, and instead of sharing revenue with them or something, they bought them…!

A quote which comes to mind is something like…

The train company thought they were in the railroad business, what they didn’t get was that they were actually in the transportation business.

I really like Twitter, but frankly their control/greed/whatever is getting out of hand. Yesterday I was on a panel at the London Transmedia Festival at Ravensbourne College, sat with Danielle from Tumblr, Bruce from Twitter, Cat from BBC and Doug Scott from Ogilvy. Although it was tempting to make a few comments about their change in stance, I passed. I did, though, say something which could be seen as slightly negative. Doug said how useful Twitter is for understanding users and I agreed, but I said,

“Well, it’s important to remember Twitter is only explicit data; implicit data is the stuff people really want to get their hands on…”

Anyway, the point stands, and it’s hard to see how Twitter will get into the implicit data game at this point. If they acted like a platform, maybe someone else would do the innovation for them. But back to the main point: why would you do it on someone else’s closed system?

Decentralised network systems are harder but will drive much more interesting creativity… I can see how this might be at odds with setting up a business, a startup, having investors etc… But I’m sure I could make an argument that it’s better in the long run…