Data use and privacy in Web services

Tim Cook recently made a speech attacking Silicon Valley companies (e.g. Google and Facebook) for making money by selling their users’ privacy. The first problem with what he said is that it’s fundamentally incorrect. As Ben Thompson points out (subscription required):

It’s simply not true to say that Google or Facebook are selling off your data. Google and Facebook do know a lot about individuals, but advertisers don’t know anything — that’s why Google and Facebook can charge a premium! [They] are highly motivated to protect user data – their competitive advantage in advertising is that they have data on customers that no one else has.

Cennydd Bowles makes the same point:

The “you are the product” thing is pure sloganeering. It sounds convincing on first principles but doesn’t hold up to analysis. It’s essentially saying all two-sided platforms are immoral, which is daft.

The @StartupLJackson Twitter account puts this more plainly:

People who argue free-to-customer data companies (FB/Goog/etc) are selling data & hurting consumers are the anti-vaxxers of our industry.

I’ve always maintained that this is about a value exchange: you can use my data, as long as I get control and transparency over who sees it, and a useful service in return. But beyond that, another problem with premium services where you pay for privacy is that they create a two-tier system. Cennydd again:

The supposition that only a consumer-funded model is ethically sound is itself political and exclusionary (of the poor, children, etc).

And Kate Crawford:

Two-tier social media: the rich pay to opt out of Facebook ads, the poor get targeted endlessly. Privacy becomes a luxury good.

Aside: Of course this suits Apple: if wealthier customers can afford to opt out of advertising, then advertising itself becomes less valuable, and so, in turn, do Google and Facebook.

The fact that people are willing to enter into a data exchange which benefits them when they get good services in return highlights the second problem with Tim Cook’s attack: Apple are currently failing to provide good services. As Thomas Ricker says in his snappily-titled Tim Cook brings a knife to a cloud fight:

Fact is, Apple is behind on web services. Arguably, Google Maps is better than Apple Maps, Gmail is better than Apple Mail, Google Drive is better than iCloud, Google Docs is better than iWork, and Google Photos can “surprise and delight” better than Apple Photos.

And even staunch Apple defender John Gruber agreed:

Apple needs to provide best-of-breed services and privacy, not second-best-but-more-private services. Many people will and do choose convenience and reliability over privacy. Apple’s superior position on privacy needs to be the icing on the cake, not their primary selling point.

As this piece by Jay Yarow for Business Insider points out, in the age of machine learning, more data makes better services. Facebook and Google are ahead in services because they make products that understand their users better than Apple do.

Small Numbers, Huge Changes

In a recent interview, Sundar Pichai of Google discusses improvements in the accuracy of their voice recognition:

Just in the last three years, we have taken things like error in word recognition down from about 23 percent to 8 percent.

That’s the difference between misunderstanding roughly one word in four and one word in twelve; the difference between completely unusable and merely annoying.

Andrew Ng, formerly of Google and now of Baidu, expands on this:

Most people don’t understand the difference between 95 and 99 percent accurate. Ninety-five percent means you get one-in-20 words wrong. That’s just annoying, it’s painful to go back and correct it on your cell phone.

Ninety-nine percent is game changing. If there’s 99 percent, it becomes reliable. It just works and you use it all the time. So this is not just a four percent incremental improvement, this is the difference between people rarely using it and people using it all the time.

It’s fascinating to see how these small numbers make a huge difference; you might think Google’s 92% accuracy is only a little lower than Baidu’s 95%, but in practical terms there’s a big gulf. And it gives me pause to think about the money, human resources and computing power spent on achieving those small but huge increases in accuracy.
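
To make the arithmetic behind those figures concrete, here is a rough sketch in Python. The error rates are the ones quoted above (23% and 8% from Pichai, 5% and 1% from Ng); the function names and the 50-word message length are my own assumptions for illustration, and it treats errors as spread evenly across words, which real speech recognition won't do.

```python
def words_per_error(error_rate: float) -> float:
    """Average number of words per mistake for a given word error rate."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be greater than 0 and at most 1")
    return 1 / error_rate


def expected_errors(error_rate: float, word_count: int) -> float:
    """Expected number of wrong words in a message of word_count words."""
    return error_rate * word_count


# Error rates quoted in the text above.
quoted_rates = [
    ("Google, three years ago", 0.23),
    ("Google, now (92% accurate)", 0.08),
    ("Ng's 'annoying' level (95% accurate)", 0.05),
    ("Ng's 'game changing' level (99% accurate)", 0.01),
]

MESSAGE_LENGTH = 50  # hypothetical message length, chosen for illustration

for label, rate in quoted_rates:
    print(
        f"{label}: about one word in {words_per_error(rate):.1f} wrong, "
        f"roughly {expected_errors(rate, MESSAGE_LENGTH):.1f} "
        f"mistakes per {MESSAGE_LENGTH} words"
    )
```

Seen this way, going from 92% to 95% accuracy takes a 50-word message from about four mistakes to two and a half, while 99% brings it down to about half a mistake, so most messages of that length would need no correction at all.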