In a recent interview, Sundar Pichai of Google discusses improvements in the accuracy of their voice recognition:
Just in the last three years, we have taken things like error in word recognition down from about 23 percent to 8 percent.
That’s the difference between misunderstanding one word in four and one word in twelve; the difference between completely unusable and merely annoying.
Andrew Ng, formerly of Google and now of Baidu, expands on this:
Most people don’t understand the difference between 95 and 99 percent accurate. Ninety-five percent means you get one-in-20 words wrong. That’s just annoying, it’s painful to go back and correct it on your cell phone.
Ninety-nine percent is game changing. If there’s 99 percent, it becomes reliable. It just works and you use it all the time. So this is not just a four percent incremental improvement, this is the difference between people rarely using it and people using it all the time.
It’s fascinating to see how much difference these small numbers make; you might think Google’s 92% accuracy is only a little behind Baidu’s 95%, but in practical terms there’s a big gulf. And it gives me pause to think about the money, human resources and computing power spent chasing those numerically small but practically huge gains in accuracy.
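The gulf is easy to see if you turn each error rate into the average number of words between mistakes. A quick back-of-the-envelope sketch, using the rates quoted above:

```python
def words_per_error(error_rate):
    """Average number of words spoken before one is misrecognized."""
    return 1 / error_rate

# Error rates mentioned above: Google's old 23% and new 8%,
# and Ng's hypothetical 5% vs. 1%.
for rate in (0.23, 0.08, 0.05, 0.01):
    print(f"{rate:.0%} error rate -> roughly one wrong word "
          f"in every {words_per_error(rate):.0f}")
```

The relationship is reciprocal, which is why shaving a few percentage points off an already-small error rate changes the experience so dramatically: going from 5% to 1% error means a mistake every 100 words instead of every 20.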