Voice-forward phones: Google Assistant and the next billion users

A range of new flagship phones were shown off at the MWC19 trade fair. At one end of the scale, Samsung introduced three variations of its premium Galaxy S10 and a new model, the Galaxy Fold, with its innovative folding screen and almost $2,000 price tag. At the other, the Wizphone WP006, a phone made only for Indonesia (where it will be sold in vending machines), costing about $7.

The WP006 is a featurephone; it has a hardware keyboard, no touchscreen, 4G connectivity, runs on KaiOS (an operating system based on the abandoned FirefoxOS project), and has a prominent microphone button—it’s a voice-forward phone, powered by Google Assistant.


Thinking Out Loud: Understanding Voice UI, and How To Build for It

At work, we talk a lot about ‘voice’: what is it good for? Is it the post-mobile platform? And our clients ask us a lot about ‘voice’, and how to build a branded app. But I’m not sure everyone is talking about the same thing; and I’m just as unsure that anyone knows what makes a really good branded ‘voice’ app. I mean, I’m fairly sure I don’t.

This article is my attempt at defining what we’re talking about when we talk about ‘voice’; and, based on my experience as a user and developer of ‘voice’, trying to nail down some of the opportunities for branded third-party apps.


Trends in Digital Tech for 2018

In this article I’m going to be talking about a few current trends in digital technology as we move into 2018. It’s not a predictions piece—I’m a technologist, not a futurist. And there’s so much to talk about that this was at risk of turning into an essay, so I’ve limited it to some of the things that are interesting to me and relevant to my job, rather than the fullest/broadest scope of tech. As I said last year, it’s somewhat informed, purposely skimpy on detail, and very incomplete.

Computers with Eyes

One of the most interesting developments over the past couple of years has been in the transition from cameras to eyes; from taking pictures, to seeing. This has two parts: the first, computer vision, recognises objects in an image; the second, augmented reality, modifies the image before it reaches your eyes.

Computer Vision

Computer vision means understanding the content of photos: who is in them and what they are doing, where they are, and what else is around. This unlocks visual search—that is, finding other images that are thematically similar to your photos, rather than visually similar (‘is this mostly blue?’ becomes ‘is this mostly sky?’).

Amazon, ASOS, eBay, and Pinterest (among others) use visual search to recommend products similar to the one you photograph (‘this picture is of a denim skirt; here is our range of denim skirts’), which helps mitigate the problem of using text input to describe the product you want. Microsoft’s Seeing AI is changing the lives of people with visual impairments by using computer vision to describe their immediate environment (‘three people at a table near a window’).
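Under the hood, this kind of visual search typically reduces each image to a feature vector and ranks catalogue items by similarity to the query photo’s vector. Here’s a minimal sketch of that ranking step; the catalogue, item names, and toy three-dimensional “embeddings” are all invented for illustration (a real system would get them from a trained vision model):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors;
    # 1.0 means identical direction, near 0.0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy "embeddings": in a real system these come from a vision model.
catalogue = {
    "denim skirt": np.array([0.9, 0.1, 0.2]),
    "leather jacket": np.array([0.1, 0.9, 0.3]),
    "blue jeans": np.array([0.7, 0.3, 0.4]),
}

def visual_search(query_vec, catalogue, top_n=2):
    # Rank catalogue items by similarity to the query photo's embedding.
    scored = sorted(catalogue.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:top_n]]

query = np.array([0.85, 0.15, 0.25])  # embedding of the user's photo
print(visual_search(query, catalogue))  # denim-like items rank first
```

The point of the embedding trick is exactly the one above: ‘is this mostly blue?’ becomes ‘is this mostly sky?’, because nearness in the feature space reflects theme rather than raw pixels.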

The next step for visual search is to move from classifying objects in an image to providing contextual information about them. Snapchat offers relevant filters based on the content of photos, and Pinterest will start offering looks (‘this is a denim skirt; here are products which combine well with this…’). The first mass-market general-purpose visual search is Google Lens which, while fairly limited now—it can recognise landmarks, books/media, and URLs/phone numbers—will get smarter through the year, with recognition of apparel and home goods already teased as coming soon.

People will begin to expect their cameras to be smarter, capable of not just capturing a scene, but understanding it. And it’s likely that expectation will be for a single clear answer, rather than pages of search results; this will lead to the diminishment of organic search, but becomes monetisable (brands can pay to have their products placed in the result). Google’s years of search experience and an expansive knowledge graph give them a huge software lead over Apple, but I wouldn’t be surprised to see a ‘Siri Lens’ sometime—Bing also has a pretty good knowledge graph they can use.

Augmented Reality

Augmented reality, in its current form—placing digital objects into a camera image of a physical environment—has been around for a few years without much impact on public consciousness, but has recently moved into mainstream awareness. Snapchat broke the ground with their face-changing Lenses, then used horizontal plane detection to drop animated 3D digital models into the real world (the dancing hotdog); both were subsequently copied and taken to greater scale by Facebook’s Camera Effects platform.

It’s now being pushed further by deeper integration into the phone OS (Apple’s ARKit and Google’s ARCore both take care of the complex calculations required for AR, reducing the burden on apps), and better hardware—Apple have a major lead here with the new camera setup in the iPhone X, which will doubtless come to all their models in 2018. Google need to rely on their hardware partners to provide the cameras and chips for AR, so will polyfill it with software until that happens (I strongly suspect the Pixel 3 will be heavily optimised through chips and sensors).

IKEA Place and Amazon, amongst others, are using current-stage AR technology to let you see what their products would look like in your home before you buy them. But finding use cases beyond product previews, toys (animoji, AR Stickers), and games (Pokémon Go, the forthcoming Harry Potter) will, I imagine, occupy much of the first part of the year, and possibly beyond; there is much discovery still to be done. It may require an ‘AR Cloud’—a permission/coordinate space that allows digital enhancements to be shareable and persistent, so multiple people can see the same thing, in the same place, in the same state—before it becomes really useful.

The next stage for AR is to provide a map of your immediate environment through infrared scanning—Microsoft’s HoloLens does this, and the required scanners are now in the iPhone X (Apple bought PrimeSense, whose technology powered the Kinect), although not yet enabled. This allows digital objects not just to appear overlaid in two dimensions, but to move around in a space with awareness of the objects in it—this is commonly called mixed reality. It unlocks new categories, such as indoor wayfinding; Google teased this at I/O 2017 with ‘visual positioning service’ (VPS), the indoor equivalent of GPS, but that was a feature of the Tango project, which has since been wound down. Without the required hardware in Android phones, Apple could leapfrog them here.

Computers with Ears

Voice recognition has improved massively in recent years, and there’s a growing acceptance among the public of interacting through voice. Voice assistants have moved from phones to smart speakers (Echo, Home, HomePod), to cars (CarPlay, Android Auto), to wrists (Apple Watch, Android Wear), to ears (AirPods, Pixel Buds). Of the major digital assistants, Google’s Assistant is much more useful than the others.

Chart comparing digital assistant capability to answer questions; Google Assistant does best

In voice-first (or voice-only) devices, Amazon’s Echo family has the lead in hardware sales over Google’s Home range, although Assistant has greater reach thanks to its presence on Android phones. Apple’s HomePod will launch soon, but is coming in at a high price in a market being disputed at the low end (Echo Dot and Home Mini are the big sellers), and may come too late. Both Amazon and Google (and competitors such as Microsoft’s Cortana and Samsung’s Bixby) are now competing to get their assistants embedded in devices made by third-party manufacturers. All voice-first devices, however, have two major problems which they’ll need to address this year.

The first problem is discovery: with no interface, how do people know what they can do? Alexa currently has some ~25k skills on their platform, and although Google are prioritising quality over quantity (by working more closely with brands), getting found is still an issue. For now brands will still have to run off-platform advertising/awareness campaigns, although that’s likely to change (I’ll come back to that later).

The second is proactivity: right now, both Alexa and Assistant skills are explicitly invoked, so the user has to ask if anything has changed (‘is there an update on my delivery?’). Both Amazon and Google are in the process of enabling notifications on their devices, but these will need careful consideration to avoid notification overload; it’s already considered a problem on phones, and could be worse in a voice UI if you have to sit and listen to a stack of spoken notifications.
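One plausible mitigation is a spoken digest: read the few urgent items in full and collapse the rest into a single summary line. This sketch is my own invention, not how either assistant works; the priority scale, thresholds, and wording are all made up:

```python
from dataclasses import dataclass

@dataclass
class Notification:
    source: str
    text: str
    priority: int  # 0 = urgent, higher = less important

def spoken_digest(notifications, max_spoken=2):
    # Read urgent items in full; collapse the rest into one summary line,
    # so a voice UI never reads a long stack of notifications aloud.
    ordered = sorted(notifications, key=lambda n: n.priority)
    spoken = [f"{n.source}: {n.text}" for n in ordered[:max_spoken]]
    rest = ordered[max_spoken:]
    if rest:
        sources = sorted({n.source for n in rest})
        spoken.append(f"Plus {len(rest)} more from {', '.join(sources)}.")
    return spoken

lines = spoken_digest([
    Notification("Courier", "Your delivery arrives at 3pm", 0),
    Notification("News", "Headline A", 5),
    Notification("News", "Headline B", 5),
    Notification("Calendar", "Meeting moved to 10am", 1),
])
for line in lines:
    print(line)
```

The design question this leaves open is the hard one: who decides the priorities, the platform or the skill author?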

Audio recognition is capable of understanding more than the human voice. Always-on song recognition (running on-device, not sending data to servers) is a major feature of the Pixel 2, and Apple recently acquired Shazam (Siri already has a Shazam service built in). The next stage of audio recognition will be to understand other environmental sounds (TV is an area that’s being actively explored) and provide context about what is being listened to.

Computers with Brains

With more devices becoming capable of extracting information about the world around us, we require better tools to provide context and make decisions about what’s useful. This becomes a virtuous circle: tools create more data, and more data makes the tools more useful.

Recommendations based on visual search become more useful by knowing your taste through your photo history; not just what you wear, but your tastes in furniture, home goods… At the moment the visual search of ASOS and Pinterest gives recommendations based on recognising a single product, but given, say, your Instagram history, it could refine those recommendations with inferences from your broader tastes (‘people who like art deco furniture tend to wear…’).

Algorithmic recommendation could help solve one of the problems facing any future mixed reality interface: as you have a potentially unlimited number of things to look at (it’s the whole world around you), how does your interface decide what is the most appropriate contextual information to provide, and who provides it for you? An app-like experience (‘open TopTable and tell me about this restaurant’) limits discovery, so it may be better to take a search engine approach, where the system tries to infer the best content to offer based on a number of ranking factors.
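The search engine approach could be as simple as scoring candidate overlays against weighted ranking factors and surfacing the winner. A minimal sketch; the factor names, weights, and candidate items here are entirely invented:

```python
# Score candidate content against weighted ranking factors, rather than
# the user explicitly opening one app. Negative weight on distance means
# nearer things rank higher; affinity counts double popularity.
WEIGHTS = {"distance": -0.5, "personal_affinity": 2.0, "popularity": 1.0}

def score(item):
    # Weighted sum over whichever factors the item carries.
    return sum(WEIGHTS[f] * item.get(f, 0.0) for f in WEIGHTS)

def best_overlay(candidates):
    # Pick the single most relevant item to surface in the interface.
    return max(candidates, key=score)

candidates = [
    {"name": "restaurant review", "distance": 0.1,
     "personal_affinity": 0.9, "popularity": 0.7},
    {"name": "historic plaque", "distance": 0.1,
     "personal_affinity": 0.2, "popularity": 0.9},
]
print(best_overlay(candidates)["name"])  # the review wins on affinity
```

The interesting part isn’t the arithmetic, it’s who sets the weights, which is the ‘who provides it for you?’ question above.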

Mixed reality is a display problem, a sensor problem and a decision problem. Show an image that looks real, work out what’s in the world and where to put that image, and work out what image you should show. — Ben Evans

As I mentioned earlier, voice-first/-only devices suffer from a lack of discoverability. Alexa and Google Assistant are trying to solve this using intent: if a user asks for something that the assistant doesn’t cover, it will recommend a third-party app. Google calls these implicit invocations; a voice action from, say, Nike can suggest itself as appropriate if a user asks for running advice, rather than explicitly invoking Nike by name (this works like organic search, but there’s future scope for this to be monetised like paid search using an AdWords-like system).
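The distinction between explicit and implicit invocation can be sketched as a tiny router: name the app and you get it directly; otherwise the best keyword match among registered intents wins, like organic search. The app names and intent phrases below are invented, and real platforms match on trained language models rather than raw keyword overlap:

```python
# Third-party actions register the intents they can handle.
REGISTRY = {
    "Nike Coach": {"running", "marathon", "training"},
    "RecipeBot": {"dinner", "recipe", "cook"},
}

def route(utterance):
    words = set(utterance.lower().replace("?", "").split())
    # Explicit invocation: the user said the app's name.
    for app in REGISTRY:
        if app.lower() in utterance.lower():
            return app
    # Implicit invocation: best keyword overlap wins (like organic search).
    best = max(REGISTRY, key=lambda app: len(words & REGISTRY[app]))
    return best if words & REGISTRY[best] else None

print(route("any advice for marathon training?"))  # the running action
print(route("open RecipeBot"))                     # explicit by name
```

An AdWords-like layer would simply let a paid bid break ties in that ranking step.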

The Natural User Interface

With computers being more aware of what’s around them through their ‘eyes and ears’, the next step will be to bring them together: using computer vision, audio recognition and mixed reality to create meaningful, contextual connections between the physical and digital—a virtual map of the immediate environment, with an awareness and understanding of the things in it, and contextual information prompting relevant interaction with digital objects.

Placing 3D objects into a scene is one part of this, but images can also be enhanced in different ways, enriching and enlivening the world around us. We can ask the question: what would augment reality? Answers range from providing explanations and instructions for physical objects, to translating foreign languages in situ, to showing user reviews or price comparisons. With motion magnification, almost imperceptible movements (like a pulse, or a baby’s breathing) can be amplified to become visible. Really, we’re just at the start of what’s possible.

The next logical step is to move this from video see-through displays to optical see-through displays—that is, out of phones and into glasses (just as voice control has moved into earphones). Microsoft’s HoloLens is an early preview of this, Magic Leap promised that their much-talked-about technology will become public in 2018, and you can bet that every other major tech company has their own version in the works.

Different services, powered by machine learning—computer vision, contextual recommendations, mixed reality, and voice recognition—could eventually come together to create the post-mobile interface: understanding the physical environment and enhancing it with a contextual digital layer, and distributing it into devices beyond the phone. Whether anyone will actually achieve that in 2018 is up for debate (but unlikely).

Closing Social

There were signs this year that open social might have peaked. Sharing on Facebook has been declining for a couple of years, offset somewhat by increased sharing on Messenger and WhatsApp. It’s too soon to say it’s definitely peaked—or why—but certainly in the broader media narrative open social (and Facebook in particular) was blamed as the flashpoint for conflicts between the values of different groups and generations. Facebook can’t have failed to notice the decrease, and recent bouts of soul-searching led to them deprioritising articles in the News Feed (with a corresponding drop in engagement for publishers) and promoting sharing and personal updates—even to the extent of trialling a separated news feed, with all articles in a separate (hidden) view—splitting the social from the media.

None of the big open social apps do truly sequential timelines any more; Twitter and Instagram have followed Facebook by showing algorithmically sorted timelines so you don’t miss the good stuff (or, what they understand to be what you think is the good stuff). More sharing on Instagram is going into direct messages—another experiment is under way to move DMs into their own app, which would become Facebook’s fifth messaging app (after Messenger, Messenger Kids, WhatsApp, and the recently purchased teen-focused app, tbh). Instagram’s Stories have been one of their successes, quickly surpassing the usage of Snapchat (from which they stole the format), although Snapchat remains more popular with teens—perhaps another reason for the tbh purchase.

Advertising is less of a natural fit in messaging interfaces, and Facebook are trying to get more diverse revenue from their platforms—over 98% of global revenue is from advertising. Messenger recently announced monetisation options for their Instant Games (interestingly, WeChat—usually viewed as the model for Messenger—are over-reliant on games revenue and would like more advertising). WhatsApp is set to launch business accounts, where enterprise users will pay for access to their customers.

The Messenger (bot) platform seems to be settling around customer service, with brands (rather than services) coming to realise that it’s not a great fit for campaigns, but not always able to see another way into it. The early promise of conversational interaction in messaging has hit the reality that natural language requires a great investment in training, scripting, and testing, so bots have tended to fall back on button/prompt UI, which is often a worse experience than using a rich Web or native app interface. With many brands not willing to invest without clear return on investment there is a vicious circle (low investment, diminished experience, low user uptake, and repeat), indicating that messaging is likely to take a while longer to fulfil its promise.

Other Notes

Those are the major trends I’m interested in for (early) 2018, but there’s plenty more to be aware of.

Smarter and Cheaper Devices

Machine learning is increasingly being run on-device (mostly phones) rather than on cloud servers. On-device ML is good for getting fast results, lowering network data usage, and improving privacy. Google’s TensorFlow Lite seems set to become the early standard for on-device learning, using pre-trained models accelerated by device APIs (Android 8.1’s Neural Networks API, iOS 11’s Core ML). Many of Apple’s iOS machine learning models, such as face recognition, are already on-device, and Google’s recent photography ‘appsperiments’ (ugh) also show that’s a way forward they’re embracing.
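The key idea is that a pre-trained model is just data shipped with the app, so inference needs no network round-trip. Here’s a deliberately tiny sketch of that: a fixed-weight classifier running locally. The weights, classes, and input are invented; a real app would load a quantised model through something like the TensorFlow Lite interpreter rather than hand-written matrices:

```python
import numpy as np

# A small model's weights ship with the app, so classifying an input
# needs no network call (fast, cheap on data, private).
CLASSES = ["cat", "dog", "car"]
W = np.array([[ 2.0, -1.0,  0.5],   # one row of weights per class
              [-1.5,  2.5,  0.0],
              [ 0.2,  0.1,  3.0]])
b = np.array([0.1, -0.2, 0.0])

def classify(features):
    logits = W @ features + b
    # Softmax turns logits into probabilities (stable form).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return CLASSES[int(np.argmax(probs))], probs

label, probs = classify(np.array([1.0, 0.2, 0.1]))
print(label)  # classified entirely on-device
```

Everything here runs in constant time on the handset; the cloud only enters the picture when the model itself is trained or updated.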

On-device learning combined with cheap, miniaturised hardware (a product of the smartphone boom) opens up a new category of smart, single-purpose devices. Google Clips is one example: a camera with a pre-trained computer vision model that detects when an ‘interesting’ moment happens, captures it in a short video clip and sends it to your phone—no operator required.

This could extend to other phone/smart device functions, such as voice-controlled speakers that don’t require the full power of Alexa or Assistant, instead using pre-trained models to control music playback. And research repeatedly shows that some of the most-used functions on smart speakers are setting alarms and timers, and unit conversion (for cookery), so it’s not a stretch to imagine a cheap kitchen timer that has the limited smarts to carry out those core functions.

The Decline of the Ad-funding Model

The steady growth of ad-blockers indicates that users are tired of ads and—in particular—invasive tracking, leading to more device-native ad-blocking; Apple’s Safari browser recently started blocking a number of third-party tracking scripts (the impact of that is already being felt), and from early 2018 Google’s Chrome will start to blacklist sites that persistently violate the Better Ads Standards. The EU’s General Data Protection Regulation (GDPR) will come into force in early 2018, which should make it harder for companies to (legally) track users and share their data with other services. All of this may have a knock-on effect on advertising revenue (especially for those operating in murky areas who deserve punishment).

It seems strange to talk about a decline when digital ad spend continues to grow (and, in 2017, overtook TV spend for the first time), but the problem is that Google and Facebook already take around two-thirds of advertising spend, and Amazon (including Alexa) is on course to join them (as they become the de facto pre-purchase search engine). This leaves digital media publishers with less revenue, and 2017 saw businesses relying on the ad-funding model—such as Buzzfeed and Vice—facing job cuts and restructuring.

Many publishers have opted for paywalls/paygates, but these limit reach and have a natural cap—how many people can afford to pay for one or more subscriptions? A few publishers are trying reader donation services to make up for the drop in ad revenue—the Guardian and New York Times have had some success with this model. With better payment methods arriving in browsers (Apple Pay, the Payment Request API), it’s possible that some ad revenue loss could be offset by micropayments.


The UK’s Open Banking API standard rolls out in early 2018, with the EU’s Second Payment Services Directive (PSD2) following shortly after. The two are set to have a huge impact on banking and personal finance in Europe, bringing a wave of new banks and savings applications and shaking up the existing institutions.

As for cryptocurrencies… I have a hard time with these. The leading cryptocurrency, Bitcoin, has basically failed to meet every one of its promises, and only really works as an investment vehicle. The underlying blockchain technology promises to have more benefit, but most of the applications seem to be B2B—I haven’t really seen any convincing consumer use cases. One area that I am intrigued by is using blockchains to create digital scarcity, like CryptoKitties; playful use cases can often lead to more interesting outcomes, and adding value to digital art sounds useful. For everything else… I’ll wait and see.


Although there is growing opportunity in VR gaming, I still can’t see virtual reality breaking into the mainstream. Phone-based VR has serious technical limitations to overcome, and tethered headsets are too expensive and cumbersome (and don’t seem to have sold well, although recent price cuts have helped a little). The next generation of standalone headsets (Oculus Go, Vive Focus, Daydream) could open the market a little more, but I still think VR has to overcome its biggest problems: isolation, and requiring exceptional behaviour (it’s not as easy as watching TV or using a phone). This may be mitigated by future technology, but I can’t see any immediate signs of that happening.

Machine Learning

There’s little point in talking about machine learning as a separate technology; it’s the fuel powering much of everything interesting that’s happening. One area of particular interest for 2018 will be authenticity: ‘fake’ images and audio generated with machine learning algorithms are getting increasingly convincing, and it seems alarmingly easy to use an adversarial network to ‘trick’ computer vision models into seeing something other than we do.
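Why is tricking a model so easy? A toy illustration on a linear classifier shows the core of the adversarial-example idea: nudge each input feature a small step against the model’s gradient (the ‘fast gradient sign’ trick) and the prediction flips, even though the input changes only slightly. The weights and input here are invented; real attacks do the same thing against deep vision models:

```python
import numpy as np

w = np.array([1.0, -2.0, 0.5, 1.5])   # linear model: predict 1 if w·x > 0
x = np.array([0.3, 0.1, 0.4, 0.2])    # original input, classified as 1

def predict(v):
    return int(np.dot(w, v) > 0)

# For a linear model the gradient of the score w.r.t. the input is just w;
# step each feature slightly *against* the predicted class.
eps = 0.2
x_adv = x - eps * np.sign(w)

print(predict(x), predict(x_adv))  # prediction flips: 1 -> 0
```

The unnerving part is that the per-feature perturbation is a fixed small epsilon, which in an image corresponds to a change the human eye barely registers.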

The Web

There’s little point in talking about the Web as a separate technology; it’s the data layer connecting much of everything interesting that’s happening. While the major operating systems and platforms refuse to cooperate, the Web still provides the broadest reach, especially in developing markets using lower-powered devices and without access to closed app stores. It’s interesting to see previously closed platforms like Instagram and Snapchat more willing to go to the Web as they move to scale.

Thanks for reading. If you’re interested in stories about technology’s role in culture, society, science, and philosophy, you might want to subscribe to my newsletter, The Thoughtful Net.


Trends in digital media for 2017

Alright, stand back everyone: I’m about to have some opinions about technology in 2017. Because obviously there’s been a shortage of those.

As part of my Technologist role at +rehabstudio I put together internal briefings about digital media, consumer technology, where the digital marketing industry could go in the near future, and what we should be communicating to our clients. The aim is not to make predictions, but to follow trends.

This article is based on my latest briefing. It’s somewhat informed, purposely skimpy on detail, and very incomplete: I have some thoughts on advertising and publishing that I can’t quite distil yet, and machine learning is a vast surface that I can barely scratch.

If for nothing more than press coverage, 2016 was the year of messaging, and the explosion of the messaging bot. The biggest player in the game, Facebook’s Messenger, launched their bot platform in April, and by November some 33,000 bots had been released. Recent tools added to the platform include embedded webviews, HTML5 games, and in-app payments.

The first six months of bots were largely the ‘fart app’ stage, but there are signs that brands and services are finally starting to see the real opportunities in messaging: removing friction from their users’ interactions with them. Friction in app management and UI complexity, for example.

The same removal of friction is also a key driver behind the growth of home assistants and voice interaction, like Alexa. Removing the UI abstraction between users and tasks is a clear trend. As an illustration, compare two user flows for watching Stranger Things on Netflix on your TV; first using a smartphone:

  1. Unlock phone.
  2. Find and open Netflix app.
  3. Press the ‘cast’ but­ton.
  4. Find ‘Stranger Things’.
  5. Play.

Now using Google Home:

  1. “OK Google, play Stranger Things from Netflix on My TV.”

Home assistants make the smart home easier to manage. No more separate apps for Wemo, Hue, Nest, etc; a single voice interface (perhaps glued together with a cloud service like IFTTT) controls all the different devices in your home.
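The ‘single voice interface’ idea can be sketched as one parser routing commands to whichever device backend is relevant, instead of the user opening a separate app per brand. The device names, verbs, and handler strings below are invented; real assistants resolve this through trained intent models and vendor cloud integrations:

```python
# One parser dispatches to per-device backends (stand-ins for the
# Wemo/Hue/Nest cloud APIs a real hub would call).
DEVICES = {
    "lights": lambda action: f"Hue bridge: lights {action}",
    "thermostat": lambda action: f"Nest API: thermostat {action}",
    "kettle": lambda action: f"Wemo switch: kettle {action}",
}

def handle(command):
    words = command.lower().split()
    # Crude verb extraction: look for an on/off action word.
    action = "on" if "on" in words else "off" if "off" in words else None
    for device, backend in DEVICES.items():
        if device in words and action:
            return backend(action)
    return "Sorry, I didn't understand that."

print(handle("turn the lights on"))      # routed to the Hue handler
print(handle("switch the kettle off"))   # routed to the Wemo handler
```

The user-facing win is exactly the friction removal described above: one grammar, many vendors.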

Messaging and voice are visible aspects of the trend towards the interface on demand:

The app only appears in a particular context when necessary and in the format which is most convenient for the user.

While native mobile apps are still a growth area, it’s becoming much harder to get users to download and engage with apps outside of a small popular core. This is especially true for retail, where consumers are more omnivorous and like to browse widely.

Improvements in the capabilities of web apps (especially on Chrome for Android) suggest an alternative to native apps in some cases. This has been demonstrated by the success of new web apps from major retail brands like Flipkart and Alibaba in developing economies, where an official app store may not be available, or network costs may make app downloads undesirable.

Web apps require no installation, avoiding the app store problem. They’re starting to get important features like push notifications and payment APIs. And messaging platforms, with their large installed user base, provide the web with a social and distribution layer that the browser never did:

Messaging apps and social networks [are] wrappers for the mobile web. They’re actually browsers… [and] give us the social context and connections we crave, something traditional browsers do not.

So it may be that for some brands, a website optimised for performance, engagement, and sharing, along with a decent messaging and social strategy, will offer a better investment than native apps and app store marketing. Patagonia already closed their native app. Gartner predict that some 20% of brands will follow by 2019:

Many brands are finding that their mobile apps are not paying off.

The most important app on your phone could be the camera, which will be increasingly important this year. First, by revealing the ‘dark matter’ of the internet: images, video and sound. So much of this data is uploaded every day, but without the semantic value of text, its meaning is lost to non-humans — like search engines, for example. But machine learning is becoming very good at understanding the content of this opaque data, meaning the role of the camera changes:

It’s not really a camera, taking pictures; it’s an eye, that can see.

It can see faces, landmarks, logos, objects; hear background chat and music. That means understanding context, location, purchase history, and behaviour, without being explicitly told anything. This is why Facebook, through Messenger and Instagram, are furiously copying Snapchat’s best features: they want their young audience and the data they bring.

Will it be intrusive? Yes. Will it happen? Yes. I’ve tried to avoid making hard predictions in this piece, but I am as confident as I can be that our image and video history will be used for marketing data.

Cameras will also be important in altering the images that are shown to users. Augmented reality is an exciting technology, although good-enough dedicated hardware is still a while away. But there’s a definite market drift in that direction, and leading it is Snapchat: they’re stealthily introducing AR by modifying the base layer of reality—first, by altering faces using their lenses. This isn’t frivolous; it’s expanding the range of digital communication, like emoji do for text.

If people are talking in pictures, they need those pictures to be capable of expressing the whole range of human emotion.

Recent Snapchat lenses have started altering voices, and your environment. They’ve recently bought a company that specialises in adding 3D objects to real environments. With Spectacles they’re not only removing friction from the process of taking a photo, they’re prototyping hardware at scale. This is the road to AR. Snap Inc. want to be the camera company — not in the way that Nikon was, but in the way that Facebook is the social company.

The companion to an augmented reality is a virtual one, but I don’t believe we’ll see VR going mainstream in 2017—and I say that as a proponent. It’s static, isolating, and it requires people to form a new behaviour. It’s interesting to see creators experiment with the form, and I’ve no doubt that we’ll see some very interesting experiences launched this year. But domestic sales aren’t huge, high-end units are too expensive, and low-end ones aren’t quite up to scratch yet. Still think it will be big for gamers, though.

I have more. A lot more. But I think it will all be better explained in a series of subsequent blog posts, so I’ll aim to do that. In the meantime, I’d love to hear your thoughts, arguments, objections, and conclusions.