Voice tech like Alexa and Siri hasn’t found its true calling yet: Inside the voice assistant ‘revolution’
Until we invent something that wouldn’t be possible without voice, we’re just repurposing online content for our ears.
You can soon, if you insist, talk to your microwave — and it will listen.
Amazon recently announced a $60 voice-activated microwave, along with 10 other new products using its Alexa voice assistant. Roll your eyes if you’d like, but it’s the latest example of Amazon’s obsession with making Alexa — which first launched four years ago — ubiquitous, from the kitchen to the car.
And Amazon is hardly alone. Google has been ramping up its lineup of Assistant-powered gadgets, recently announcing the Google Home Hub, a smart speaker with a screen. Facebook just unveiled Portal, a device that specializes in video calls in addition to its smart speaker responsibilities. Apple has its HomePod, plus Siri — which has been around since 2011 — built into all of its devices, from the iPhone to AirPods.
Voice assistants are potentially privacy nightmares as they proceed to monitor more and more elements of our daily lives. That fear, however, hasn’t seemed to put a dent in sales.
As the holiday shopping season approaches, voice-powered smart speakers are again expected to be big sellers, adding to the approximately one-quarter to one-third of the U.S. population that already owns a smart speaker and uses a voice assistant at least once a month.
Voice interfaces have been adopted faster than nearly any other technology in history. And with big sales has come big hype, thanks in part to breathless prognostications about our voice-driven future:
- The global number of installed smart speakers is going to more than double to 225 million units in two years, says Canalys.
- Voice shopping on Alexa alone could generate more than $5 billion per year in revenue by 2020, according to RBC Capital Markets.
- Global ad spending on voice assistants — currently nonexistent — will reach $19 billion by 2022, nearly the size of the current magazine ad business, per Juniper Research.
While some of this will likely come to pass, the hype might be disguising where we really are with voice technology: Earlier than we think.
About a third of smart speaker owners end up using them less after the first month, according to an NPR and Edison Research report earlier this year. Just a little more than half said they wouldn’t want to go back to life without a smart speaker.
While people are certainly enthusiastic about the new technology, it’s not exactly life-changing yet.
Today, voice assistants and smart speakers have proven to be popular ways to turn on the radio or dim the lights or get weather information. But to be revolutionary, they will need to find a greater calling — a new, breakout application.
Where voice technology is succeeding
That’s not to say voice assistants aren’t already proving useful.
Human-to-machine voice interactions are inevitable. And as affordable consumer devices and fast wireless networks proliferate — most voice-assistant processing happens on far-away servers — they are driving real utility and are changing how we interact with machines.
Playing music and other audio content is far and away the most successful current use case for voice assistants.
Anywhere from 70 percent to 90 percent of smart speaker users say they have streamed music on a smart speaker, depending on the study. About half of those users do so daily. These statistics are meaningful, showing a major change in how we consume media.
Indeed, people who listen to Spotify on smart speakers are more likely to listen to music every day, according to the streaming music company, than Spotify users overall. They’re also more likely to listen to music on weekends and request nostalgic songs.
Old-school radio is feeling the smart speaker love as well. This time last year, 4 percent of National Public Radio’s live listening hours came through smart speakers. It’s now at 19 percent, according to Tamar Charney, managing director for personalization and curation at NPR.
Importantly for public radio, which is funded by listeners, this listening is accretive, as NPR hasn’t seen declines on other platforms. “The more time people listen, the more engaged they are with the content and the more likely they are to donate,” Charney said. (Public radio stations now regularly remind listeners that they can ask their smart speakers to stream NPR.)
“Smart speakers single-handedly brought radio back to the home,” according to Bret Kinsella, founder of the Voicebot blog and podcast, which are dedicated completely to voice technology. “The killer voice app is calling up music.”
Even podcasts, which can be more difficult to surface with voice assistants, are seeing a boost. People often spend twice as long listening to podcasts on smart speakers as they do on their phones, according to Cara Meverden, founder of voice-controlled podcast curation app Scout FM.
Even more compelling for the advertising-led world of podcasts, smart speaker listeners are less prone to skipping through advertising than those who listen on computers or phones.
“People are much less likely to skip ads on Alexa. It’s more inconvenient to tell your Alexa to skip forward 30 seconds than it is to just let the ads play,” Meverden told Recode. “Smart speaker listeners are much more passive,” she added. “People with voice interfaces tend to accept what’s given to them.”
Voice technology has also helped bring smart home devices — thermostats, lights, locks and any other appliances that can be controlled from anywhere — closer to the mainstream. Google’s Assistant now works with more than 10,000 smart home devices from other manufacturers. Alexa works with more than 20,000.
Instead of having to set up your remote-controlled lights or program a smart home hub, voice assistants have shouldered a lot of the complexity. In some cases, it’s only a matter of plugging in the device and then commanding your voice assistant to control it.
“It used to be that the only person who could use [the smart home device] was the person who set it up,” Google Director of Product Management and Hardware Micah Collins told Recode. “Using the voice interface to control things represented a huge change in usability.”
That has caused sales of smart home devices to increase. The worldwide smart home device market — including smart speakers, digital media adapters, lighting, thermostats, home monitoring and security devices — is expected to grow 27 percent this year to about 550 million units, according to IDC.
The impact has been felt throughout the smart home industry.
A majority of Leviton’s smart light switches and outlets have been accessed using Alexa or Google Assistant. Has it helped sales? “Without a doubt,” said Leviton product manager James Shurte. “Voice control is really the primary driver of more mass market smart home options.”
Voice assistants helped smart lock maker August double its revenue last year. “Once people buy smart speakers, they want to do things with them — connect them to lights and locks,” according to August co-founder and CEO Jason Johnson. “They look for things to buy to make use of the speakers they’re purchasing.”
In a perfectly set-up house, you can control the sound, the temperature and locks with your voice. You can command streaming media to play with the movement of your lips. It’s all very cool and Jetsons-y. But it’s also stuff most people could have done pretty easily without voice technology, by walking a few feet or by clicking on an app.
While streaming radio and smart home controls seem like useful-enough functions of smart speakers, it would be disappointing if that’s as far as voice assistants end up taking us. The life-changing thing you couldn’t do without voice doesn’t seem to exist yet.
Consumers aren’t buying voice shopping
One thing that companies hope will happen — but hasn’t yet — is that people will start using their voice assistants to effortlessly buy all kinds of stuff. Most surveys show that only around 20 percent of smart speaker users have ever used their device to make a purchase. The share who shop that way monthly is half that.
A more alarming report for retailers by The Information said that just 2 percent of Alexa users had made a purchase with the device this year, through early August. Whatever the number is, it’s not yet what retailers are dreaming of.
Indeed, the majority of people still prefer to shop at physical stores, according to a May survey by Voicebot. Not even 1 percent of Americans said they would rather shop on a smart speaker.
Perhaps in response, Amazon has been trying to convince consumer brands to feature Alexa shopping commands on their advertisements and packaging.
It’s certainly telling that the latest iteration of smart speakers all have screens. It points to the current limits of voice technology, especially when it comes to commerce.
“I think shopping is still very early days, especially for voice-only products,” Google’s Collins told Recode. “Shopping is predominantly a visual and tactile experience.”
With voice, you only get one or two options when you’re searching for a product, not pages and pages of a product catalog, like you might be accustomed to online.
That makes it better suited to household consumables — rather than, say, clothing — because they are low-cost, need to be reordered frequently and don’t require as much consideration as discretionary purchases with higher price tags.
Still, that’s a potential king-maker moment for brands. Becoming the first toilet paper suggestion on a voice assistant can mean being the first toilet paper purchase.
Big brands are going all-in on voice
Some 85 percent of consumers who’ve made a voice-assisted purchase say they’ve bought the first option presented, according to a study from marketing agency Digitas.
So big brands are trying to get first-mover advantage; they’re not waiting for shopping to get off the ground before they start their voice strategies.
“We believe that the impact of voice is going to be a bigger impact on the consumer journey than Google search was in the 2000s,” Campbell’s VP of Digital Marketing Matt Pritchard told Recode. “Two or four years from now, smarter people than me say voice will be the preferred way to search. If you haven’t converted your code or site experience to be ready, you’re going to be behind.”
How it works now: If you ask Amazon’s Alexa or Google Assistant to buy, say, shampoo, they’ll surface what they think you’ll want. Alexa uses several criteria to suggest a purchase option: Your order history, whether a product is eligible for free Prime shipping and whether the product has the “Amazon’s Choice” seal of approval — “highly rated, well-priced products available to ship immediately.”
Google picks products from Google Express merchants that are most relevant to the query. It also considers purchase history and information about user preferences, as well as an item’s availability and proximity.
Both companies say there is no favoring of specific retailers — or their own products.
Brands also can’t pay for visibility — yet. For now, Amazon and Google are trying to build trust among new — few — voice buyers by making their search results as relevant as possible. It doesn’t, however, take much imagination to see a future in which Amazon or Google merchants could pay to have their products suggested by their smart assistants — like sponsored ads that crowd their websites — as a way to generate more ad dollars.
Until then, brands are mostly using voice for marketing and awareness campaigns around their products. That often includes updating search keywords to reflect how people actually talk, offering more web content in general and beefing up brand FAQs so that they accurately answer queries customers might ask.
The best examples of this are when a brand provides a vital utility related to its product. That utility usually comes in the form of a skill or action, which is basically a voice assistant’s version of a mobile app.
Tide, for example, created a popular voice skill that explains how to get different stains out of different fabrics.
Campbell’s offers step-by-step recipes based on items you have in your house. “We want to get closer to consumers so we can better know and better understand them,” Campbell’s Pritchard said.
Patrón Spirits Company suggests new cocktails to try and coaches you through preparing them at home.
Since these brands would like to spread their messages as far as possible, they aren’t sticking to just one voice assistant, but rather creating skills and experiences for all smart devices.
“Amazon is more of a bet on purchasing, while Google is more for browsing or education,” Patrón VP of Digital Marketing Adrian Parker told Recode. “Both are in our strategy.”
Companies that complement or compete with the major voice players are keeping an open mind as well.
Spotify, Sonos and Qualcomm, which all work in conjunction with smart speakers and voice assistants, are being neutral on which platforms they work with so they can reach the most customers.
“We have a uniform, open platform that effectively scales across all voice control systems,” according to Rahul Patel, senior VP and GM of connectivity and networking at Qualcomm, a company that provides software and processing power for other brands’ smart devices. The company recently released an audio chip specifically designed to integrate headphones with digital assistants. “We’re not picking sides; we’re supporting everyone.”
Sonos, which is known for its high-end speakers that work with Alexa and soon Google Assistant, is also remaining neutral.
“Instead of trapping people in walled gardens, Sonos’ design is to be open and work with whatever voice assistant you want to use,” according to Antoine Leblond, VP of software.
This is what winning could look like
It’s too soon in the voice revolution to call a winner, but companies are certainly staking their claims.
Siri had a few years’ head start among voice assistants and is now actively used on half a billion devices, including on iPhones, MacBook Pros and Apple Watches. However, Apple’s voice assistant has been riddled with problems that caused it to largely squander that lead.
Google Assistant, thanks to Android’s omnipresence, also claims that it’s on half a billion devices.
Amazon is the smart speaker leader for now, though it’s hamstrung by not having its own phone. The company is making up for that deficit by doubling down on loading other devices with Alexa in the hope of cornering the smart home and auto markets. Importantly, it recently unveiled a low-cost Alexa-enabled chip that can make stupid devices smart.
In the second quarter of this year, Amazon had 41 percent of global smart speaker shipments, followed by Google at 28 percent, according to Strategy Analytics data. In the U.S., Amazon makes up about 65 percent of installed speakers as of September, according to a Voicebot survey.
Market share, however, isn’t everything.
The future will likely include a number of smart assistants, depending on where you are and what you want to do. Juniper estimates that the average smartphone user will engage with three voice assistant platforms by 2022.
“No single smart assistant platform can provide the complete portfolio of services and devices that consumers are looking for,” according to Adam Wright, a senior research analyst at IDC.
For Amazon, Google, Microsoft and Facebook, what’s most important is getting users to participate in their ecosystems. “They’re happy to forgo revenue from hardware for new customers using their services,” Wright said.
In Amazon’s case, that means getting people to buy more through Prime and to subscribe to Amazon Music. For Google, it’s encouraging interaction with its search and other products like Gmail, YouTube and Maps — all of which produce ad dollars for it. Microsoft’s Cortana — which also works with Alexa — wants people to use its Office suite. Indeed, Microsoft is leaning its efforts heavily into enterprise, recently releasing a platform that lets companies build their own workplace skills with Cortana.
Apple, in addition to looking for Apple Music subscribers, is making a hardware play. Unlike the others, Apple makes it more difficult for third parties to create hardware for Apple devices. Until recently, outside device makers had to include a physical Apple chip in their products that significantly drove up the material costs. Apple still charges royalties and has to approve partner devices. While it slows growth, this sort of environment allows Apple to guarantee higher quality in its few devices and perhaps drive more of its own hardware sales in the future.
Voice assistants have gotten good at answering one-off questions: “What’s the capital of North Dakota?” or, “What time is the Cavaliers game tonight?” But their ability to follow a line of thought or figure out what a pronoun is referring to is limited. A voice assistant is capable of three or four back-and-forth interactions at most, barring it from having a true conversation, though it’s incrementally getting better.
For consumers, what matters most is that voice technology works, regardless of what name the assistant goes by.
But assuming that the devices someday reach similar levels of accuracy, which device and assistant consumers use will likely come down to other preferences.
People who prefer shopping on Amazon might pick Alexa, while Walmart and Target shoppers might like Google Assistant better. Those who want a wide variety of smart home device options will have those with Google and Amazon. Those who prefer the Apple universe and its high-end products will end up with Siri.
What the future holds
Fortunately for voice enthusiasts, we’re still in early days.
Smart speakers, like training wheels, are getting people more used to talking to their devices. However, the future of voice probably won’t be on speakers at all. The major speaker makers have all added screens to their assistants. Samsung, smartly, is putting its voice assistant Bixby on its TVs, which have the potential to become the smart assistant hub of choice.
The key element is the voice assistant, regardless of what device it resides in. Smart assistants will creep into every aspect of our lives and will be available at home and away.
Some see a future in which stores and other public places are outfitted with voice assistants that will be able to recognize you and adapt their responses to your individual needs. For now voice assistants are still working on figuring out what you’re saying in the first place.
What we ultimately do with this surfeit of voice technology remains to be seen.
Think of the mouse in the ’80s. The new way of interacting with computers was met with scorn, but that was less about the concept than the possibilities and execution. At first the mouse’s precision was poor and software hadn’t yet figured out how to smartly utilize the new technology. Fast-forward nearly 40 years and it’s tough to imagine design software and video games without a mouse or touch pad.
Voice is much more intuitive than a mouse, but we’re still trying to find ways to make voice work.
“There’s always been a tendency to force the ‘old’ onto the ‘new’ when it comes to emerging technology platforms — the first ads on television, for example, were essentially radio ads, read out loud,” Will Hall, chief creative officer of Rain, a digital agency that specializes in voice, told Recode, regarding early attempts at voice advertising. “Eventually TV ads evolved into multi-sensory stories — images of a car driving down the highway, music blaring — and so will the voice experience.”
Until we find the app, use case or invention that is only possible with voice, we’re still just repurposing online content for our ears.