Finding Your Voice


Two weeks ago, it was the bright lights of Las Vegas at CES (the Consumer Electronics Show), with rides, giveaways and manufacturers of everything from refrigerators to toilets to cars touting their smart speaker integrations.

One week later, I was far from the glitz of Vegas in Chattanooga, Tennessee, at a conference of several hundred Amazon Echo developers working to figure out what people will respond to and how they will interact with smart speakers.

It is by no means certain that people are interested in a whole lot more than weather forecasts and music from their devices, but this group is aggressively going after it.

Some things observed:

Innovation abounds - Smart speakers are a nascent category exploding with experimentation. We saw the gamut, from interactive book adaptations aimed at children (who index as early users of smart speakers) to navigational skills, voice-based surveys and much more.

Brush your teeth, a lot, and listen to podcasts - Gimlet Media won the “Alexa Skill of the Year” for “Chompers,” a two-minute daily feature that accompanies kids as they brush their teeth. Wilson Standish from Gimlet was on hand, and also joined me and a few others for a session on smart speakers and podcasts. The pairing certainly has its early-inning challenges, but the opportunity is clear.

To your health - Health is a big voice category, with plenty of developers on hand. Notable is the Mayo Clinic, which has spent several years developing an impressive first-aid skill. Dr. Sandhya Pruthi spoke to the group about the initiative, which takes the valuable resources of the clinic's website and reinterprets them for voice with shorter, more directed answers to first-aid queries. There is little time to waste when someone is choking or there has been an odd reaction to a bee sting. Many older patients also feel more comfortable with voice than with typing. Look for more innovation in this sector.

Discovery is on everyone’s mind - Similar to the challenge facing many podcasters, finding an audience/users for a skill/show can be elusive with 70,000 active skills worldwide and a still confusing “enable” process. Megaphones matter.


Interactive voice games - Among the highlights of the conference was the appearance of the inventor of Atari and Chuck E. Cheese, Nolan Bushnell. He regaled the crowd in his Alexa “triggered” light-up sneakers during a keynote announcing his new venture, X2games, which features interactive voice games on the Alexa platform. St. Noire, created with his co-founder and Hollywood creative director Zai Ortiz, is their first game to market. It is a murder mystery with various clues and outcomes.

Some skills have daunting menus - I’m thinking about those old endless phone-tree menus (and not so old … I’m looking at you, every airline…). There are many skills that offer too many selections and options and are difficult to follow. Many (most?) end up in the “I tried it once, but never again” bin. The business faces a great deal of skill abandonment and this is certainly a part of that.


Complex multi-part requests - The mission for many is to go beyond simple data requests, such as the forecast, to more conversational engagement with voice devices. SoundHound VP/GM Katie McMahon and others talked about the advances made toward multi-part requests such as asking for “Italian restaurants with a four-star rating but excluding pizza shops.” However, the devices don’t yet know that Tom Brady is playing in the Super Bowl (I asked), so we are not there just yet.

Searching for audio has been elusive - Israeli company Audioburst listens to and tracks audio from hundreds of radio stations and podcasts and serves up “relevant” audio curated by the user. Try their skill “newsfeed.” I had some trouble getting things going. I didn’t ask for the weather, but it opened to a forecast and then a traffic report from a New York radio station from a few minutes earlier. I see the vast potential, but so far it surfaces superfluous content.

The data is talking - Voicebot.ai chief Bret Kinsella kicked off the conference with a series of useful data points illustrating how the hierarchy of voice use differs by environment. On smart speakers, for example, people ask questions, stream music and check the weather. On smartphones, they ask questions, seek directions or call someone. Voice systems in the car are dominated by calling, asking for directions and sending texts.

Congratulations to Bradley Metrock, who dreamed up the conference and put together a varied agenda and an attendee list ranging from big companies to small entrepreneurs. It was a pleasure to be there and meet so many people focused on building the “voice first” future.

Steven Goldstein