The Conversational Interaction (CI) Conference was held Feb. 5-6 in San Jose, CA. The conference is organized by AVIOS and has been held annually in various forms (Voice Search, then Mobile Voice, then CI) since 2008. An overarching theme of the conference is the ways in which people interact with digital systems using human speech, and how those applications are rapidly proliferating. The conference explores options for creating language interfaces, as well as some of the unique challenges involved in designing such interfaces. In recent years, the conference has increasingly focused on current developments in natural language processing (NLP) technologies and in Artificial Intelligence (AI), especially as chatbots, virtual agents, and digital assistants become more commonplace methods of interacting with customers and users.
At this year’s conference, some of the most interesting and informative talks were the keynotes and panels. Of particular note was the opening keynote, given by Xuedong Huang, the esteemed Technical Fellow from Microsoft, who provided a “state of the union” assessment of the field and addressed the question, “What are the limits of speech recognition accuracy?” Automatic speech recognition (ASR) technology and machine learning have matured significantly in recent years, and we’re seeing widespread adoption of speech-based personal assistants (Alexa, Cortana, OK Google, Siri), but there are still many challenges to overcome. Active areas of research include accented speech (i.e., speech from users speaking a language that is not their native one), far-field speech, so-called cocktail party speech (where multiple people are speaking at the same time and/or in a noisy environment), and contextual understanding (e.g., where the speech itself is correctly recognized, but the correct meaning depends on contextual factors).
A significant number of presentations dealt with chatbots and virtual/digital assistants. Topics of interest included considerations for designing and developing them, how natural they should be made to appear (if they’re not natural enough, users get frustrated and may doubt the quality of their information; if they sound too natural, users may think they are interacting with a person rather than a chatbot, and may have unrealistic expectations of what it can do), and how to monetize them. Other interesting presentations addressed the challenges involved in creating conversational systems for difficult environments, for example noisy settings or situations where the speaker is at a distance from the microphone.
I attended the conference and gave a presentation titled “Retail’s Impending Demise: are the rumors greatly exaggerated? How Theatro is using speech technology to transform brick-and-mortar retail.” The presentation noted the various challenges faced by brick-and-mortar retail and some of the ways in which retailers are responding, and went on to identify the many ways in which Theatro is transforming store communication and operations by enabling employees to be more efficient, allowing stores to do more with less. Of particular interest to attendees were Theatro’s facilitation of targeted person-to-person communication, our enabling quick and easy access to product information by any associate on the sales floor, and our various location-aware apps. Using Theatro, retailers have realized the following benefits:
- Increase Sales: A 7% lift in loyalty registration for a leading home goods retailer, an incremental gain of $87M annually.
- Drive Productivity: Reduced ear chatter by 83%. Employees saved 12% of their time through improved communication.
- Elevate Associates: 91% of associates agree that Theatro helps them serve customers better.
- Improve Service: Associate response time improved by 77%. Faster response at the register leads to less abandonment and happier customers.
All in all, the conference highlighted the many ways in which speech technology is rapidly advancing, and provided yet another validation that Theatro offers a unique solution for the retail space.