CIOAdvisor Apac

Voice Interaction: Make Life More Practical and Interesting

Xiongguo Lei, Vice President, AISpeech

With the boom in artificial intelligence, intelligent voice technology, the most natural method of interaction, has come into focus. Many companies, such as Nuance, Amazon, and Microsoft, are devoted to developing voice technology.

Among the various speech technologies, automatic speech recognition (ASR) has always attracted the most attention. I believe that ASR accuracy may be improved by new techniques in acoustic modeling and signal processing. For example, more advanced microphone array techniques can significantly reduce noise and side talk, and thus improve recognition accuracy under those conditions. We may also generate or collect more training data for far-field microphones, and thus improve performance when similar microphones are used.

The microphone array has become a necessary tool in speech interaction. A typical example is the Amazon Echo, which employs a circular array. AISpeech released a “7-Microphone Circular Array” solution in December 2015, which has been widely applied in the robotics and smart home fields. In the module, six microphones form a ring, with one microphone in the center for sound pickup. It supports far-field voice recognition, with accuracy above 92% within 5 meters. It covers 360 degrees, with a margin of error of ±10 degrees. Through denoising algorithms and speech enhancement, it can identify environmental noise and improve recognition accuracy. This solution suits smart home devices and robots that need to pick up sound without blind spots, such as speakers. It has already been applied to many kinds of robots in China, including commercial robots, domestic robots and robot assistants. This technique forms a solid foundation for voice interaction.
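
To make the geometry concrete, here is a minimal sketch of a six-plus-one circular array and of how the direction of a far-field sound source could, in principle, be recovered from arrival-time differences. It is an illustration only, not AISpeech's implementation. The 4 cm ring radius, the function names, and the simple grid search are all assumptions. A real system would estimate the delays from noisy audio (for example by cross-correlation) before any such search.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def circular_array(radius_m=0.04, ring_mics=6):
    """Return (x, y) positions: one center microphone plus an evenly spaced ring.

    The 4 cm radius is an assumed value, not a figure from the article.
    """
    positions = [(0.0, 0.0)]  # center microphone
    for k in range(ring_mics):
        angle = 2.0 * math.pi * k / ring_mics
        positions.append((radius_m * math.cos(angle), radius_m * math.sin(angle)))
    return positions


def arrival_delays(positions, azimuth_deg):
    """Plane-wave arrival delay at each microphone, in seconds, relative to the center."""
    theta = math.radians(azimuth_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector pointing toward the source
    # A microphone closer to the source hears the wavefront earlier (negative delay).
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in positions]


def estimate_azimuth(positions, measured_delays, step_deg=1):
    """Grid search for the azimuth whose predicted delays best match the measured ones."""
    best_angle, best_error = 0, float("inf")
    for angle in range(0, 360, step_deg):
        predicted = arrival_delays(positions, angle)
        error = sum((p - m) ** 2 for p, m in zip(predicted, measured_delays))
        if error < best_error:
            best_angle, best_error = angle, error
    return best_angle


mics = circular_array()
delays = arrival_delays(mics, azimuth_deg=137)  # simulated, noise-free measurement
print(estimate_azimuth(mics, delays))  # prints 137
```

Because the ring is symmetric, the same search works for a source at any angle around the device, which is one way to read the 360-degree coverage described above. With real, noisy delay estimates the recovered angle would only be approximate.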

Besides ASR, a voice solution also has to strengthen its dialogue interaction ability. When we talk about artificial intelligence interaction, what we are actually talking about is the back-end information resources behind the interaction; voice, vision, and action are only the methods of interaction. Providing the resources needed to satisfy users’ needs is therefore the key to interaction. For this reason, the “7-Microphone Circular Array” solution integrates a large set of back-end resources, such as Gaode Map, UC Browser, Xiami Music, Kuwo Music, Himalaya FM, Kaola FM, WeChat, and DianPing O2O, to fulfill users’ needs for social networking, shopping, entertainment, information and so on. Later, based on users’ data, more resources will be integrated and more functions developed.

Although voice technology is already widely used, it still has a long way to go. Many capabilities are so far out of reach of current deep learning technology and require “looking outside” into other fields, such as cognitive science, computational linguistics, and neuroscience, to assist the development of voice technology. With the widespread use of voice technology, human-machine interaction will become more practical and interesting in our daily lives.
