Automatic speech recognition (ASR) and other natural speech and language processing techniques have become ubiquitous in the technologies that surround us in today’s world. From my cell phone to my dashcam to my nightstand, I always have some form of digital assistant nearby, which I can summon with the sound of my voice.
#1 - You're Paying Them To Promote Their Brand
The rise of Alexa and Google Speech as a service has been swift. Many leading brands have signed up to use these off-the-shelf speech systems, building voice skills for their customers or integrating them into their products as the only voice interaction method. Just like "Intel Inside" of years past, we now have "Works With <Alexa/Google Assistant>." Unlike that sticker, however, this is not an easily removed label that simply tells your customers what type of microchip is inside their computer.
Imagine you work for a late-night comedy show and want to put together a montage of news anchors saying the word "covfefe". You could employ an army of interns to listen to hundreds of hours of recorded broadcasts, or you could use Cobalt's Telefol engine to search. Technology to the rescue!
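Under the hood, this kind of audio search typically works by first transcribing the recordings with time alignments, then searching the resulting text. A minimal sketch of the idea in Python (the data format and the `find_word` helper are illustrative assumptions, not Telefol's actual API):

```python
# Illustrative sketch of word search over transcribed audio.
# The (word, start, end) transcript format and find_word() are
# hypothetical, standing in for real time-aligned ASR output.

def find_word(transcript, target):
    """Return (start, end) times, in seconds, of each hit for `target`.

    `transcript` is a list of (word, start, end) tuples, the kind of
    time-aligned output an ASR engine can produce.
    """
    target = target.lower()
    return [(start, end) for word, start, end in transcript
            if word.lower() == target]

# A time-aligned transcript of one imaginary broadcast clip:
clip = [("despite", 0.0, 0.4), ("the", 0.4, 0.5),
        ("constant", 0.5, 1.0), ("negative", 1.0, 1.5),
        ("press", 1.5, 1.9), ("covfefe", 1.9, 2.6)]

print(find_word(clip, "Covfefe"))  # → [(1.9, 2.6)]
```

With timestamps in hand, an editor can jump straight to each hit in the original audio instead of scrubbing through hundreds of hours by ear.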
Automatic speech recognition (ASR) is a key component of a virtual assistant: it converts audio into text. As well as being crucial for conversational AI, ASR is useful as a standalone technology in areas such as automated subtitling, call centre transcription and analytics, meeting transcription, and more. This post takes a deeper look at what's under the hood of Cobalt's Cubic speech recognition technology.
USING OPTICAL SENSING TO IMPROVE VOICE TECHNOLOGY IN NOISY ENVIRONMENTS
Automatic speech recognition (ASR) has improved markedly in recent years, due to advances in data collection methods, cheaper computation, and innovations in the underlying machine learning algorithms. Still, noisy environments pose a particular problem for voice technology.
USING VOICE TECHNOLOGY TO HELP STUDENTS LEARN FOREIGN LANGUAGES
Whether it’s to improve their employment prospects, to make traveling easier, or simply to exercise their brain, many people are learning foreign languages.
The speech processing research community is a dynamic place these days. The commercial prominence of popular speech interfaces has given an already-thriving research community a transformative jolt of attention. That energy was clearly on display at the annual Interspeech conference, the 2019 edition of which was held in Graz, Austria from September 15-19. Cobalt had five employees in attendance this year -- Jeff Adams, Ryan Lish, Stan Salvador, Rasmus Dall, and myself -- and we found it
Many of our readers will know that Cobalt is a virtual company; all our employees work from home. People often ask us how we manage to maintain such great team cohesion and loyalty when we don’t see each other daily. There are many answers to that question, but one of them is certainly our tradition of CoWs.
Virtual assistants allow us to interact with technology by voice. They are built on a complex pipeline of AI technology that understands the breadth and complexity of spoken language. This pipeline includes automatic speech recognition, natural language understanding, dialogue management and text-to-speech components. The technology in the pipeline is based on machine learning - a subset of AI algorithms that learn their behaviour from data instead of being explicitly programmed.
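The flow of that pipeline can be sketched in a few lines of Python. Every function here is a simplified stand-in for a full machine-learned component (the function names, the toy intent logic, and the example utterance are all illustrative assumptions, not any vendor's actual implementation):

```python
# Toy sketch of the virtual-assistant pipeline:
# ASR -> NLU -> dialogue management -> TTS.
# Each stage is a hand-written stub standing in for a learned model.

def speech_to_text(audio: bytes) -> str:
    """ASR: convert audio into a text transcript (stubbed here)."""
    return "what is the weather in Boston"

def understand(transcript: str) -> dict:
    """NLU: extract the user's intent and entities from the transcript."""
    intent = "get_weather" if "weather" in transcript else "unknown"
    city = transcript.rsplit("in ", 1)[-1].title() if " in " in transcript else None
    return {"intent": intent, "city": city}

def decide(state: dict) -> str:
    """Dialogue management: choose the assistant's next response."""
    if state["intent"] == "get_weather" and state["city"]:
        return f"Here is the weather for {state['city']}."
    return "Sorry, I didn't catch that."

def text_to_speech(response: str) -> bytes:
    """TTS: synthesize the response text back into audio (stubbed)."""
    return response.encode("utf-8")

def assistant(audio: bytes) -> bytes:
    """Run one user turn through the whole pipeline."""
    transcript = speech_to_text(audio)
    state = understand(transcript)
    response = decide(state)
    return text_to_speech(response)

print(assistant(b"<audio>"))  # → b'Here is the weather for Boston.'
```

In a real assistant, each of these stubs is replaced by a model trained on data -- which is precisely what distinguishes this machine-learning pipeline from explicitly programmed software.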
“Hey Computer, tell me the latest”
With the rise of virtual assistants like Amazon’s Alexa, Apple’s Siri and Google’s assistant, we’re all beginning to get used to talking to our devices. In contrast to computers that have a keyboard and mouse, or tablets and phones with a touchscreen, virtual assistants let us interact using natural spoken language. Voice interfaces drastically simplify our interaction with technology.
To fulfil requests, virtual assistants are built on a complex pipeline of AI technology: