Cobalt builds innovative voice technology solutions

Voice technology is changing the way we interact with devices. Millions of people worldwide are becoming accustomed to talking to their computers (think about Alexa and Siri), and businesses are quickly taking advantage of the opportunity. 

At Cobalt, we work with you to bring voice and language technology to your products and services. We are there for all stages of your project, from idea generation through to execution and optimization, working together to achieve your specific business goals. We’ve worked alongside CEOs, executives, and senior leadership teams to define goals, roadmaps, and strategy, and we’ve also worked directly with engineers and data scientists to deliver custom technology. You are experts in your domain, and we bring knowledge of AI, voice, and language technology.

Cobalt has built a wide variety of voice and language systems across industries from agriculture to manufacturing to medicine. In doing so, we’ve optimized for challenges such as noisy environments, highly specialized vocabulary, and children’s speech. Because we’ve built technology for all the components of these systems - automatic speech recognition, natural language understanding, text-to-speech, diarization, and more - we have expertise across the entire system and understand how the individual components fit together effectively.

Our deep expertise and world-class team mean that we can tackle any problem. Together, we have decades of experience building, deploying, and maintaining real-world AI systems.

Our competencies include:

  • Speech to text
  • Natural language processing
  • Voice biometrics
  • Conversational AI
  • Audio diarization
  • Text to speech
  • ...anything voice- and language-related

How We Work


Our Latest Posts

Jun 15, 2022
By Rasmus Dall

We’ve previously written about one of our core technologies at Cobalt Speech & Language - automatic speech recognition (ASR). When you speak, the ASR system converts your spoken words into text. Another core technology at Cobalt is text-to-speech (TTS), or speech synthesis, which converts written words into spoken audio.

Jun 3, 2020
By Arif Haque


Many people have used automatic speech recognition systems to transcribe audio to text, but there are a host of other items that it’s useful to identify from a stream of audio. One task in particular is called diarization - who spoke when? Knowing this information can help with a range of downstream applications. For example, in meeting summarization, knowing who said something means you can accurately make notes and allocate action items.