Diarization

Who Spoke When?

Submitted by Arif Haque on Wed, 06/03/2020 - 08:22

IMPROVING SPEAKER DIARIZATION

Many people have used automatic speech recognition (ASR) systems to transcribe audio to text, but a stream of audio carries plenty of other information that is useful to identify. One such task is diarization: working out who spoke when. Knowing this helps a range of downstream applications. In meeting summarization, knowing who said what means you can take accurate notes and allocate action items. In video subtitling, the speech from different speakers can be color-coded to better assist viewers who are hard of hearing. In a virtual assistant, background speech can be ignored to improve the assistant's performance.
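To make "who spoke when" concrete, here is a minimal sketch of what a diarization result might look like and how it could be combined with an ASR transcript. It is not a description of any particular product or library: the `Turn` and `Word` classes, the `attribute_words` helper, and the sample timings are all hypothetical names and values chosen for illustration. It assumes the diarizer outputs speaker-labeled time spans and the recognizer outputs word-level timestamps, and it assigns each word to the speaker whose turn overlaps it the most.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Turn:
    """One diarization segment: a speaker label and its time span in seconds."""
    speaker: str
    start: float
    end: float


@dataclass
class Word:
    """One recognized word and its time span in seconds."""
    text: str
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time spans (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_words(words: List[Word], turns: List[Turn]) -> List[Tuple[str, Optional[str]]]:
    """Pair each word with the speaker whose turn overlaps it most.

    Words that overlap no turn at all are labeled None.
    """
    labeled = []
    for word in words:
        best_speaker = None
        best_overlap = 0.0
        for turn in turns:
            shared = overlap(word.start, word.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((word.text, best_speaker))
    return labeled


# Hypothetical sample data: two speakers, three recognized words.
turns = [Turn("A", 0.0, 2.0), Turn("B", 2.0, 4.5)]
words = [Word("hello", 0.2, 0.6), Word("there", 0.7, 1.1), Word("hi", 2.3, 2.5)]
labeled = attribute_words(words, turns)
```

With speaker labels attached to words, applications such as color-coded subtitles or per-speaker meeting notes become a matter of grouping the labeled words by speaker.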
