The accuracy of automatic speech recognition has made significant gains in the last few years thanks to the advent of deep neural networks. But there’s one area that has thwarted researchers: telling multiple speakers apart. Now a startup called Chorus says it has made a breakthrough in the matter through a technique it calls “voice fingerprinting.”
Speech recognition and computer vision are arguably the two computational challenges that have benefited the most from deep learning. Armed with huge training sets – including vast troves of photographs and digital recordings of voices – convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have given computers sensory perception that can almost rival humans’ senses. Despite those successes, we still run into edge cases where deep learning has not delivered breakthroughs in cognitive applications.
Author: Alex Woodie