Offline speech recognition system developed that runs with 97% accuracy

As a rule, speech recognition systems, translators and similar services rely on enormous server capacity. To make them available to everyone, developers send all data over the Internet, which makes offline use impossible. Modern neural network algorithms, however, are producing impressive results. Not long ago, Microsoft and Google made their neural-network-based translators completely independent of a network connection, and now it is the turn of voice recognition algorithms.

The development is the work of a team of researchers from the University of Waterloo and a startup called DarwinAI. Their technology is called EdgeSpeechNets.

“In this study, we use a strategy for creating an architecture that places a low load on the device while keeping all the advantages of an approach based on powerful neural networks with deep machine learning.”

To begin with, the researchers created a prototype of the future system, which performed speech recognition but had a limited vocabulary. It was nevertheless able to identify the keywords it knew, even in a very fast flow of speech. The resulting data were then used to convert the audio signal into a mathematical formula. This formula was subsequently used to design a neural network that would deliver high performance without being demanding on hardware.

After this, the researchers decided to test the resulting program. For this purpose they used the Google Speech Commands dataset, which contains 65,000 one-second audio samples. In the end, one version of the system, EdgeSpeechNet-D, showed excellent results, reaching an accuracy of 97% on a low-end Motorola Moto E smartphone with a 1.4 GHz processor.
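To make the reported figure concrete: accuracy here can be read as the share of test clips whose predicted keyword matches the labelled one. The sketch below is a minimal illustration under that assumption. The `accuracy` helper and the toy labels are hypothetical; they are not taken from the researchers' code or from the Speech Commands dataset.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Return the fraction of clips whose predicted keyword equals the label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    # Count the positions where the prediction matches the label.
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Toy, made-up labels for illustration only (not real dataset results).
expected = ["yes", "no", "up", "down"]
predicted = ["yes", "no", "up", "left"]
print(f"accuracy: {accuracy(predicted, expected):.0%}")
```

On a real benchmark the two lists would hold the model's outputs and the dataset labels for every test clip; the calculation itself stays the same.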

“EdgeSpeechNets have higher recognition accuracy at a much lower computational cost. The results demonstrate that EdgeSpeechNets were able to achieve state-of-the-art performance while requiring significantly less computational power, which makes them very well suited to mobile devices and apps.”
