Broadband wired and wireless internet connections are getting faster all the time. But in many parts of the world high-speed internet access is unavailable and/or unreliable. So researchers continue to find ways to use compression to save bandwidth while transmitting audio, video, images, and other content that typically uses a lot of data.
A few years ago, developers behind the open source Opus audio codec announced a breakthrough that allowed for high-quality audio to be delivered using as little as 6 kilobits per second. Now Google has unveiled a new codec called Lyra that can sound as good or better at just 3 kbps.
That could enable high-quality audio communications on some of the slowest networks. The only catch? Lyra is specifically optimized for speech, so it probably won’t be much use for music streaming or other services.
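To put those bit rates in perspective, here is a quick back-of-the-envelope calculation of how much audio data one minute of speech amounts to at 3 kbps and 6 kbps. It's only a rough sketch: the helper name `bytes_per_minute` is made up for illustration, and the math counts the encoded audio alone, ignoring any network packet overhead.

```python
def bytes_per_minute(kbps: float) -> float:
    """Convert a bit rate in kilobits per second to bytes per minute."""
    # Bit rates use decimal units: 1 kbps = 1,000 bits per second.
    bits_per_second = kbps * 1000
    # 8 bits per byte, 60 seconds per minute.
    return bits_per_second / 8 * 60


# Bit rates taken from the article: Lyra at 3 kbps, Opus at 6 kbps.
for name, kbps in [("Lyra", 3), ("Opus", 6)]:
    kilobytes = bytes_per_minute(kbps) / 1000
    print(f"{name} at {kbps} kbps: {kilobytes:.1f} kB per minute")
```

By that math, a minute of speech at 3 kbps comes to about 22.5 kB of audio data, half of what the same minute takes at 6 kbps.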
Google developed the Lyra speech codec using “traditional codec techniques while leveraging advances in machine learning (ML) with models trained on thousands of hours of data.”
In other words, Google trained its artificial intelligence system to know what it sounds like when people are speaking by having the system examine thousands of hours’ worth of recordings of people talking. The model was trained using speakers of more than 70 different languages, so the results shouldn’t be limited to speech in English or any other specific language.
Researchers say the Lyra codec recreates speech signals using a generative model that can generate multiple signals at different frequency ranges at the same time and combine them into a single output signal. Google says that approach is efficient enough to allow the model to run either on a cloud server or on a local device, including mid-range smartphones.
So how well does it work? Pretty well.
Google made some samples available to demonstrate how Lyra sounds compared with an original source recording and two other codecs: Opus running at 6 kbps and Speex at 3 kbps. You can hear the samples below.
According to Google, most listeners in a crowdsourced test judged the Lyra samples to sound the most like the original source recording, and I’d tend to agree. It’s likely that the sound quality would be even better at higher bit rates.
Google says it’s already beginning to roll out Lyra for use in its own Duo voice and video chat software. It’s unclear if or when Lyra will be available for use in third-party apps.
via CNX Software