Google DeepMind Artificial Intelligence Learns to Talk

(Photo: Getty Images / Miles Willis)

Google has developed technology that can mimic the sound of the human voice, a milestone for its DeepMind artificial intelligence (A.I.) project.

The breakthrough, dubbed WaveNet, is described as a deep neural network that generates raw audio waveforms to produce speech. WaveNet reportedly outperforms existing text-to-speech systems.

In research by scientists at the Britain-based DeepMind unit, the team notes that WaveNet cuts the gap between the best existing text-to-speech systems and human-level performance by about 50 percent, The Verge reported.

WaveNet is all the more interesting in that it can learn different speech and voice patterns, to the point of simulating breaths and mouth movements as well as emotions, accents, and inflections.

"A single WaveNet can get the characteristics of a number of different speakers with equal fidelity, and can manage to switch between them via conditioning on the speaker identity," according to Google.

Currently, WaveNet can generate speech in English and Mandarin Chinese, and it can also compose music on its own, such as classical piano pieces.

According to Bloomberg, the significance of the breakthrough for Google rests largely on the sheer quantity of data needed to reach this level of quality, given that most computer-generated text-to-speech systems rely on huge collections of recorded human speech stitched together piece by piece.

Google's approach, called raw audio modelling, addresses this difficulty by building on its earlier two-dimensional image-generation networks, PixelCNN and PixelRNN, and adapting the idea to sound.

The new one-dimensional system, WaveNet, must generate at least 16,000 audio samples per second, one at a time, which demands immense computing power, as WaveNet's creators noted in a blog post. The system also had to be trained to learn context in order to produce natural utterances.
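
To illustrate what a one-dimensional analogue of PixelCNN looks like, the sketch below stacks dilated causal 1-D convolutions: the output at time t sees only samples up to t, and doubling the dilation at each layer grows the context window exponentially. It is a toy PyTorch rendition under assumed layer names and sizes, not DeepMind's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalDilatedBlock(nn.Module):
    def __init__(self, channels, dilation):
        super().__init__()
        self.dilation = dilation
        # kernel_size=2 plus left-only padding keeps the convolution causal:
        # the output at time t never sees samples after t.
        self.conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)

    def forward(self, x):
        out = self.conv(F.pad(x, (self.dilation, 0)))  # pad on the left only
        return x + torch.tanh(out)                     # residual connection

class TinyWaveNet(nn.Module):
    def __init__(self, channels=32, n_classes=256):
        super().__init__()
        self.embed = nn.Conv1d(1, channels, kernel_size=1)
        # Dilations 1, 2, 4, ..., 128: the receptive field doubles per layer,
        # so long audio context is covered with few layers.
        self.blocks = nn.ModuleList(
            CausalDilatedBlock(channels, 2 ** i) for i in range(8))
        # Each output step is a 256-way classification over quantized amplitudes.
        self.head = nn.Conv1d(channels, n_classes, kernel_size=1)

    def forward(self, x):              # x: (batch, 1, time)
        h = self.embed(x)
        for block in self.blocks:
            h = block(h)
        return self.head(h)            # logits: (batch, 256, time)

model = TinyWaveNet()
second_of_audio = torch.randn(1, 1, 16000)  # 16,000 samples = one second
print(model(second_of_audio).shape)         # torch.Size([1, 256, 16000])
```

At generation time the model runs one sample at a time, feeding each prediction back in as input, which is why producing 16,000 samples for every second of speech is so computationally heavy.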

In total, the WaveNet algorithm was trained on about 44 hours of sample sounds and voices recorded by more than a hundred speakers.
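
The training objective behind that coaching is next-sample prediction: the waveform is quantized to 256 amplitude levels (8-bit mu-law companding in the WaveNet paper) and the network learns, via a cross-entropy loss, to predict each sample from the ones before it. Below is a hedged sketch with a stand-in one-layer model; the helper and training step are illustrative assumptions, not DeepMind's published training code.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def mu_law_quantize(audio, mu=255):
    """Compand a waveform in [-1, 1] down to 256 integer classes (8-bit mu-law)."""
    companded = torch.sign(audio) * torch.log1p(mu * audio.abs()) / math.log1p(mu)
    return ((companded + 1) / 2 * mu).long().clamp(0, mu)

model = nn.Conv1d(1, 256, kernel_size=1)     # stand-in for a real WaveNet
audio = torch.rand(4, 1, 16000) * 2 - 1      # a batch of one-second clips
targets = mu_law_quantize(audio.squeeze(1))  # (batch, time) class labels

logits = model(audio)                        # (batch, 256, time)
# Shift by one step so sample t is predicted from the samples before it.
loss = F.cross_entropy(logits[..., :-1], targets[..., 1:])
loss.backward()                              # gradients for an optimizer step
print(float(loss))
```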

At the moment, researchers see no immediate commercial use for WaveNet, in contrast to the DeepMind algorithm that minimizes energy consumption in Google's data centers.

However, as people become increasingly dependent on technology, more sophisticated and natural mechanisms are needed to ensure effective, seamless interaction between machines and humans. For this reason, WaveNet is being watched closely by tech companies, according to Bloomberg.

Here is a video of Google's Deep Mind Explained! - Self Learning A.I.: