Face of a Robot, Voice of an Angel?

13.09.2016.

Robotic speech can still get much, much better. Enter Google’s machine learning division, DeepMind, which has just announced a new text-to-speech technology called WaveNet. Unlike many previous approaches, the system uses convolutional neural networks to create an entire raw audio waveform from the ground up. Trained using real audio recordings, when provided with text it works through a sentence stepwise, generating short utterances which are combined to create the final waveform. DeepMind claims that the results sound more natural than the best existing systems, but you can judge for yourself by listening to some examples. There’s one hitch: the technique, which needs to produce perhaps 16,000 data points for every second of audio it creates, is incredibly computationally expensive. According to the Financial Times, that means it won’t, for now, be used within any of Google’s products.

Experten- und Marktplattformen
Cloud Computing

Technologie-Basis zur Digitalisierung

mehr
Sicherheit und Datenschutz

Vertrauen zur Digitalisierung

mehr
Anwendungen

wichtige Schritte zur Digitalisierung

mehr
Digitale Transformation

Partner zur Digitalisierung

mehr
Energie

Grundlage zur Digitalisierung

mehr
Experten- und Marktplattformen
  • company
    Cloud Computing –

    Technologie-Basis zur Digitalisierung

    mehr
  • company
    Sicherheit und Datenschutz –

    Vertrauen zur Digitalisierung

    mehr
  • company
    Anwendungen –

    wichtige Schritte zur Digitalisierung

    mehr
  • company
    Digitale Transformation

    Partner zur Digitalisierung

    mehr
  • company
    Energie

    Grundlage zur Digitalisierung

    mehr
Blogs