Natural vs. Synthesized Speech in Spoken Dialog Systems Research – Comparing the Performance of Recognition Results

Conference: Sprachkommunikation - Beiträge zur 10. ITG-Fachtagung
26-28 September 2012, Braunschweig, Germany

Proceedings: Sprachkommunikation

Pages: 4
Language: English
Type: PDF


Authors:
Scheffler, Tatjana; Roller, Roland; Reithinger, Norbert (DFKI GmbH, Projektbüro Berlin, Alt-Moabit 91c, 10559 Berlin, Germany)
Kretzschmar, Florian; Möller, Sebastian (Deutsche Telekom Laboratories, TU Berlin, Ernst-Reuter-Platz 7, 10587 Berlin, Germany)

Abstract:
In this paper, we test the effect of using speech synthesis when interacting with a spoken dialog system (SDS). We use a user simulation to connect our speech synthesis to a real, state-of-the-art automatic speech recognition (ASR) component deployed in a working commercial SDS via a standard telephone line. In a series of experiments, we compare human-machine dialogs and their recognition scores with simulated dialogs using synthesis. Our results show that a good text-to-speech synthesis configuration rivals human speech in both recognition scores and variability. This makes the speech interface in user simulation quite attractive.
