Evaluation of Information Structure in Speech Synthesis: The Case of Product Recommender Systems

Conference: Sprachkommunikation - Beiträge zur 10. ITG-Fachtagung
26.09.2012 - 28.09.2012 in Braunschweig, Germany

Proceedings: Sprachkommunikation

Pages: 4; Language: English; Type: PDF

Kügler, Frank; Smolibocki, Bernadett; Stede, Manfred (Dept. of Linguistics/EB Cognitive Science & SFB 632 “Information structure”, University of Potsdam, Karl-Liebknecht-Straße 24-25, 14476 Potsdam, Germany)

Speech synthesis is now of acceptable quality for many purposes. Nonetheless, there are applications in which contextual and other pragmatic factors play an important role that straightforward text-to-speech (TTS) systems cannot account for. This is the case for systems that give product comparisons and recommendations: for instance, contrasting entities must be signaled by an appropriate intonation, and in longer discourse given and new entities need to be distinguished prosodically. That is, the linguistic notion of information structure (IS) should be considered in the synthesis. In our project, we are extending an existing text generator for product comparison/recommendation with a speech synthesis component, and we aim to integrate information structure in a systematic way. Our paper describes the architecture of our system (as currently being built) and the results of two perception experiments that we conducted to verify that listeners do indeed perceive the difference between "standard" TTS and IS-enriched synthesis. The results show that listeners benefit from the IS-enriched synthesis.
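
The abstract does not say how information-structure labels are passed to the synthesizer. As a purely illustrative sketch, the following Python snippet assumes that a text generator emits phrases labeled as given, new, or contrast, and maps those labels onto the `<emphasis>` element of SSML. The label names, the mapping in `EMPHASIS_BY_STATUS`, the function `to_ssml`, and the example sentence are all hypothetical and are not taken from the paper.

```python
from xml.sax.saxutils import escape

# Hypothetical mapping from information-structure status to SSML emphasis
# levels; the paper's actual prosodic realization may differ.
EMPHASIS_BY_STATUS = {
    "contrast": "strong",   # contrasted entities get the most prominence
    "new": "moderate",      # newly introduced entities are accented
    "given": "reduced",     # already mentioned entities are deaccented
}


def to_ssml(phrases):
    """Render (text, status) pairs as an SSML string.

    A status of None means the phrase carries no information-structure
    label and is passed through unmarked.
    """
    parts = []
    for text, status in phrases:
        safe = escape(text)  # protect &, <, > inside the XML output
        level = EMPHASIS_BY_STATUS.get(status)
        if level is None:
            parts.append(safe)
        else:
            parts.append(f'<emphasis level="{level}">{safe}</emphasis>')
    return "<speak>" + " ".join(parts) + "</speak>"


# Invented example sentence from a product comparison.
sentence = [
    ("The", None),
    ("first camera", "given"),
    ("has a", None),
    ("larger", "contrast"),
    ("display.", None),
]
ssml = to_ssml(sentence)
print(ssml)
```

Whether a given TTS engine honors these emphasis levels, and how they are realized acoustically, depends on the engine; a system like the one described might instead control pitch accents directly.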