Distant Speech Recognition: No Black Boxes Allowed

Conference: Sprachkommunikation 2008 - 8. ITG-Fachtagung
8 to 10 October 2008 in Aachen, Germany

Proceedings: Sprachkommunikation 2008

Pages: 11
Language: English
Type: PDF

Authors:
McDonough, John; Kumatani, Kenichi; Rauch, Barbara; Faubel, Friedrich; Klakow, Dietrich (Spoken Language Systems, Saarland University, Saarbrücken, Germany)
Wölfel, Matthias (Institut für Theoretische Informatik, Universität Karlsruhe (TH), Karlsruhe, Germany)

Abstract:
A complete system for distant speech recognition (DSR) typically consists of several distinct components. While it is tempting to isolate and optimize each component individually, experience has proven that such an approach cannot lead to optimal performance. In this talk, we will discuss several examples of the interactions between the individual components of a DSR system. In addition, we will describe the synergies that become possible as soon as each component is no longer treated as a “black box”. To wit, instead of treating each component as having solely an input and an output, it is necessary to peel back the lid and look inside. It is only then that it becomes apparent how the individual components of a DSR system can be viewed not as separate entities, but as the various organs of a complete body, and how optimal performance of such a system can be obtained.