On Iterative Exchange of Soft State Information in Two-Channel Automatic Speech Recognition

Conference: Sprachkommunikation - Beiträge zur 10. ITG-Fachtagung
26.09.2012 - 28.09.2012 in Braunschweig, Germany

Proceedings: Sprachkommunikation

Pages: 4
Language: English
Type: PDF

Scheler, David; Walz, Simon; Fingscheidt, Tim (Institute for Communications Technology, Technische Universität Braunschweig, 38106 Braunschweig, Germany)

The robustness of automatic speech recognition systems can be improved by exploiting further information sources such as additional acoustic channels or modalities. Since the arising problem of information fusion exhibits striking parallels to problems in digital communications, where the turbo principle [1] was a groundbreaking innovation, Shivappa et al. showed that a similar iterative scheme can be applied to multimodal speech recognition [2]. We provide new interpretations and propose significant modifications of their approach: First, we show that no modification of the forward-backward recognition algorithm is required; second, we dispense with their proposed heuristic model; third, we deliver our own interpretation and formulation of the extrinsic information passed between the recognizers. Our proposed method is successfully applied to a synthetic unimodal two-channel speech recognition task.
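
The iterative scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation, which the abstract does not spell out. It assumes two recognizers that share one hidden state space, a textbook scaled forward-backward pass, and a simple turbo-style definition of extrinsic information (the state posterior with the incoming prior divided out). All function names, the toy model, and the likelihood values are hypothetical. The forward-backward routine is left unchanged here; the information from the other channel enters only as a per-frame weight on the observation likelihoods.

```python
EPS = 1e-12  # floor to avoid division by zero (an assumption of this sketch)


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def forward_backward(init, trans, lik):
    """Textbook scaled forward-backward pass; returns per-frame state posteriors.

    init[s]: initial state probability, trans[i][j]: P(state j | state i),
    lik[t][s]: observation likelihood of frame t in state s (may be prior-weighted).
    """
    T, S = len(lik), len(init)
    alpha = [[0.0] * S for _ in range(T)]
    beta = [[1.0] * S for _ in range(T)]
    for t in range(T):
        for j in range(S):
            if t == 0:
                pred = init[j]
            else:
                pred = sum(alpha[t - 1][i] * trans[i][j] for i in range(S))
            alpha[t][j] = pred * lik[t][j]
        alpha[t] = normalize(alpha[t])  # per-frame scaling
    for t in range(T - 2, -1, -1):
        for i in range(S):
            beta[t][i] = sum(trans[i][j] * lik[t + 1][j] * beta[t + 1][j]
                             for j in range(S))
        beta[t] = normalize(beta[t])  # per-frame scaling
    return [normalize([a * b for a, b in zip(alpha[t], beta[t])])
            for t in range(T)]


def weighted(lik, prior):
    """Weight each frame's likelihoods by the prior received from the other channel."""
    return [[l * p for l, p in zip(lik_t, prior_t)]
            for lik_t, prior_t in zip(lik, prior)]


def extrinsic(post, prior):
    """Posterior with the incoming prior divided out (assumed definition)."""
    return [normalize([g / max(p, EPS) for g, p in zip(post_t, prior_t)])
            for post_t, prior_t in zip(post, prior)]


def turbo_exchange(init, trans, lik_a, lik_b, iterations=3):
    """Alternate between recognizers A and B, passing extrinsic information."""
    T, S = len(lik_a), len(init)
    prior_for_a = [[1.0 / S] * S for _ in range(T)]  # uninformative start
    post_a = post_b = None
    for _ in range(iterations):
        post_a = forward_backward(init, trans, weighted(lik_a, prior_for_a))
        prior_for_b = extrinsic(post_a, prior_for_a)
        post_b = forward_backward(init, trans, weighted(lik_b, prior_for_b))
        prior_for_a = extrinsic(post_b, prior_for_b)
    return post_a, post_b


# Toy two-state model with three frames per channel (made-up numbers).
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
lik_a = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]
lik_b = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]
post_a, post_b = turbo_exchange(init, trans, lik_a, lik_b, iterations=3)
```

In this sketch the extrinsic message still carries the sending recognizer's own transition-model contribution, so repeated iterations can count shared model knowledge more than once. How that is handled in practice depends on the exact formulation of the extrinsic information, which the sketch makes no attempt to reproduce.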