Development of Large Vocabulary ASR Systems for Mandarin and Arabic

Conference: Sprachkommunikation 2008 - 8. ITG-Fachtagung
October 8-10, 2008, Aachen, Germany

Proceedings: Sprachkommunikation 2008

Pages: 4
Language: English
Type: PDF

Schlüter, R.; Gollan, Ch.; Hahn, S.; Heigold, G.; Hoffmeister, B.; Lööf, J.; Plahl, Ch.; Rybach, D.; Ney, H. (Lehrstuhl für Informatik 6, RWTH Aachen University, 52056 Aachen)

In this work, we describe our automatic speech recognition (ASR) systems for Mandarin and Arabic. We cover the general modeling, training, and recognition architecture as well as language-specific aspects, and we analyze several design aspects of the systems, including multiple system design and combination. For Arabic, we summarize the semi-automatic lexicon generation, which uses a statistical approach to grapheme-to-phoneme conversion and pronunciation statistics. For Mandarin, we describe different methods for integrating tone information. We present systematic recognition results on recent evaluation corpora of the Global Autonomous Language Exploitation (GALE) project.