Improving Robust Speech Recognition for German Oral History Interviews Using Multi-Condition Training

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10-12 October 2018, Oldenburg, Germany

Proceedings: ITG-Fb. 282: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Gref, Michael (Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin, Germany & Institute for Pattern Recognition (iPattern), Niederrhein University of Applied Sciences, Krefeld, Germany)
Schmidt, Christoph; Koehler, Joachim (Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin, Germany)

Abstract:
In the historical sciences, the term oral history refers to conducting and analyzing interviews with contemporary witnesses. To significantly reduce the resources needed to transcribe these interviews, we are adapting our speech recognition system to oral history interviews. In this work, we build on our previous experiments by using 1,000 hours of training data from the broadcast domain. Using the Kaldi ASR toolkit, we show that advanced chain acoustic models benefit greatly from large data sets and achieve remarkable performance on several test sets. To further improve speech recognition performance on oral history interviews, we add artificially created multi-condition data to the chain model training. Compared to a chain model trained on clean data only, this reduces the word error rate (WER) on the oral history test set by 4.8% absolute and 13.9% relative.
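The two reduction figures in the abstract are linked by simple arithmetic: the absolute reduction is the difference between the two WERs in percentage points, and the relative reduction is that difference divided by the baseline WER. The sketch below illustrates this relation. The function names are hypothetical, and the baseline it derives is only implied by the two rounded figures above (roughly 34 to 35%); it is not a value reported in the abstract.

```python
def wer_reduction(baseline_wer, improved_wer):
    """Return (absolute, relative) WER reduction.

    Both inputs are WERs in percent. The absolute reduction is in
    percentage points; the relative reduction is in percent of the baseline.
    """
    absolute = baseline_wer - improved_wer
    relative = 100.0 * absolute / baseline_wer
    return absolute, relative


def implied_baseline(absolute, relative_percent):
    """Invert the relation: baseline WER implied by the two reductions."""
    return 100.0 * absolute / relative_percent


# Figures quoted in the abstract (both rounded to one decimal place,
# so the derived values are approximate, not reported results).
absolute_reduction = 4.8
relative_reduction = 13.9

baseline = implied_baseline(absolute_reduction, relative_reduction)
improved = baseline - absolute_reduction

print(f"implied clean-trained WER:    {baseline:.1f}%")
print(f"implied multi-condition WER:  {improved:.1f}%")
```

Because both published figures are rounded, the implied WERs carry an uncertainty of several tenths of a percentage point; the exact values are given in the paper itself.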