Improving Robust Speech Recognition for German Oral History Interviews Using Multi-Condition Training
Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10.10.2018 - 12.10.2018 in Oldenburg, Germany
Proceedings: ITG-Fb. 282: Speech Communication
Pages: 5
Language: English
Type: PDF
Gref, Michael (Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin, Germany & Institute for Pattern Recognition (iPattern), Niederrhein University of Applied Sciences, Krefeld, Germany)
Schmidt, Christoph; Koehler, Joachim (Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin, Germany)
In the historical sciences, the term oral history refers to conducting and analyzing interviews with contemporary witnesses. To significantly reduce the resources needed to transcribe these interviews, we work on the adaptation of our speech recognition system to oral history interviews. In this work, we build on our previous experiments by using 1000 hours of training data from the broadcast domain. Utilizing the Kaldi ASR toolkit, we show that advanced chain acoustic models greatly benefit from large data sets and achieve remarkable performance on several test sets. To further improve the speech recognition performance on oral history interviews, we apply artificially created multi-condition data to the chain model training and reduce the word error rate (WER) on the oral history test set, compared to a clean-trained chain model, by 4.8% absolute and 13.9% relative.
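
The abstract reports the WER improvement both as an absolute and as a relative reduction. The following minimal Python sketch illustrates how the two figures relate. The abstract does not state the underlying WER values, so the baseline computed here is only back-calculated from the two rounded figures and should be read as an approximation, not as a reported result. The function names are hypothetical and chosen for illustration.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction in percent, given two WERs in percent."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


def implied_baseline(absolute_reduction: float, relative_reduction_pct: float) -> float:
    """Baseline WER (percent) implied by an absolute reduction (percentage
    points) and the matching relative reduction (percent)."""
    return 100.0 * absolute_reduction / relative_reduction_pct


# Figures from the abstract: 4.8% absolute, 13.9% relative.
# Both are rounded, so the derived values are approximate.
baseline = implied_baseline(4.8, 13.9)  # assumed estimate, not a reported WER
improved = baseline - 4.8               # estimate for the multi-condition model

print(f"implied baseline WER: {baseline:.1f}%")
print(f"implied improved WER: {improved:.1f}%")
print(f"relative reduction:   {relative_reduction(baseline, improved):.1f}%")
```

Under these assumptions, the clean-trained chain model would have a WER of roughly 34 to 35% on the oral history test set; the exact values are given only in the full paper.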