Improving Robust Speech Recognition for German Oral History Interviews Using Multi-Condition Training

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10/10/2018 - 10/12/2018 at Oldenburg, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Gref, Michael (Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin, Germany & Institute for Pattern Recognition (iPattern), Niederrhein University of Applied Sciences, Krefeld, Germany)
Schmidt, Christoph; Koehler, Joachim (Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin, Germany)

Abstract:
In the historical sciences, the term oral history refers to conducting and analyzing interviews with contemporary witnesses. To significantly reduce the resources needed to transcribe these interviews, we are adapting our speech recognition system to oral history interviews. In this work, we build on our previous experiments by using 1000 hours of training data from the broadcast domain. Utilizing the Kaldi ASR toolkit, we show that advanced chain acoustic models greatly benefit from large data sets and achieve remarkable performance on several test sets. To further improve speech recognition performance on oral history interviews, we apply artificially created multi-condition data to the chain model training and reduce the word error rate (WER) on the oral history test set, compared to a chain model trained on clean data, by 4.8% absolute and 13.9% relative.
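
The absolute and relative WER reductions quoted in the abstract are linked by simple arithmetic, which the following minimal Python sketch illustrates. The function name `implied_wers` is hypothetical and not taken from the paper. The baseline and improved WER values it prints are only implied by the two rounded figures in the abstract; they are approximations, not results reported in this listing.

```python
def implied_wers(absolute_reduction, relative_reduction):
    """Back out the WERs implied by an absolute and a relative reduction.

    absolute_reduction: WER drop in percentage points (e.g. 4.8)
    relative_reduction: WER drop as a fraction of the baseline (e.g. 0.139)
    Returns (baseline_wer, improved_wer), both in percent.
    """
    # relative = absolute / baseline  =>  baseline = absolute / relative
    baseline = absolute_reduction / relative_reduction
    improved = baseline - absolute_reduction
    return baseline, improved


# Figures from the abstract: 4.8% absolute, 13.9% relative.
baseline, improved = implied_wers(4.8, 0.139)
print(f"implied baseline WER: {baseline:.1f}%")
print(f"implied improved WER: {improved:.1f}%")
```

Because both inputs are rounded to one decimal place, the implied values can differ from the actual WERs in the paper by a few tenths of a percentage point.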