Unsupervised Domain Adaptation by Adversarial Learning for Robust Speech Recognition

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10.10.2018 - 12.10.2018 in Oldenburg, Germany

Proceedings: ITG-Fb. 282: Speech Communication

Pages: 5
Language: English
Type: PDF

Denisov, Pavel; Vu, Ngoc Thang; Ferras Font, Marc (Institute for Natural Language Processing, University of Stuttgart, Germany)

In this paper, we investigate the use of adversarial learning for unsupervised adaptation to unseen recording conditions, specifically single-microphone far-field speech. We adapt neural-network-based acoustic models trained on clean close-talk speech to the new recording conditions using untranscribed adaptation data. Our experimental results on the Italian SPEECON data set show that the proposed method achieves a 19.8% relative word error rate (WER) reduction compared to the unadapted models. Furthermore, the adaptation is beneficial even when performed on data from another language (French), giving a 12.6% relative WER reduction.
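
As an illustration of how a relative WER reduction such as those quoted above is conventionally computed, a minimal sketch follows. The function name and the example WER values are hypothetical and are not taken from the paper; the abstract reports only the relative reductions, not the absolute WERs.

```python
def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are absolute word error rates in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - adapted_wer) / baseline_wer


# Hypothetical absolute WERs (assumed for illustration, not from the paper).
reduction = relative_wer_reduction(baseline_wer=20.0, adapted_wer=16.04)
print(f"{reduction:.1f}% relative WER reduction")
```

A relative reduction is measured against the baseline error rate, so the same percentage can correspond to very different absolute improvements depending on how high the unadapted WER is.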