Unsupervised Domain Adaptation by Adversarial Learning for Robust Speech Recognition

Konferenz: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10.10.2018 - 12.10.2018 in Oldenburg, Deutschland

Tagungsband: ITG-Fb. 282: Speech Communication

Seiten: 5Sprache: EnglischTyp: PDF

Persönliche VDE-Mitglieder erhalten auf diesen Artikel 10% Rabatt

Autoren:
Denisov, Pavel; Vu, Ngoc Thang; Ferras Font, Marc (Institute for Natural Language Processing, University of Stuttgart, Germany)

Inhalt:
In this paper, we investigate the use of adversarial learning for unsupervised adaptation to unseen recording conditions, more specifically, single microphone far-field speech. We adapt neural networks based acoustic models trained with close-talk clean speech to the new recording conditions using untranscribed adaptation data. Our experimental results on Italian SPEECON data set show that our proposed method achieves 19.8% relative word error rate (WER) reduction compared to the unadapted models. Furthermore, this adaptation method is beneficial even when performed on data from another language (i.e. French) giving 12.6% relative WER reduction.