Language Feature Vectors for Resource Constraint Speech Recognition

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
10/05/2016 - 10/07/2016 at Paderborn, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF


Authors:
Mueller, Markus; Stueker, Sebastian; Waibel, Alex (Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany)

Abstract:
Deep Neural Networks (DNNs) are a key element of state-of-the-art speech recognition systems. As a data-driven method, they require a significant amount of training data. In some scenarios, that amount of data is not available for a particular language, and building systems for such resource-constrained tasks requires special techniques. One common method is to use data from multiple languages to train the acoustic model, but knowledge transfer between different languages is limited. With Language Feature Vectors (LFVs), we try to mitigate these limitations by providing language information to DNNs. Similar to i-Vectors for speaker adaptation, LFVs enable DNNs to better capture and adapt to inter-language characteristics. Previous experiments have shown that providing LFVs to DNNs improved system performance. In this paper, we show that adding LFVs decreases the performance gap between mono- and multilingual systems.
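
One way to picture "providing language information to DNNs", by analogy with how i-Vectors are commonly used for speaker adaptation, is to append the same language vector to every acoustic frame of an utterance before it enters the network. The sketch below assumes this input-concatenation scheme; the abstract does not state how LFVs are actually fed to the DNN. The function name, the dimensions, and all values are hypothetical.

```python
from typing import List, Sequence


def append_language_vector(
    frames: Sequence[Sequence[float]],
    language_vector: Sequence[float],
) -> List[List[float]]:
    """Append the same language vector to every acoustic frame of an utterance.

    Assumption: the language information is supplied as extra input
    features, as is commonly done with i-Vectors. This is an illustration,
    not the method described in the paper.
    """
    return [list(frame) + list(language_vector) for frame in frames]


# Toy utterance: 3 frames with 4 acoustic features each (made-up values).
frames = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

# Hypothetical 2-dimensional language vector; the size is illustrative only.
language_vector = [1.0, 0.0]

augmented = append_language_vector(frames, language_vector)
```

Each augmented frame keeps its original acoustic features and gains the language vector as trailing dimensions, so a network trained on data from several languages receives an explicit cue about which language a frame belongs to.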