Densely Connected Convolutional Networks for Speech Recognition

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10.10.2018 - 12.10.2018 in Oldenburg, Germany

Proceedings: ITG-Fb. 282: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Li, Chia Yu; Vu, Ngoc Thang (Institute for Natural Language Processing (IMS), University of Stuttgart, Germany)

Abstract:
This paper presents our latest investigation of Densely Connected Convolutional Networks (DenseNets) for acoustic modelling (AM) in automatic speech recognition. DenseNets are very deep, compact convolutional neural networks that have demonstrated substantial improvements over state-of-the-art results on several data sets in computer vision. Our experimental results show that DenseNets can be used for AM and significantly outperform other neural-based models such as DNNs, CNNs, and VGGs. Furthermore, results on the Wall Street Journal corpus revealed that, with only half of the training data, DenseNet was able to outperform other models trained on the full data set by a large margin.
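
As a rough illustration of why DenseNets are described as compact, the following sketch tracks channel counts inside one dense block, where each layer receives the concatenation of the block input and all earlier layer outputs. It is a minimal sketch of the general dense-connectivity idea, not the architecture used in the paper: the function name `dense_block_channels` and the values for input channels, growth rate, and number of layers are hypothetical choices made for the example.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Track channel counts through one dense block.

    Each layer sees the concatenation of the block input and the outputs
    of all earlier layers, and contributes `growth_rate` new channels.
    Returns (input channels seen by each layer, block output channels).
    """
    per_layer_inputs = []
    channels = in_channels
    for _ in range(num_layers):
        per_layer_inputs.append(channels)  # width of the concatenated input
        channels += growth_rate            # this layer's output is appended
    return per_layer_inputs, channels


# Hypothetical settings, chosen only for illustration (not from the paper).
inputs, out_channels = dense_block_channels(
    in_channels=16, growth_rate=12, num_layers=4
)
print(inputs, out_channels)
```

Because every layer adds only `growth_rate` channels while reusing all earlier feature maps, the individual layers can stay narrow even when the network is deep.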