Growing a Deep Neural Network Acoustic Model with Singular Value Decomposition

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
10/05/2016 - 10/07/2016 at Paderborn, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Kilgour, Kevin; Tseyzer, Igor; Nguyen, Thai Son; Stueker, Sebastian; Waibel, Alex (Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Germany)

Abstract:
Singular Value Decomposition (SVD) allows the weight matrix connecting two layers in a deep neural network (DNN) to be decomposed into two smaller matrices. In this paper we show how SVD can be used to initialise a new layer between the two original layers. Using SVD restructuring we can improve the word error rate (WER) of DNN-based speech recognition systems while at the same time reducing their number of parameters. On a German test set this resulted in a WER improvement from 16.61% to 16.16% while the number of parameters was reduced from 17.3 million to 14.55 million. When applied to an online real-time speech recognition system, the approach noticeably improved its real-time factor while also slightly reducing its WER.
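
The decomposition step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `svd_split`, the layer sizes, the retained rank `k`, and the choice to split the singular values evenly between the two factors are all assumptions made for illustration. The abstract does not say how the rank is chosen or how the new layer is trained afterwards.

```python
import numpy as np


def svd_split(W, k):
    """Factor W (m x n) into A (m x k) and B (k x n) so that A @ B approximates W.

    For a layer computing y = W @ x, the factored form y = A @ (B @ x)
    introduces an intermediate layer of k units between the original two.
    """
    # Thin SVD: W = U @ diag(s) @ Vt, singular values sorted in descending order.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Assumption: share the singular values evenly between both factors.
    root = np.sqrt(s[:k])
    A = U[:, :k] * root          # scales each of the k kept columns of U
    B = root[:, None] * Vt[:k]   # scales each of the k kept rows of Vt
    return A, B


# Hypothetical sizes, not taken from the paper.
rng = np.random.default_rng(0)
m, n, k = 200, 300, 50
W = rng.standard_normal((m, n))

A, B = svd_split(W, k)
params_before = W.size            # m * n
params_after = A.size + B.size    # k * (m + n)
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

The factored form has fewer parameters than the original matrix whenever k < m * n / (m + n). With the hypothetical sizes above, that is 25,000 parameters instead of 60,000. A random Gaussian matrix like this one loses much of its content under truncation. Whether trained weight matrices lose little enough at the chosen rank is an empirical question that the sketch does not address.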