On the Initialization of Dynamic Models for Speech Features

Conference: Sprachkommunikation 2010 - 9. ITG-Fachtagung
6 October 2010 - 8 October 2010 in Bochum, Germany

Proceedings: Sprachkommunikation 2010

Pages: 4; Language: English; Type: PDF

Krueger, Alexander; Leutnant, Volker; Haeb-Umbach, Reinhold (Department of Communications Engineering, University of Paderborn, 33098 Paderborn, Germany)
Ackermann, Marcel; Bloemer, Johannes (Department of Computer Science, University of Paderborn, 33102 Paderborn, Germany)

This work proposes a novel approach for initializing switching linear dynamic models (SLDMs), which serve as dynamic models for the trajectory of speech features. Borrowing ideas from the k-means++ algorithm, the approach aims to find distinctly different SLDMs that capture the complex dynamics of the speech features already at the initialization stage, before the subsequent expectation-maximization (EM) algorithm is run. Experiments comparing differently initialized SLDMs in a model-based speech feature enhancement scheme show that the proposed initialization routine is superior, yielding a lower word error rate on an automatic speech recognition task.
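
For readers unfamiliar with the k-means++ idea the abstract refers to, the following is a minimal sketch of standard k-means++ seeding on plain feature vectors. It is not the SLDM initialization proposed in the paper, whose details are not given in this abstract. It only illustrates the underlying principle of drawing each new seed with probability proportional to its squared distance from the seeds already chosen, which favors distinctly different starting points. The names `squared_distance`, `kmeanspp_seeds`, and the toy data are hypothetical and chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng=None):
    """Pick k seeds from `points` using standard k-means++ (D^2) sampling.

    Assumption: `points` is a list of numeric tuples. This is the generic
    clustering seeding, not the paper's SLDM-specific routine.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = rng or random.Random(0)

    # The first seed is drawn uniformly at random.
    seeds = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest seed.
    d2 = [squared_distance(p, seeds[0]) for p in points]

    while len(seeds) < k:
        if sum(d2) == 0.0:
            # Every point coincides with a seed: fall back to a uniform draw.
            nxt = rng.choice(points)
        else:
            # Draw the next seed with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        seeds.append(nxt)
        # Update nearest-seed distances with the newly added seed.
        d2 = [min(d, squared_distance(p, nxt)) for d, p in zip(d2, points)]
    return seeds


if __name__ == "__main__":
    # Toy 2-D "feature vectors" forming three loose groups.
    data = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (9.8, 0.1), (10.0, 0.0)]
    print(kmeanspp_seeds(data, 3, random.Random(1)))
```

The seeds returned this way would typically be handed to an iterative refinement step; in the setting of the abstract, that role is played by the EM algorithm.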