Spectral Noise Tracking for Improved Nonstationary Noise Robust ASR

Conference: Speech Communication - 11. ITG-Fachtagung Sprachkommunikation
24-26 September 2014 in Erlangen, Germany

Proceedings: Speech Communication

Pages: 4
Language: English
Type: PDF

Authors:
Chinaev, Aleksej; Puels, Marc; Haeb-Umbach, Reinhold (Department of Communications Engineering, University of Paderborn, 33098, Paderborn, Germany)

Abstract:
One approach to nonstationary noise robust automatic speech recognition (ASR) is to first estimate the changing noise statistics and then clean up the features accordingly prior to recognition. Here, the first step is accomplished by noise tracking in the spectral domain, while the second relies on Bayesian enhancement in the feature domain. In this way we take advantage of our recently proposed maximum a-posteriori based (MAP-B) noise power spectral density estimation algorithm, which is able to estimate the noise statistics even in time-frequency bins dominated by speech. We show that MAP-B noise tracking leads to an improved noise model estimate in the feature domain compared to estimating noise in speech absence periods only, provided that the bias resulting from the nonlinear transformation from the spectral to the feature domain is accounted for. Consequently, ASR results improve, as shown by experiments conducted on the Aurora IV database.
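
The bias introduced by a nonlinear spectral-to-feature transformation can be illustrated with a small simulation. The sketch below is neither the MAP-B algorithm nor the compensation used in the paper. It assumes a single frequency bin whose noise periodogram is exponentially distributed (as for complex Gaussian noise) and a plain natural-log feature without a mel filterbank. Under those assumptions, the mean of the log-periodogram lies below the log of the noise PSD by the Euler-Mascheroni constant (about 0.577). The function names are hypothetical and chosen for illustration only.

```python
import math
import random

# Euler-Mascheroni constant: for an exponential variable X with mean m,
# E[ln X] = ln m - EULER_GAMMA.
EULER_GAMMA = 0.5772156649015329


def log_domain_statistics(noise_psd, num_frames=200_000, seed=0):
    """Simulate noise periodogram values for one frequency bin.

    Assumption (not from the paper): the periodogram is exponentially
    distributed with mean `noise_psd`.
    Returns (mean of log-periodogram, log of mean periodogram).
    """
    rng = random.Random(seed)
    # random.expovariate takes the rate, i.e. 1 / mean.
    samples = [rng.expovariate(1.0 / noise_psd) for _ in range(num_frames)]
    # Guard against an exact zero sample before taking the logarithm.
    mean_of_log = sum(math.log(max(s, 1e-300)) for s in samples) / num_frames
    log_of_mean = math.log(sum(samples) / num_frames)
    return mean_of_log, log_of_mean


def expected_log_feature_mean(noise_psd):
    """Log-domain noise mean predicted from a spectral-domain PSD estimate,
    with the bias of the logarithm accounted for (same assumptions as above)."""
    return math.log(noise_psd) - EULER_GAMMA


if __name__ == "__main__":
    psd = 2.0
    mean_of_log, log_of_mean = log_domain_statistics(psd)
    print("log of mean periodogram :", round(log_of_mean, 3))
    print("mean of log periodogram :", round(mean_of_log, 3))
    print("bias-corrected prediction:", round(expected_log_feature_mean(psd), 3))
```

In a real front end the features are typically computed after a mel filterbank and possibly further transforms, so the size of the bias would differ from this single-bin case; the sketch only shows why mapping a spectral noise estimate directly through the logarithm does not give the feature-domain noise mean.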