Uncertainty Propagation for Speech Recognition using RASTA Features in Highly Nonstationary Noisy Environments
Conference: Sprachkommunikation 2008 - 8. ITG-Fachtagung
10/08/2008 - 10/10/2008 at Aachen, Germany
Proceedings: Sprachkommunikation 2008
Pages: 4
Language: English
Type: PDF
Astudillo, Ramón F.; Kolossa, Dorothea; Orglmeister, Reinhold (Fachgebiet Elektronik und medizinische Signalverarbeitung, TU Berlin, 10587 Berlin)
Speech enhancement in nonstationary noisy environments often leaves residual noise and speech distortions that degrade the performance of automatic speech recognition systems. In earlier work (R. F. Astudillo, D. Kolossa and R. Orglmeister, "Propagation of statistical information through non-linear feature extractions for robust speech recognition"), we showed that the effect of these inaccuracies can be attenuated by propagating a measure of uncertainty from the speech enhancement domain to the mel-cepstral domain of the recognition features. In that domain, unreliable features can be re-estimated by combining the uncertainty information with the speech recognizer parameters. In this paper we extend the technique to a more complex feature extraction with a psychoacoustical basis: cepstral coefficients obtained from RelAtive SpecTrAl Perceptual Linear Prediction (RASTA-PLP) features. In particular, we address how to handle the RASTA filter and the different non-linearities involved in the feature extraction while keeping computational costs low.
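The re-estimation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a propagated feature is described by a scalar Gaussian (mean `mu_x`, uncertainty variance `var_x`) and that the recognizer state is modelled by a single scalar Gaussian (mean `mu_q`, variance `var_q`). Under these assumptions the two can be combined as a precision-weighted average. The function name and its parameters are hypothetical, and this is not necessarily the formulation used in the paper.

```python
def reestimate_feature(mu_x, var_x, mu_q, var_q):
    """Combine an uncertain feature with a Gaussian model component.

    Assumption (not taken from the paper): both the propagated feature and
    the recognizer state are scalar Gaussians, so the re-estimated feature
    is the mean of the product of the two densities.

    mu_x, var_x: mean and uncertainty variance of the propagated feature
    mu_q, var_q: mean and variance of the recognizer's Gaussian for this state
    Returns (re-estimated mean, remaining variance).
    """
    if var_x < 0 or var_q <= 0:
        raise ValueError("var_x must be >= 0 and var_q must be > 0")
    total = var_x + var_q
    # A reliable feature (small var_x) stays close to mu_x;
    # an unreliable one (large var_x) is pulled toward the model mean mu_q.
    mean = (var_q * mu_x + var_x * mu_q) / total
    var = (var_x * var_q) / total
    return mean, var


if __name__ == "__main__":
    # Fully reliable feature: the observation is kept unchanged.
    print(reestimate_feature(2.0, 0.0, 0.0, 1.0))
    # Equal variances: the estimate lies halfway between feature and model mean.
    print(reestimate_feature(2.0, 1.0, 0.0, 1.0))
```

With zero uncertainty the feature passes through unchanged, and as the uncertainty grows the estimate moves toward the model mean, which is the qualitative behaviour the abstract describes for unreliable features.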