Recognition of Noisy Speech by Starting the Likelihood Calculation at Voiced Segments

Conference: Speech Communication - 11. ITG-Fachtagung Sprachkommunikation
09/24/2014 - 09/26/2014 in Erlangen, Germany

Proceedings: Speech Communication

Pages: 4
Language: English
Type: PDF

Authors:
Hirsch, Hans-Guenter; Kremer, Frank (Institute for Pattern Recognition, Niederrhein University of Applied Sciences, 47805 Krefeld, Germany)

Abstract:
The performance of automatic speech recognition systems is still not comparable with the ability of humans to communicate and understand speech in noisy scenarios. Humans in noisy environments do not perceive and understand every fragment of the speech. Under extremely noisy conditions, they may understand only a few segments with a high signal-to-noise ratio (SNR). Based on this observation, we investigate an alternative recognition approach in which the recognition process starts at the segments with high SNR, which are usually voiced. Fragments with low SNR are ignored or taken into account to a lesser extent. We developed a method to detect the voiced segments in distorted speech signals. Beginning at the detected positions, we calculate probabilities backward and forward in time with a type of modified subword HMM. Neglecting segments with low SNR makes it necessary to modify the usual likelihood calculation and to extract probability information for being in a certain HMM state during the recognition process. First recognition results are presented for the recognition of isolated digits from TIDigits. The results can be seen as a proof that this alternative recognition strategy is applicable in principle.
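
The idea of starting the likelihood calculation at a detected position and extending it backward and forward in time can be illustrated with a small sketch. It is not the authors' method: the abstract does not specify the modified subword HMM or the modified likelihood calculation. The sketch assumes a plain left-to-right HMM with one self-loop and one next-state transition per state, a known anchor frame and anchor state, and hypothetical per-frame weights between 0 and 1 that scale down the emission scores of low-SNR frames. The function name and all parameters are illustrative.

```python
NEG_INF = float("-inf")

def anchored_viterbi_score(log_b, log_stay, log_next, anchor_t, anchor_s, weights=None):
    """Best log score of a left-to-right path forced through (anchor_t, anchor_s).

    log_b[t][s] : finite log emission score of frame t in state s
    log_stay    : log score of staying in a state (assumed equal for all states)
    log_next    : log score of moving to the next state (assumed equal for all states)
    weights[t]  : hypothetical frame weight in [0, 1]; 0 ignores the frame's emission
    """
    T, N = len(log_b), len(log_b[0])
    w = [1.0] * T if weights is None else list(weights)
    # Down-weight the emission scores of unreliable (low-SNR) frames.
    e = [[w[t] * log_b[t][s] for s in range(N)] for t in range(T)]

    # Forward in time: from the anchor frame to the last frame.
    fwd = [NEG_INF] * N
    fwd[anchor_s] = e[anchor_t][anchor_s]
    for t in range(anchor_t + 1, T):
        fwd = [
            max(fwd[s] + log_stay, fwd[s - 1] + log_next if s > 0 else NEG_INF) + e[t][s]
            for s in range(N)
        ]

    # Backward in time: from the anchor frame to the first frame.
    # The anchor emission is already counted in the forward part.
    bwd = [NEG_INF] * N
    bwd[anchor_s] = 0.0
    for t in range(anchor_t - 1, -1, -1):
        bwd = [
            max(bwd[s] + log_stay, bwd[s + 1] + log_next if s < N - 1 else NEG_INF) + e[t][s]
            for s in range(N)
        ]

    # The path must end in the last state and start in the first state.
    return fwd[N - 1] + bwd[0]


# Usage with made-up scores: 6 frames, 3 states, anchor at frame 3 in state 1,
# frames 0 and 5 treated as low-SNR and weighted down.
scores = [[-1.0, -2.0, -3.0]] * 2 + [[-3.0, -1.0, -2.0]] * 2 + [[-3.0, -2.0, -1.0]] * 2
frame_weights = [0.2, 1.0, 1.0, 1.0, 1.0, 0.2]
print(anchored_viterbi_score(scores, -0.5, -0.9, 3, 1, frame_weights))
```

With all weights set to 1, taking the maximum of this score over all anchor states at a fixed anchor frame gives the same value as a conventional Viterbi pass over the whole utterance, so the sketch only changes where the calculation starts and how much each frame contributes.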