Auditory Filterbank Based Frequency-Warping Invariant Features for Automatic Speech Recognition
Conference: Sprachkommunikation 2006 - ITG-Fachtagung
04/26/2006 - 04/28/2006 at Kiel, Germany
Proceedings: Sprachkommunikation 2006
Pages: 4
Language: English
Type: PDF
Authors:
Rademacher, Jan; Mertins, Alfred (Signal Processing Group, University of Oldenburg, 26111 Oldenburg, Germany)
Abstract:
Auditory filterbanks have a long history in the preprocessing stage of automatic speech recognition systems, the most prominent example being mel-frequency cepstral coefficients (MFCCs). In this paper, we study the usefulness of auditory-filterbank analyses as a preprocessor for the generation of frequency-warping invariant features. The results indicate that gammatone-filterbank analyses following the equivalent rectangular bandwidth (ERB) scale yield the most robust feature sets. The performance improvements are most significant when the vocal tract lengths in the training and test sets differ, which is important when, for example, children's speech is to be recognized with a system that was trained mainly on adult data.
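
To make the ERB scale mentioned in the abstract concrete, the following is a minimal sketch of how filterbank center frequencies might be placed at equal distances on an ERB-rate axis. The abstract does not state which ERB formula or parameters the authors used; the constants below are the widely cited Glasberg and Moore (1990) approximation, and all function names, the frequency range, and the channel count are illustrative assumptions rather than the paper's configuration.

```python
import math

# Assumption: Glasberg & Moore (1990) ERB approximation; the abstract does
# not say which variant of the ERB scale the authors used.

def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to ERB-rate (number of ERBs below f_hz)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at center frequency f_hz."""
    return 24.7 * (0.00437 * f_hz + 1.0)

def erb_center_frequencies(f_min, f_max, n_channels):
    """Center frequencies in Hz, equally spaced on the ERB-rate axis.

    Hypothetical helper for illustration; requires n_channels >= 2.
    """
    if n_channels < 2:
        raise ValueError("n_channels must be at least 2")
    lo = hz_to_erb_rate(f_min)
    hi = hz_to_erb_rate(f_max)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]

if __name__ == "__main__":
    # Illustrative range and channel count, not taken from the paper.
    for fc in erb_center_frequencies(100.0, 8000.0, 20):
        print(f"{fc:8.1f} Hz  (ERB = {erb_bandwidth(fc):6.1f} Hz)")
```

Because the ERB-rate axis is close to logarithmic at higher frequencies, a linear scaling of the frequency axis, such as the one commonly used to model a change in vocal tract length, turns approximately into a shift along the channel index there. This is one plausible reason such a scale suits warping-invariant features, though the abstract itself does not spell out the mechanism.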


