Unsupervised Classification of Voiced Speech and Pitch Tracking Using Forward-Backward Kalman Filtering

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
October 5-7, 2016 in Paderborn, Germany

Proceedings: Speech Communication

Pages: 5; Language: English; Type: PDF

Authors:
Boenninghoff, Benedikt T.; Zeiler, Steffen; Kolossa, Dorothea (Institute of Communication Acoustics, Ruhr-Universität Bochum, 44780 Bochum, Germany)
Nickel, Robert M. (Department of Electrical and Computer Engineering, Bucknell University, Lewisburg, PA 17837, USA)

Abstract:
The detection of voiced speech, the estimation of the fundamental frequency, and the tracking of pitch values over time are crucial subtasks for a variety of speech processing techniques. Many different algorithms have been developed for each of the three subtasks. We present a new algorithm that integrates the three subtasks into a single procedure. The algorithm can be applied to pre-recorded speech utterances in the presence of considerable amounts of background noise. We combine a collection of standard metrics, such as the zero-crossing rate, to formulate an unsupervised voicing classifier. The estimation of pitch values is accomplished with a hybrid autocorrelation-based technique. We propose a forward-backward Kalman filter to smooth the estimated pitch contour. In experiments, we show that the proposed method compares favorably with current state-of-the-art pitch detection algorithms.
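
The abstract names three building blocks without giving their details, so the following is only a minimal Python sketch of generic textbook versions, not the authors' method. The function names are hypothetical. The pitch estimator uses a plain autocorrelation peak rather than the paper's hybrid technique. The smoother assumes a scalar random-walk pitch model with illustrative noise variances `q` and `r`, with a Rauch-Tung-Striebel backward pass standing in for the forward-backward Kalman filter.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch estimate in Hz from the peak of the plain autocorrelation.

    The search range fmin..fmax is an assumed default, not from the paper.
    """
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(frame) - 1)
    best_lag, best_val = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag

def forward_backward_kalman(z, q=4.0, r=100.0):
    """Smooth a pitch track z with a forward Kalman pass and a backward RTS pass.

    Assumed model: x[k] = x[k-1] + w, z[k] = x[k] + v, with
    Var(w) = q and Var(v) = r (illustrative values).
    """
    n = len(z)
    x_pred, p_pred = [0.0] * n, [0.0] * n   # predicted mean / variance
    x_filt, p_filt = [0.0] * n, [0.0] * n   # filtered mean / variance
    x, p = z[0], r                          # initialise from first observation
    for k in range(n):
        if k > 0:
            p = p + q                       # predict: random walk adds q
        x_pred[k], p_pred[k] = x, p
        gain = p / (p + r)                  # update with observation z[k]
        x = x + gain * (z[k] - x)
        p = (1.0 - gain) * p
        x_filt[k], p_filt[k] = x, p
    x_smooth = x_filt[:]
    for k in range(n - 2, -1, -1):          # backward pass
        c = p_filt[k] / p_pred[k + 1]
        x_smooth[k] = x_filt[k] + c * (x_smooth[k + 1] - x_pred[k + 1])
    return x_smooth

if __name__ == "__main__":
    fs = 8000
    frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
    print(zero_crossing_rate(frame), autocorr_pitch(frame, fs))
    print(forward_backward_kalman([100.0, 100.0, 160.0, 100.0, 100.0]))
```

In this sketch, a low zero-crossing rate is one cue that a frame is voiced. The backward pass lets each smoothed pitch value draw on later frames as well as earlier ones, which is possible only because the utterance is pre-recorded.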