Predicting the quality of processed speech by combining modulation-based features and model trees

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
10/05/2016 - 10/07/2016 at Paderborn, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Cauchi, Benjamin; Goetze, Stefan (Fraunhofer IDMT, Project Group Hearing, Speech and Audio Technology, Oldenburg, Germany & Cluster of Excellence Hearing4all, Oldenburg, Germany)
Santos, Joao F.; Falk, Tiago H. (INRS-EMT, University of Quebec, Montreal, QC, Canada)
Siedenburg, Kai; Doclo, Simon (University of Oldenburg, Dept. of Medical Physics and Acoustics, Oldenburg, Germany & Cluster of Excellence Hearing4all, Oldenburg, Germany)
Naylor, Patrick A. (Imperial College London, Dept. of Electrical Engineering, London, UK)

Abstract:
Many signal processing methods have been proposed to improve the quality of speech recorded in the presence of noise and reverberation. Evaluating these methods requires either perceptual measures, i.e. listening tests, or instrumental measures. Perceptual measures are typically more reliable but are costly and time-consuming. Instrumental measures, on the other hand, may correlate poorly with perceived speech quality. In this paper, we propose to train an instrumental measure, combining modulation-based features and model trees, on the basis of perceptual scores obtained on a small corpus of speech data that has been processed by a combination of beamforming and spectral postfiltering. For evaluation purposes, the resulting measure is then applied to a larger corpus. Results show that using model trees to train the predicting function of an instrumental measure increases its correlation with perceptual scores.
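
To illustrate the general idea of a model tree (a decision tree whose leaves hold linear models instead of constants), the following minimal sketch fits a depth-one tree to a single scalar feature and reports the Pearson correlation between predictions and target scores. It is not the authors' implementation: the one-dimensional feature, the synthetic scores, the depth limit, and the names `fit_line`, `fit_model_tree`, `predict` and `pearson` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on one scalar feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # constant feature: fall back to the mean score
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def sse(xs, ys, model):
    """Sum of squared errors of a linear model (a, b) on the given data."""
    a, b = model
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def fit_model_tree(xs, ys, min_leaf=3):
    """Depth-one model tree: one threshold and one linear model per leaf."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal feature values
        lx, ly = zip(*pairs[:i])
        rx, ry = zip(*pairs[i:])
        lm, rm = fit_line(lx, ly), fit_line(rx, ry)
        err = sse(lx, ly, lm) + sse(rx, ry, rm)
        if best is None or err < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (err, threshold, lm, rm)
    if best is None:
        # Too few samples to split: use a single linear model everywhere.
        model = fit_line(xs, ys)
        return float("inf"), model, model
    return best[1:]


def predict(tree, x):
    """Route x to its leaf and evaluate that leaf's linear model."""
    threshold, left_model, right_model = tree
    a, b = left_model if x <= threshold else right_model
    return a * x + b


def pearson(us, vs):
    """Pearson correlation coefficient between two equally long sequences."""
    mu, mv = mean(us), mean(vs)
    num = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    den = sqrt(sum((u - mu) ** 2 for u in us) * sum((v - mv) ** 2 for v in vs))
    return num / den


# Synthetic data (hypothetical): one feature value per signal and a score
# that follows a different linear trend on each side of x = 4.5.
xs = list(range(10))
ys = [1 + 0.5 * x if x < 5 else 4.0 - 0.1 * (x - 5) for x in xs]

tree = fit_model_tree(xs, ys)
predictions = [predict(tree, x) for x in xs]
correlation = pearson(predictions, ys)
```

In practice the feature vector would be multi-dimensional and the tree deeper, with training on one corpus and correlation computed on a separate one, as the abstract describes; the sketch only shows how a split plus per-leaf regression can capture a relationship that a single linear fit would miss.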