Towards Non-Intrusive Prediction of Speech Recognition Thresholds in Binaural Conditions

Conference: Speech Communication - 14th ITG Conference
29.09.2021 - 01.10.2021, online

Proceedings: ITG-Fb. 298: Speech Communication

Pages: 5 | Language: English | Type: PDF

Huelsmeier, David (Medizinische Physik, CvO Universität Oldenburg & Kommunikationsakustik, CvO Universität Oldenburg, Oldenburg, Germany)
Hauth, Christopher F.; Roettges, Saskia; Kranzusch, Paul; Brand, Thomas (Medizinische Physik, CvO Universität Oldenburg, Germany)
Rossbach, Jana; Meyer, Bernd T. (Cluster of Excellence Hearing4all & Kommunikationsakustik, CvO Universität Oldenburg, Germany)
Schaedler, Marc Rene (Medizinische Physik, CvO Universität Oldenburg & Cluster of Excellence Hearing4all & Vibrosonic GmbH, Mannheim, Germany)
Warzybok, Anna (Medizinische Physik, CvO Universität Oldenburg & Cluster of Excellence Hearing4all, Germany)

Four non-intrusive models are compared that predict human speech recognition thresholds (SRTs, i.e., the signal-to-noise ratios at which 50% of words are recognized) in different acoustic environments. Three of them use the blind binaural processing stage (bBSIM) as a front-end, while one model uses the spectral representation of the left and right ear signal channels together with their difference. Predictions are evaluated for three acoustic environments (anechoic, office, and cafeteria) with speech from the front and noise from different directions. Despite many technical differences across the models, all of them predict the SRTs accurately (root-mean-square prediction errors below 2.2 dB for all models). This implies that any of the non-intrusive models can be used to predict SRTs of listeners with normal hearing measured in stationary noise across different acoustic environments and spatial configurations.
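
The accuracy figure quoted above is a root-mean-square error between measured and predicted SRTs. As a minimal sketch of that metric only, and not of any of the four models, the following Python snippet computes it for a handful of conditions. The function name `rmse` and all SRT values are hypothetical placeholders chosen for illustration; they are not data from the paper.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between two equally long sequences (in dB)."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical SRTs in dB SNR, one value per spatial condition.
# These numbers are made up for illustration, not taken from the paper.
measured_srt_db = [-7.0, -12.0, -10.0, -6.0]
predicted_srt_db = [-8.0, -11.0, -10.0, -8.0]

error_db = rmse(measured_srt_db, predicted_srt_db)
print(f"RMSE: {error_db:.2f} dB")
```

In this reading, a model would meet the reported accuracy level if its error, computed over all tested conditions, stayed below 2.2 dB.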