Towards Non-Intrusive Prediction of Speech Recognition Thresholds in Binaural Conditions

Conference: Speech Communication - 14th ITG Conference
09/29/2021 - 10/01/2021, online

Proceedings: ITG-Fb. 298: Speech Communication

Pages: 5
Language: English
Type: PDF


Authors:
Huelsmeier, David (Medizinische Physik, CvO Universität Oldenburg & Kommunikationsakustik, CvO Universität Oldenburg, Oldenburg, Germany)
Hauth, Christopher F.; Roettges, Saskia; Kranzusch, Paul; Brand, Thomas (Medizinische Physik, CvO Universität Oldenburg, Germany)
Rossbach, Jana; Meyer, Bernd T. (Cluster of Excellence Hearing4all & Kommunikationsakustik, CvO Universität Oldenburg, Germany)
Schaedler, Marc Rene (Medizinische Physik, CvO Universität Oldenburg & Cluster of Excellence Hearing4all & Vibrosonic GmbH, Mannheim, Germany)
Warzybok, Anna (Medizinische Physik, CvO Universität Oldenburg & Cluster of Excellence Hearing4all, Germany)

Abstract:
Four non-intrusive models are compared that predict human speech recognition thresholds (SRTs, i.e., signal-to-noise ratios at which 50% of words are recognized) in different acoustic environments. Three of them use the blind binaural processing stage (bBSIM) as a front-end, while one model uses the spectral representation of the left and right ear signal channels together with their difference. Predictions are evaluated for three acoustic environments (anechoic, office, and cafeteria) with speech from the front and noise from different directions. Despite many technical differences across the models, all of them predict SRTs accurately (root-mean-square prediction errors below 2.2 dB for all models). This implies that any of the non-intrusive models can be used to predict SRTs measured in stationary noise for listeners with normal hearing, across different acoustic environments and spatial configurations.
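
The two quantities named in the abstract, the SRT as the signal-to-noise ratio at 50% word recognition and the root-mean-square prediction error, can be illustrated with a minimal Python sketch. It is not the evaluation code of the paper: the function names `srt_from_psychometric` and `rmse` are hypothetical, the linear interpolation between measured points is an assumption made for simplicity, and the numbers in the usage lines are made-up placeholders rather than results from the study.

```python
import math


def srt_from_psychometric(snrs_db, word_scores, target=0.5):
    """Estimate the SNR (dB) at which the recognition rate reaches `target`.

    Assumption: linear interpolation between neighbouring measurement points,
    with `snrs_db` sorted in ascending order and scores given as fractions.
    """
    points = list(zip(snrs_db, word_scores))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Look for the first rising segment that brackets the target rate.
        if y0 <= target <= y1 and y1 != y0:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target recognition rate is not bracketed by the data")


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured SRTs (dB)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder values for illustration only (not data from the paper).
example_srt = srt_from_psychometric([-10.0, -5.0], [0.25, 0.75])
example_error = rmse([-7.0, -9.0], [-8.0, -8.0])
```

With these helpers, one RMSE per model over all tested environments and noise directions would be the figure compared against the 2.2 dB bound quoted in the abstract.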