Emotion Intelligibility within Codec-Compressed and Reduced Bandwidth Speech

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
05.10.2016 - 07.10.2016 in Paderborn, Germany

Proceedings: ITG-Fb. 267: Speech Communication

Pages: 5
Language: English
Type: PDF

Siegert, Ingo; Lotz, Alicia Flores; Wendemuth, Andreas (Institute for Information and Communications Engineering, Otto-von-Guericke University Magdeburg, 39106 Magdeburg, Germany)
Maruschke, Michael; Jokisch, Oliver (Institute of Communications Engineering, Leipzig University of Telecommunications, 04277 Leipzig, Germany)

In human-computer interaction, affective measures such as emotion assessment are an important means of steering and improving the dialog flow, and they will be an integral part of automatic interaction management. In this context, we evaluate the intelligibility of affective speech both by inter-rater reliability (IRR) in human voting and by the ITU-recommended POLQA measure. Since speech is transmitted under reduced bandwidth and in compressed form, we conduct our study with different audio codecs and bandwidths. Correctness is computed as unweighted average recall (UAR) in a seven-emotion setting. We present a ranking of emotion intelligibility for both IRR/UAR and POLQA, and draw conclusions about tolerable bandwidths and codec potentials.
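
As an illustration of the correctness measure named above, UAR is the mean of the per-class recalls, so every emotion class counts equally regardless of how many samples it has. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name, the emotion labels, and the votes are hypothetical, and only three classes are used for brevity instead of seven.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly labelled samples per true class
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1

    # Recall per class, then a plain (unweighted) average over the classes.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical reference labels and listener votes (illustrative only).
y_true = ["anger", "anger", "anger", "anger", "joy", "joy", "neutral"]
y_pred = ["anger", "anger", "anger", "joy", "joy", "neutral", "neutral"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy data the per-class recalls are 3/4, 1/2, and 1/1, so UAR differs from plain accuracy (5/7) because the classes are unevenly sized; this is the usual reason to prefer UAR when emotion classes are imbalanced.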