Enhancement of G.711-Coded Speech Providing Quality Higher Than Uncoded

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
October 10-12, 2018, Oldenburg, Germany

Proceedings: ITG-Fb. 282: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Zhao, Ziyue; Liu, Huijun; Fingscheidt, Tim (Institute for Communications Technology, Technische Universität Braunschweig, Schleinitzstr. 22, 38106 Braunschweig, Germany)

Abstract:
The speech quality of ITU-T G.711, the most widely used speech codec worldwide, can be enhanced on the decoder side without modifying the codec itself. In this paper we present an enhancement approach for G.711-coded speech based on a convolutional neural network (CNN) that employs cepstral domain features in a system-compatible fashion. In a subjective comparison category rating (CCR) listening test, the proposed enhancement approach exceeds the speech quality of an ITU-T-standardized postfilter by 0.36 comparison mean opinion score (CMOS) points and improves G.711-coded speech by a clear 1.77 CMOS points. The proposed CNN-based enhancement approach even achieves a significant improvement of 0.18 CMOS points over uncoded speech, a surprising result which, to the best of our knowledge, has not been reported before.
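
As an illustration of the coding distortion that a decoder-side enhancement of this kind has to work against, the following is a minimal sketch of 8-bit logarithmic companding in the style of G.711 μ-law. It is not taken from the paper: the abstract does not say whether the A-law or μ-law variant was used, the continuous companding formula below only approximates the segmented, bit-exact tables of the standard, and all function names are hypothetical.

```python
import math

MU = 255.0  # mu-law constant; assumes the mu-law variant (A-law differs)


def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] to the companded domain [-1, 1]."""
    x = max(-1.0, min(1.0, x))  # clip to the valid input range
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(x: float) -> int:
    """Quantize the companded sample uniformly to an 8-bit code (0..255)."""
    y = mulaw_compress(x)
    return int(round((y + 1.0) / 2.0 * 255.0))


def decode(code: int) -> float:
    """Reconstruct a sample in [-1, 1] from an 8-bit code."""
    y = code / 255.0 * 2.0 - 1.0
    return mulaw_expand(y)


def snr_db(clean, degraded) -> float:
    """Signal-to-noise ratio in dB between two equal-length sequences."""
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - d) ** 2 for s, d in zip(clean, degraded))
    return 10.0 * math.log10(signal_power / noise_power)


def demo_sine(num_samples: int = 8000, freq_hz: float = 440.0,
              sample_rate_hz: float = 8000.0):
    """Code and decode a test tone; return (clean, decoded) sample lists."""
    clean = [0.9 * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
             for n in range(num_samples)]
    decoded = [decode(encode(s)) for s in clean]
    return clean, decoded
```

Calling `snr_db(*demo_sine())` gives the SNR of the coded test tone. The residual between `clean` and `decoded` is the quantization noise that a postfilter or a CNN-based enhancer operating on the decoder output would attempt to reduce.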