Enhancement of G.711-Coded Speech Providing Quality Higher Than Uncoded

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10/10/2018 - 10/12/2018 at Oldenburg, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Zhao, Ziyue; Liu, Huijun; Fingscheidt, Tim (Institute for Communications Technology, Technische Universität Braunschweig, Schleinitzstr. 22, 38106 Braunschweig, Germany)

Abstract:
The speech quality of ITU-T G.711, the most widely used speech codec worldwide, can be enhanced on the decoder side without modifying the codec itself. In this paper, we present an enhancement approach for G.711-coded speech based on a convolutional neural network (CNN) that employs cepstral-domain features in a system-compatible fashion. In a subjective CCR listening test, the proposed enhancement approach exceeds the speech quality of an ITU-T-standardized postfilter by 0.36 CMOS points and improves G.711-coded speech by a clear 1.77 CMOS points. The proposed CNN-based approach even achieves a significant improvement of 0.18 CMOS points over uncoded speech, a surprising result that, to the best of our knowledge, has not been reported before.