Spectral Envelope Statistics for Source Modeling in Speech Enhancement

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
05.10.2016 - 07.10.2016 in Paderborn, Germany

Proceedings: ITG-Fb. 267: Speech Communication

Pages: 5
Language: English
Type: PDF

Das, Sneha; Craciun, Alexandra; Jaehnel, Tobias; Baeckstroem, Tom (International Audio Laboratories Erlangen, Friedrich-Alexander University (FAU), Germany)

Source modeling is an efficient tool in speech and audio coding, yet it has been employed less extensively in enhancement applications. Carrying speech source models over from coding to enhancement has been difficult because the models are based on linear prediction, which is non-linear in the frequency domain. In this paper we propose a speech source model based on a distribution quantizer, which quantifies the coarse shape of the spectral envelope. The spectral envelope is thus described by a set of parameters whose probability distributions have a simple form. Using these probability distributions, the source parameters are estimated from a single-channel noisy observation by maximum likelihood. Our experiments show that the proposed method is able to track the signal-to-noise ratio with good accuracy. In addition, although trained only on English items, our method showed relatively good results for German items as well, which demonstrates the robustness of the estimated source models.
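
The abstract does not spell out how the distribution quantizer works. As a rough illustration only, the sketch below assumes one common reading of the idea: the power spectrum is treated like a probability distribution over frequency, and the coarse envelope shape is described by the bin positions that split the total energy into equal shares. The function name `equal_mass_band_edges` and its parameters are hypothetical and are not taken from the paper; the maximum-likelihood estimation from a noisy observation is not covered here.

```python
import numpy as np


def equal_mass_band_edges(power_spectrum, n_bands):
    """Return the bin indices that split the spectral energy into equal shares.

    Illustrative assumption, not the paper's algorithm: the power spectrum
    is normalized to sum to one and treated like a distribution over bins.
    """
    p = np.asarray(power_spectrum, dtype=float)
    if p.ndim != 1 or p.size == 0:
        raise ValueError("power_spectrum must be a non-empty 1-D array")
    if n_bands < 1:
        raise ValueError("n_bands must be at least 1")
    if np.any(p < 0) or p.sum() <= 0:
        raise ValueError("power_spectrum must be non-negative with positive sum")

    # Cumulative share of the total energy up to and including each bin.
    cumulative = np.cumsum(p) / p.sum()

    # Interior targets 1/n, 2/n, ..., (n-1)/n of the total energy.
    targets = np.arange(1, n_bands) / n_bands

    # First bin whose cumulative share reaches each target.
    return np.searchsorted(cumulative, targets, side="left")


# Example call: a flat spectrum and a spectrum tilted toward low frequencies.
flat_edges = equal_mass_band_edges(np.ones(8), 4)
tilted_edges = equal_mass_band_edges([8, 4, 2, 1, 1], 2)
```

Under this assumption, a flat spectrum yields evenly spaced edges, while a spectrum with most of its energy at low frequencies pulls the edges toward the low bins. The edge positions therefore act as a small parameter set that captures the coarse envelope shape.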