Sparse, Hierarchical and Semi-Supervised Base Learning for Monaural Enhancement of Conversational Speech

Conference: Sprachkommunikation - Beiträge zur 10. ITG-Fachtagung
26.09.2012 - 28.09.2012 in Braunschweig, Germany

Proceedings: Sprachkommunikation

Pages: 4
Language: English
Type: PDF

Authors:
Weninger, Felix; Wöllmer, Martin; Schuller, Björn (Institute for Human-Machine Communication, Technische Universität München, Germany)

Abstract:
We address the learning of noise bases in a monaural, speaker-independent speech enhancement framework based on non-negative matrix factorization. Bases are either estimated from training data in batch processing, by means of hierarchical or non-hierarchical sparse coding, or determined during the speech enhancement process based on the divergence between the observed noisy speech signal and the speech base. In extensive test runs on the Buckeye corpus of highly spontaneous speech and the CHiME corpus of nonstationary real-life noise, we observe that semi-supervised learning of noise bases yields the best overall results, while a priori learning of noise bases is useful for speeding up computation.
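
The semi-supervised variant described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the divergence, the update rules, or the reconstruction step, so this example assumes the generalized Kullback-Leibler divergence with standard multiplicative updates and a Wiener-like soft mask. The function name `semi_supervised_nmf` and all parameters (`n_noise`, `n_iter`, `eps`, `seed`) are hypothetical. The speech bases stay fixed, while the noise bases and all activations are fitted to the noisy magnitude spectrogram.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise=4, n_iter=100, eps=1e-9, seed=0):
    """Illustrative sketch (assumed KL-divergence multiplicative updates).

    V        : (F, T) non-negative magnitude spectrogram of the noisy speech
    W_speech : (F, Ks) pre-trained speech bases, kept fixed
    Returns the speech estimate, the learned noise bases, and the activations.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    Ks = W_speech.shape[1]
    # Noise bases and all activations start from random positive values.
    W_noise = rng.random((F, n_noise)) + eps
    H = rng.random((Ks + n_noise, T)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # Update the activations of speech and noise bases.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)
        # Update only the noise bases; the speech bases are never modified.
        R = V / (W @ H + eps)
        H_noise = H[Ks:]
        W_noise *= (R @ H_noise.T) / (H_noise.sum(axis=1)[None, :] + eps)

    W = np.hstack([W_speech, W_noise])
    speech_part = W_speech @ H[:Ks]
    # Soft mask: the share of the model attributed to the speech bases.
    mask = speech_part / (W @ H + eps)
    return mask * V, W_noise, H
```

Under the same assumptions, fixing the noise bases as well (skipping the `W_noise` update and passing in bases learned beforehand) would correspond to the a priori setting, which saves the per-utterance basis update.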