Joint Reduction of Ego-noise and Environmental Noise with a Partially-adaptive Dictionary

Conference: Speech Communication - 14th ITG Conference
29.09.2021 - 01.10.2021, online

Proceedings: ITG-Fb. 298: Speech Communication

Pages: 5 | Language: English | Type: PDF

Fang, Huajian (Signal Processing (SP), Universität Hamburg, Hamburg, Germany & Knowledge Technology (WTM), Universität Hamburg, Hamburg, Germany)
Carbajal, Guillaume; Gerkmann, Timo (Signal Processing (SP), Universität Hamburg, Hamburg, Germany)
Wermter, Stefan (Knowledge Technology (WTM), Universität Hamburg, Hamburg, Germany)

We consider the problem of simultaneously reducing ego-noise, i.e., the noise produced by a robot, and environmental noise. Both noise types may occur at the same time for humanoid interactive robots. Dictionary- and template-based approaches have been proposed for ego-noise reduction. However, most of them lack adaptability to unseen noise types and thus exhibit limited performance in real-world scenarios with environmental noise. Recently, a variational autoencoder (VAE)-based speech model combined with a fully-adaptive dictionary-based noise model, i.e., non-negative matrix factorization (NMF), has been proposed for environmental noise reduction, showing good adaptability to unseen noise data. In this paper, we propose to extend this framework with a partially-adaptive dictionary-based noise model, in which one part of the dictionary adapts to unseen environmental noise while the part pre-trained on ego-noise is kept unchanged. We demonstrate that, with appropriately chosen dictionary sizes, the partially-adaptive approach outperforms approaches based on either a fully-adaptive or a completely-fixed dictionary.
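
To make the idea of a partially-adaptive dictionary concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain NMF noise model with a Euclidean cost and standard multiplicative updates, and it omits the VAE speech model entirely; the divergence used in the paper may differ. The function name `partially_adaptive_nmf` and all parameter names are hypothetical. The fixed columns stand in for a dictionary pre-trained on ego-noise, and only the remaining columns and the activations are updated.

```python
import numpy as np


def partially_adaptive_nmf(V, W_fixed, n_adaptive, n_iter=100, eps=1e-12, seed=0):
    """Approximate V by [W_fixed, W_adapt] @ H with non-negative factors.

    Only W_adapt and H are updated; W_fixed is never changed.
    Assumption: Euclidean cost with multiplicative updates (illustrative only).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_fixed = W_fixed.shape[1]

    # Full dictionary: pre-trained (fixed) columns followed by adaptive columns.
    W_adapt = rng.random((n_freq, n_adaptive)) + eps
    W = np.hstack([W_fixed, W_adapt])
    H = rng.random((n_fixed + n_adaptive, n_frames)) + eps

    errors = []
    for _ in range(n_iter):
        # Activations are updated for all atoms, fixed and adaptive alike.
        H *= (W.T @ V) / (W.T @ W @ H + eps)

        # Dictionary update restricted to the adaptive columns.
        H_adapt = H[n_fixed:]
        numerator = V @ H_adapt.T
        denominator = (W @ H) @ H_adapt.T + eps
        W[:, n_fixed:] *= numerator / denominator

        errors.append(float(np.linalg.norm(V - W @ H) ** 2))
    return W, H, errors


# Toy usage with random non-negative data standing in for a noise spectrogram.
demo_rng = np.random.default_rng(1)
V_demo = demo_rng.random((20, 30))   # hypothetical magnitude spectrogram
W_ego = demo_rng.random((20, 4))     # hypothetical pre-trained ego-noise atoms
W_out, H_out, err = partially_adaptive_nmf(V_demo, W_ego, n_adaptive=3, n_iter=50)
```

Setting `n_adaptive=0` in this sketch corresponds to a completely-fixed dictionary, while passing an empty `W_fixed` (zero columns) corresponds to a fully-adaptive one, so the split between the two parts is the size trade-off the abstract refers to.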