Joint Reduction of Ego-noise and Environmental Noise with a Partially-adaptive Dictionary

Conference: Speech Communication - 14th ITG Conference
09/29/2021 - 10/01/2021, online

Proceedings: ITG-Fb. 298: Speech Communication

Pages: 5
Language: English
Type: PDF


Fang, Huajian (Signal Processing (SP), Universität Hamburg, Hamburg, Germany & Knowledge Technology (WTM), Universität Hamburg, Hamburg, Germany)
Carbajal, Guillaume; Gerkmann, Timo (Signal Processing (SP), Universität Hamburg, Hamburg, Germany)
Wermter, Stefan (Knowledge Technology (WTM), Universität Hamburg, Hamburg, Germany)

We consider the problem of simultaneously reducing ego-noise, i.e., the noise produced by a robot, and environmental noise. Both noise types may occur at the same time for humanoid interactive robots. Dictionary- and template-based approaches have been proposed for ego-noise reduction. However, most of them lack adaptability to unseen noise types and thus exhibit limited performance in real-world scenarios with environmental noise. Recently, a variational autoencoder (VAE)-based speech model combined with a fully-adaptive dictionary-based noise model, i.e., non-negative matrix factorization (NMF), has been proposed for environmental noise reduction, showing decent adaptability to unseen noise data. In this paper, we propose to extend this framework with a partially-adaptive dictionary-based noise model, which partly adapts to unseen environmental noise while keeping the part pre-trained on ego-noise unchanged. We demonstrate that, with appropriately chosen dictionary sizes, the partially-adaptive approach outperforms the approaches based on fully-adaptive and on completely-fixed dictionaries.
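To illustrate the idea of a partially-adaptive dictionary, the following is a minimal sketch and not the authors' implementation. It assumes a plain NMF noise model with a Euclidean cost and standard multiplicative updates, and it omits the VAE-based speech model entirely. The function name `partially_adaptive_nmf` and the parameters `W_fixed` and `n_adaptive` are hypothetical, introduced only for this example. The pre-trained (ego-noise) columns of the dictionary stay fixed, while the remaining columns and all activations are updated on the observed data.

```python
import numpy as np


def partially_adaptive_nmf(V, W_fixed, n_adaptive, n_iter=100, eps=1e-12, seed=0):
    """Factorize V ~= W @ H while keeping the first columns of W fixed.

    V          : non-negative spectrogram, shape (n_freq, n_frames)
    W_fixed    : pre-trained non-negative dictionary, shape (n_freq, k_fixed)
    n_adaptive : number of additional dictionary columns that may adapt
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_fixed = W_fixed.shape[1]

    # Full dictionary = [fixed part | randomly initialized adaptive part].
    W_adaptive = rng.random((n_freq, n_adaptive)) + eps
    W = np.concatenate([np.asarray(W_fixed, dtype=float), W_adaptive], axis=1)
    H = rng.random((k_fixed + n_adaptive, n_frames)) + eps

    for _ in range(n_iter):
        # Activations of all components (fixed and adaptive) are updated.
        H *= (W.T @ V) / (W.T @ W @ H + eps)

        # Only the adaptive columns of the dictionary are updated.
        H_adaptive = H[k_fixed:]
        numerator = V @ H_adaptive.T
        denominator = (W @ H) @ H_adaptive.T + eps
        W[:, k_fixed:] *= numerator / denominator

    return W, H
```

In this sketch, setting `n_adaptive` to zero corresponds to a completely fixed dictionary, while passing an empty `W_fixed` (zero columns) corresponds to a fully-adaptive one; the relative sizes of the two parts are the quantity the abstract refers to as needing to be chosen appropriately.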