Speaker vs Noise Conditioning for Adaptive Speech Enhancement
Conference: Speech Communication - 16th ITG Conference
09/24/2025 - 09/26/2025 at Berlin, Germany
Proceedings: ITG-Fb. 321: Speech Communication
Pages: 5 · Language: English · Type: PDF
Authors:
Triantafyllopoulos, Andreas; Tsangko, Iosif; Mueller, Michael; Schroeter, Hendrik; Schuller, Bjoern
Abstract:
Deep neural networks have shown improved noise attenuation and speech quality compared to traditional denoising algorithms. However, their heavy computational requirements make them unsuitable for the constraints of hearing aid devices. To address this, adaptive denoising, which introduces complementary information into the main denoising network, has emerged as an alternative that improves performance while offloading part of the computation to a supporting module that runs externally. Adaptive denoising relies primarily either on speaker fingerprints, an approach also known as personalised speech enhancement, or on fingerprints of the background noise. Our work compares the two in a fair setting for the first time using the DeepFilterNet architecture. We also investigate the extent to which both approaches can facilitate a reduction in the size of the main denoising model, showcasing the promise of using contextual adaptation to reduce the workload of models running on the hearing aid.
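
The sketch below illustrates the general adaptive-denoising idea described in the abstract: a small denoising network is conditioned on an externally computed "fingerprint" embedding, which could encode either the target speaker (personalised enhancement) or the background noise. It is not the paper's DeepFilterNet implementation; all module names, dimensions, and the FiLM-style conditioning mechanism are illustrative assumptions.

```python
# Minimal sketch (assumed design, not the paper's implementation) of a compact
# denoiser conditioned on a speaker or noise fingerprint via FiLM modulation.
import torch
import torch.nn as nn


class FingerprintEncoder(nn.Module):
    """Maps an enrollment clip (speaker) or noise snippet to a fixed embedding.

    In an adaptive setup this module could run off-device, so the on-device
    denoiser only receives the resulting low-dimensional vector.
    """

    def __init__(self, n_mels: int = 64, emb_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(128, emb_dim, kernel_size=3, padding=1),
            nn.ReLU(),
        )

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        # mel: (batch, n_mels, time) -> (batch, emb_dim) via temporal pooling
        return self.net(mel).mean(dim=-1)


class FiLM(nn.Module):
    """Predicts per-channel scale and shift from the conditioning embedding."""

    def __init__(self, emb_dim: int, n_channels: int):
        super().__init__()
        self.proj = nn.Linear(emb_dim, 2 * n_channels)

    def forward(self, h: torch.Tensor, emb: torch.Tensor) -> torch.Tensor:
        # h: (batch, channels, time), emb: (batch, emb_dim)
        gamma, beta = self.proj(emb).chunk(2, dim=-1)
        return gamma.unsqueeze(-1) * h + beta.unsqueeze(-1)


class ConditionedDenoiser(nn.Module):
    """A deliberately small mask estimator whose hidden features are modulated
    by the fingerprint embedding, mimicking the adaptive-denoising idea."""

    def __init__(self, n_freq: int = 257, hidden: int = 96, emb_dim: int = 128):
        super().__init__()
        self.inp = nn.Conv1d(n_freq, hidden, kernel_size=1)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.film = FiLM(emb_dim, hidden)
        self.out = nn.Conv1d(hidden, n_freq, kernel_size=1)

    def forward(self, noisy_mag: torch.Tensor, emb: torch.Tensor) -> torch.Tensor:
        # noisy_mag: (batch, n_freq, time) magnitude spectrogram
        h = torch.relu(self.inp(noisy_mag))
        h = self.film(h, emb)                        # inject speaker/noise context
        h, _ = self.rnn(h.transpose(1, 2))
        mask = torch.sigmoid(self.out(h.transpose(1, 2)))
        return mask * noisy_mag                      # masked (enhanced) magnitude


if __name__ == "__main__":
    enc, den = FingerprintEncoder(), ConditionedDenoiser()
    enrollment = torch.randn(2, 64, 300)             # speaker or noise snippet
    noisy = torch.randn(2, 257, 200).abs()
    enhanced = den(noisy, enc(enrollment))
    print(enhanced.shape)                            # torch.Size([2, 257, 200])
```

Because the fingerprint encoder runs outside the real-time path, the same small on-device denoiser can serve either conditioning variant, which is what makes a fair speaker-vs-noise comparison (and a smaller main model) possible in principle.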

