Binaural Distance Estimation Using a Joint Latent Representation of Acoustic Distance and Direct Path Response

Conference: Speech Communication - 16th ITG Conference
09/24/2025 - 09/26/2025 at Berlin, Germany

Proceedings: ITG-Fb. 321: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Neudek, Daniel; Stodt, Benjamin; Getzmann, Stephan; Martin, Rainer

Abstract:
Estimating the distance to an acoustic source using a binaural receiver is challenging, since binaural signals are highly dependent on room acoustics, the positions of the source and the receiver, and their acoustic properties. For this reason, a large amount of diverse training data and a robust training mechanism are essential to enable generalization to unseen conditions with sufficient accuracy. In addition, regularization methods may be used to prevent overfitting to training conditions. In this work, we introduce a multitask learning approach in which the main task (distance estimation) and an auxiliary task (direct path response estimation) are jointly trained. This ensures that information about room acoustics, including the source and receiver characteristics, is integrated into the latent space. Our experiments demonstrate that the joint learning approach improves both performance in similar acoustic conditions and generalization capabilities in varying conditions.
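
The joint training idea can be illustrated with a minimal sketch: a shared encoder produces one latent vector, and two heads read from it, one predicting the distance (main task) and one predicting the direct path response (auxiliary task). Everything concrete here is an assumption made for illustration and is not taken from the paper: the single tanh layer, the linear heads, the squared-error losses, the weighting factor `aux_weight`, the dimensions, and all names such as `joint_loss` and `W_enc`.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mse(pred, target):
    """Mean squared error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def joint_loss(features, dist_true, dpr_true, params, aux_weight=0.5):
    """Return (total, distance loss, direct path response loss) for one example.

    Hypothetical setup: both heads share the latent vector z, so gradients of
    the auxiliary loss would also shape the representation used for distance.
    """
    # Shared encoder (assumed: one linear layer followed by tanh).
    z = [math.tanh(v) for v in matvec(params["W_enc"], features)]
    # Main task head: scalar distance estimate.
    dist_pred = sum(w * zi for w, zi in zip(params["w_dist"], z))
    # Auxiliary task head: direct path response estimate.
    dpr_pred = matvec(params["W_dpr"], z)
    loss_dist = (dist_pred - dist_true) ** 2
    loss_dpr = mse(dpr_pred, dpr_true)
    # Assumed combination rule: weighted sum of both task losses.
    return loss_dist + aux_weight * loss_dpr, loss_dist, loss_dpr


def rand_matrix(rows, cols):
    """Small random weight matrix, for demonstration only."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n_feat, n_latent, dpr_len = 16, 8, 32  # illustrative sizes, not from the paper
params = {
    "W_enc": rand_matrix(n_latent, n_feat),
    "w_dist": [random.gauss(0.0, 0.1) for _ in range(n_latent)],
    "W_dpr": rand_matrix(dpr_len, n_latent),
}
features = [random.gauss(0.0, 1.0) for _ in range(n_feat)]  # synthetic input
dist_true = 2.0                                             # synthetic target
dpr_true = [random.gauss(0.0, 1.0) for _ in range(dpr_len)]  # synthetic target

total, loss_dist, loss_dpr = joint_loss(features, dist_true, dpr_true, params)
```

Setting `aux_weight` to zero reduces this to single-task distance estimation, which is the natural baseline against which a joint approach would be compared.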