Sound Source Localisation using Neural Networks with Circular Binary Classification

Conference: Speech Communication - 14th ITG Conference
09/29/2021 - 10/01/2021, online

Proceedings: ITG-Fb. 298: Speech Communication

Pages: 5
Language: English
Type: PDF

Schaefer, Magnus; Geyer, Leonie (HEAD acoustics GmbH, Herzogenrath, Germany)

Several approaches exist for localising sound sources. This contribution utilises an eight-channel microphone array that is mounted on an artificial head and spatially samples the direct vicinity of both ears. This allows for a localisation approach that resembles the experience of a human listener while avoiding the limitations of an artificial head (e.g., with respect to front-back confusions). The microphone signals are converted to the frequency domain by a short-time Fourier transform, and the phase information is fed to a convolutional neural network that treats the localisation task as a classification problem. The approach uses binary classifiers in a novel way that implicitly takes the localisation error into account. Different variants of estimating the source position with the circular binary classifier are given. A performance assessment of the localisation system based on real audio recordings is presented. A comparison with a related machine learning approach using regular classification and with two conventional beamforming algorithms highlights the positive impact of the new classifier design.
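
The front-end step described in the abstract (short-time Fourier transform of the microphone signals, keeping only the phase) could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the frame length, hop size, Hann window, function name stft_phase and the random test signal are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np

def stft_phase(signals, frame_len=512, hop=256):
    """Return the STFT phase of a multichannel signal.

    signals: array of shape (channels, samples)
    returns: array of shape (channels, frames, frame_len // 2 + 1), in radians

    frame_len, hop and the Hann window are assumed values for illustration.
    """
    signals = np.asarray(signals, dtype=float)
    n_channels, n_samples = signals.shape
    if n_samples < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (n_samples - frame_len) // hop
    # Index matrix of shape (frames, frame_len) selecting overlapping frames
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signals[:, idx] * window          # (channels, frames, frame_len)
    spectrum = np.fft.rfft(frames, axis=-1)    # one-sided complex spectrum
    return np.angle(spectrum)                  # keep the phase, drop the magnitude

# Eight channels of noise stand in for the array recordings (hypothetical data)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16000))
phase = stft_phase(x)
```

The resulting array has one phase spectrum per channel and frame, which is the kind of multichannel input a convolutional network could consume. How the phase is arranged or normalised before it reaches the network is not stated in the abstract.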