Time-frequency Dependent Multichannel Voice Activity Detection

Conference: Speech Communication - 11. ITG-Fachtagung Sprachkommunikation
24-26 September 2014, Erlangen, Germany

Proceedings: Speech Communication

Pages: 4
Language: English
Type: PDF

Authors:
Stenzel, Sebastian; Freudenberger, Juergen (Institute for System Dynamics, HTWG Konstanz, University of Applied Sciences, Germany)

Abstract:
This work proposes a method to determine voice activity for each time-frequency point in the noisy signals of a microphone array. Both the speech signal and the noise signals are modeled as multivariate Gaussian random variables. Based on this signal model, a generalized likelihood ratio test is derived. The test can be simplified to a threshold test that compares the current a posteriori signal-to-noise ratio at each time-frequency point with a predetermined threshold. Both the theoretical and the simulation results indicate that this test approximates voice activity well.
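
The threshold test described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the multichannel a posteriori signal-to-noise ratio, so the code below assumes one common form, the quadratic form y^H Phi_n^{-1} y with a known noise covariance matrix Phi_n per frequency bin. The function names, the array layout, and the threshold value are hypothetical and are not taken from the paper.

```python
import numpy as np


def a_posteriori_snr(Y, noise_cov):
    """Assumed multichannel a posteriori SNR per time-frequency point.

    Y         : complex STFT coefficients, shape (M, F, T)
                (M microphones, F frequency bins, T frames).
    noise_cov : noise covariance matrices, shape (F, M, M),
                assumed known (e.g. estimated during noise-only frames).

    Returns gamma with shape (F, T), where
    gamma[f, t] = y^H * inv(noise_cov[f]) * y  (an assumption, see lead-in).
    """
    inv_cov = np.linalg.inv(noise_cov)  # batched inverse, shape (F, M, M)
    # Sum over both channel indices m and n for every (f, t) point.
    gamma = np.einsum("mft,fmn,nft->ft", Y.conj(), inv_cov, Y)
    return gamma.real  # the quadratic form is real for Hermitian matrices


def tf_voice_activity(Y, noise_cov, threshold):
    """Boolean mask (F, T): True where the assumed SNR exceeds the threshold."""
    return a_posteriori_snr(Y, noise_cov) > threshold


if __name__ == "__main__":
    M, F, T = 2, 3, 4
    Y = np.zeros((M, F, T), dtype=complex)
    Y[:, 1, 2] = 3.0  # a single strong time-frequency point
    noise_cov = np.tile(np.eye(M), (F, 1, 1))  # unit-variance white noise
    mask = tf_voice_activity(Y, noise_cov, threshold=5.0)  # hypothetical threshold
    print(mask)
```

In this sketch the threshold is a free parameter; in the paper it is described only as predetermined, and how it is chosen is not stated in the abstract.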