Time-frequency Dependent Multichannel Voice Activity Detection

Conference: Speech Communication - 11. ITG-Fachtagung Sprachkommunikation
09/24/2014 - 09/26/2014 at Erlangen, Germany

Proceedings: Speech Communication

Pages: 4
Language: English
Type: PDF

Authors:
Stenzel, Sebastian; Freudenberger, Juergen (Institute for System Dynamics, HTWG Konstanz, University of Applied Sciences, Germany)

Abstract:
This work proposes a method to determine voice activity for each time-frequency point in the noisy microphone signals of a microphone array. The speech signal and the noise signals are assumed to be multivariate Gaussian random variables. Based on this signal model, a generalized likelihood ratio test is derived. This likelihood ratio test can be simplified to a threshold test that compares the current a posteriori signal-to-noise ratio for each time-frequency point with a predetermined threshold. The theoretical results as well as the simulation results indicate that voice activity is well approximated.
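
The threshold test described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact multichannel definition of the a posteriori signal-to-noise ratio, so the sketch assumes the per-channel ratio |Y|^2 / noise PSD averaged over the microphones. The function names, array shapes, and the threshold value are hypothetical.

```python
import numpy as np


def a_posteriori_snr(stft, noise_psd):
    """Per time-frequency a posteriori SNR, averaged over microphones.

    stft:      complex array, shape (channels, frames, bins)
    noise_psd: real array, shape (channels, bins), assumed known or estimated
    Averaging across channels is an assumption of this sketch.
    """
    power = np.abs(stft) ** 2
    # Broadcast the noise PSD over the frame axis.
    return np.mean(power / noise_psd[:, None, :], axis=0)


def tf_voice_activity(stft, noise_psd, threshold):
    """Boolean mask, shape (frames, bins): True where the SNR exceeds the threshold."""
    return a_posteriori_snr(stft, noise_psd) > threshold
```

With a two-channel input of unit noise power, a single time-frequency point with an amplitude well above the noise floor would be the only point flagged as voice-active for a moderate threshold.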