Voice Activity Detection Based on Modulation-Phase Differences

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
05.10.2016 - 07.10.2016 in Paderborn, Germany

Proceedings: ITG-Fb. 267: Speech Communication

Pages: 5
Language: English
Type: PDF


Authors:
Graf, Simon (Acoustic Speech Enhancement Research, Nuance Communications Deutschland GmbH, Ulm, Germany & Digital Signal Processing and System Theory, Christian-Albrechts-Universität zu Kiel, Kiel, Germany)
Herbig, Tobias; Buck, Markus (Acoustic Speech Enhancement Research, Nuance Communications Deutschland GmbH, Ulm, Germany)
Schmidt, Gerhard (Digital Signal Processing and System Theory, Christian-Albrechts-Universität zu Kiel, Kiel, Germany)

Abstract:
Many speech processing algorithms rely on voice activity detection (VAD), which separates speech from noise. Several features have been introduced for this task, each exploiting different characteristic properties of speech. In this contribution, we introduce a new feature that is robust against various types of noise. By considering the alternating excitation of low and high frequencies, speech is detected with high confidence. The feature has low computational complexity and copes even with the limited spectral resolution that is typical of in-car communication systems. Combining it with a conventional modulation feature can improve performance further. Our simulations confirm the robustness of the feature and show improved performance compared to established VAD features.
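
The abstract gives no formulas, so the following is only an illustrative sketch of the general idea of comparing modulation phases between a low and a high frequency band. It is not the authors' algorithm. The function name `modulation_phase_difference`, the single probed modulation frequency of 4 Hz, and the use of precomputed per-frame band envelopes are all assumptions made for this example.

```python
import numpy as np

def modulation_phase_difference(env_low, env_high, frame_rate_hz, mod_freq_hz=4.0):
    """Hypothetical feature: -cos of the modulation-phase difference of two bands.

    env_low / env_high: per-frame envelopes (e.g. band powers) of a low and a
    high frequency band, same length. Values near +1 mean the bands alternate
    (anti-phase); values near -1 mean they rise and fall together.
    """
    env_low = np.asarray(env_low, dtype=float)
    env_high = np.asarray(env_high, dtype=float)
    n = np.arange(len(env_low))
    # Complex exponential probing one modulation frequency (assumed: 4 Hz)
    probe = np.exp(-2j * np.pi * mod_freq_hz * n / frame_rate_hz)
    # Remove the mean so the constant part of each envelope does not contribute
    c_low = np.sum((env_low - env_low.mean()) * probe)
    c_high = np.sum((env_high - env_high.mean()) * probe)
    # The angle of the cross term is the phase difference between the bands
    cross = c_low * np.conj(c_high)
    if np.abs(cross) < 1e-12:
        return 0.0  # no modulation at this frequency, phase is undefined
    return float(-np.real(cross) / np.abs(cross))

# Synthetic check: one second of envelopes at 100 frames per second
t = np.arange(100) / 100.0
alternating_low = 1.0 + np.cos(2 * np.pi * 4.0 * t)
alternating_high = 1.0 - np.cos(2 * np.pi * 4.0 * t)
score_alternating = modulation_phase_difference(alternating_low, alternating_high, 100.0)
score_in_phase = modulation_phase_difference(alternating_low, alternating_low, 100.0)
```

In this toy setup the alternating pair scores close to +1 and the in-phase pair close to -1. A practical detector would presumably evaluate such a quantity on short sliding windows and compare it against a threshold, details that the abstract does not specify.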