General Detection of Speech Signals in the Time-Frequency Plane

Conference: Speech Communication - 12. ITG-Fachtagung Sprachkommunikation
05.10.2016 - 07.10.2016 in Paderborn, Germany

Proceedings: ITG-Fb. 267: Speech Communication

Pages: 5, Language: English, Type: PDF

Urrigshardt, Sebastian; Kreuzer, Sebastian; Kurth, Frank (Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE, 53343 Wachtberg, Germany)

We consider the problem of general detection of speech signals in the time-frequency plane. In addition to classical speech detection, which aims only at a temporal localization of speech, we also consider scenarios where the frequency regions containing speech (carrier frequencies) are unknown. Since typical applications involve broadband communication signals, we refer to this task as broadband speech detection (BBSD). BBSD systems face a variety of challenges that classical narrowband speech detection approaches are not designed for, above all the unknown carrier frequency of the speech signal to be detected. This paper investigates various feature extraction and classification methods in the context of BBSD preprocessing. With the goal of an efficient and robust BBSD, we systematically analyze suitable cascades of such preprocessing steps. Besides classical audio features, we also consider a class of features that has been developed for communication signals.