Keyword Detection for the Activation of Speech Assistants

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10/10/2018 - 10/12/2018 in Oldenburg, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Hirsch, Hans-Guenter; Gref, Michael (Institute for Pattern Recognition, Niederrhein University of Applied Sciences, Krefeld, Germany)

Abstract:
Detecting and recognizing a spoken keyword is the most convenient way to activate a speech assistant. Keyword detection normally has to run on a client system with limited computational resources. At the same time, both false acceptances (detecting a keyword that was not spoken) and false rejections (missing a keyword that was spoken) should occur as rarely as possible. We present a two-stage algorithm designed with low computational requirements as the main criterion. The first stage compares the emission and accumulated probabilities with a set of thresholds during Viterbi decoding with a keyword HMM. To lower the number of erroneous detections, a second processing stage analyzes the mel spectrum of a detected keyword by means of a neural network. First recognition experiments with speech data not containing the keyword show that the first stage alone already leads to a quite low false acceptance rate (FAR). The second stage reduces this FAR considerably.
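
The first stage can be illustrated with a minimal sketch. The abstract does not give the HMM topology, the threshold values, or the exact decision rule, so everything below is an assumption made for illustration: a left-to-right keyword HMM with a shared self-loop and forward transition probability, one threshold on the per-frame emission log-probability along the best path, and one threshold on the length-normalized accumulated log-probability. The function name `detect_keyword` and all parameter names are hypothetical and not taken from the paper.

```python
import numpy as np

def detect_keyword(log_emis, log_self, log_next, emis_thresh, accum_thresh):
    """Viterbi pass over a left-to-right keyword HMM with threshold checks.

    log_emis: (T, S) array of frame-wise log emission probabilities.
    log_self, log_next: log transition probabilities (assumed shared by all states).
    emis_thresh: minimum allowed log emission on the best path (assumption).
    accum_thresh: minimum allowed accumulated log-probability per frame (assumption).
    Returns (accepted, path); path is None if the segment is too short.
    """
    T, S = log_emis.shape
    if T < S:  # the final state cannot be reached in fewer frames than states
        return False, None

    score = np.full((T, S), -np.inf)   # accumulated log-probabilities
    back = np.zeros((T, S), dtype=int)  # backpointers for the best path
    score[0, 0] = log_emis[0, 0]        # the path must start in the first state

    for t in range(1, T):
        for s in range(S):
            stay = score[t - 1, s] + log_self
            move = score[t - 1, s - 1] + log_next if s > 0 else -np.inf
            if move > stay:
                score[t, s], back[t, s] = move + log_emis[t, s], s - 1
            else:
                score[t, s], back[t, s] = stay + log_emis[t, s], s

    # Backtrack from the final state to recover the best state sequence.
    path = [S - 1]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()

    path_emis = log_emis[np.arange(T), path]
    avg_score = score[T - 1, S - 1] / T
    accepted = bool(path_emis.min() >= emis_thresh and avg_score >= accum_thresh)
    return accepted, path


# Toy usage: six frames, three states, emissions that clearly follow 0,0,1,1,2,2.
good = np.log(np.full((6, 3), 0.05))
for t, s in enumerate([0, 0, 1, 1, 2, 2]):
    good[t, s] = np.log(0.9)
accepted, path = detect_keyword(good, np.log(0.5), np.log(0.5),
                                emis_thresh=np.log(0.5), accum_thresh=-1.0)
```

This sketch scores one pre-cut segment and applies the thresholds after backtracking. The abstract describes the comparison as happening during decoding, which suggests that a candidate can be dropped as soon as a threshold is violated. A keyword accepted at this stage would then be passed to the second-stage neural network for verification.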