Scoring and Re-Ranking of ASR Hypotheses Using Phoneme Error Models

Conference: Speech Communication - 11. ITG-Fachtagung Sprachkommunikation
24-26 September 2014 in Erlangen, Germany

Proceedings: Speech Communication

Pages: 4
Language: English
Type: PDF


Authors:
Hacker, Martin; Noeth, Elmar (Embedded Systems Initiative (ESI) / Department of Computer Science, University of Erlangen-Nuremberg, Germany)

Abstract:
We present a method for scoring automatic speech recognition (ASR) hypotheses. Each candidate hypothesis is scored in terms of its phonetic confusability with all competing hypotheses in an n-best list. The scores are computed with a probabilistic phoneme error model of the ASR process, which is treated as a black box. One application of the scoring technique is n-best list re-ranking. In this paper, we evaluate and compare different phoneme error models for this task. In conjunction with a decision tree classifier that predicts in which cases the re-ranking should be applied, we obtained significant improvements in speech recognition accuracy for both spontaneous and read speech.
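
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy phoneme error model with fixed, made-up probabilities for match, substitution, deletion and insertion, and it assumes that a candidate's score is the sum of Viterbi log-probabilities of the candidate being misrecognised as each competitor in the n-best list. The names `log_confusion`, `score_hypotheses` and `rerank`, the probability constants, and the phoneme symbols are all hypothetical. The decision tree classifier that decides when to apply re-ranking is not covered.

```python
import math

# Hypothetical toy error model; the values are illustrative, not from the paper.
P_MATCH = 0.85  # phoneme recognised correctly
P_SUB = 0.05    # phoneme replaced by a different phoneme
P_DEL = 0.05    # phoneme dropped
P_INS = 0.05    # spurious phoneme inserted


def log_confusion(intended, observed):
    """Best-alignment (Viterbi) log-score of `intended` being output as `observed`."""
    n, m = len(intended), len(observed)
    neg_inf = float("-inf")
    dp = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            cur = dp[i][j]
            if cur == neg_inf:
                continue
            if i < n:  # delete intended[i]
                dp[i + 1][j] = max(dp[i + 1][j], cur + math.log(P_DEL))
            if j < m:  # insert observed[j]
                dp[i][j + 1] = max(dp[i][j + 1], cur + math.log(P_INS))
            if i < n and j < m:  # match or substitute
                p = P_MATCH if intended[i] == observed[j] else P_SUB
                dp[i + 1][j + 1] = max(dp[i + 1][j + 1], cur + math.log(p))
    return dp[n][m]


def score_hypotheses(nbest):
    """Score each candidate by its confusability with all competing hypotheses."""
    return [
        sum(log_confusion(cand, other) for j, other in enumerate(nbest) if j != k)
        for k, cand in enumerate(nbest)
    ]


def rerank(nbest):
    """Return the n-best list sorted by descending confusability score."""
    scores = score_hypotheses(nbest)
    order = sorted(range(len(nbest)), key=lambda k: scores[k], reverse=True)
    return [nbest[k] for k in order]


# Hypothetical 3-best list, each hypothesis given as a phoneme sequence.
nbest = [["k", "a", "t"], ["b", "a", "t"], ["b", "a", "d"]]
best = rerank(nbest)[0]
```

In this toy list, the middle hypothesis differs from each competitor by a single substitution, so it explains the other two entries best and moves to the top. A real error model would use phoneme-specific confusion probabilities estimated from recogniser output rather than the uniform constants assumed here.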