Scoring and Re-Ranking of ASR Hypotheses Using Phoneme Error Models

Conference: Speech Communication - 11. ITG-Fachtagung Sprachkommunikation
09/24/2014 - 09/26/2014 at Erlangen, Germany

Proceedings: Speech Communication

Pages: 4
Language: English
Type: PDF

Authors:
Hacker, Martin; Noeth, Elmar (Embedded Systems Initiative (ESI) / Department of Computer Science, University of Erlangen-Nuremberg, Germany)

Abstract:
We present a method for scoring automatic speech recognition (ASR) hypotheses. Each candidate hypothesis is scored in terms of its phonetic confusability with all competing hypotheses in an n-best list. The scores are computed with a probabilistic phoneme error model of the ASR process, which is treated as a black box. One application of the scoring technique is n-best list re-ranking. In this paper, we evaluate and compare different phoneme error models for this task. In conjunction with a decision tree classifier that predicts in which cases re-ranking should be applied, we obtained significant improvements in speech recognition accuracy for both spontaneous and read speech.
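
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that hypotheses are phoneme sequences, that the error model is a small table of made-up substitution, insertion and deletion probabilities, and that a candidate scores higher when, had it been spoken, the error model would plausibly have produced the competing hypotheses. All names (`confusion_logprob`, `score`, `rerank`) and all probability values are hypothetical.

```python
import math

# Hypothetical toy phoneme error model; every value is invented for illustration.
P_CORRECT = 0.9          # a spoken phoneme is recognized correctly
P_SUB_CONFUSABLE = 0.05  # substitution within a confusable pair
P_SUB_OTHER = 0.001      # any other substitution
P_INS = 0.01             # the recognizer inserts a phoneme
P_DEL = 0.01             # the recognizer drops a phoneme
CONFUSABLE = {("m", "n"), ("n", "m"), ("p", "b"), ("b", "p")}


def sub_logprob(spoken, recognized):
    """Log-probability that one spoken phoneme is recognized as another."""
    if spoken == recognized:
        return math.log(P_CORRECT)
    if (spoken, recognized) in CONFUSABLE:
        return math.log(P_SUB_CONFUSABLE)
    return math.log(P_SUB_OTHER)


def confusion_logprob(spoken, recognized):
    """Log-probability of the best alignment turning `spoken` into `recognized`."""
    n, m = len(spoken), len(recognized)
    dp = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:  # spoken phoneme deleted
                dp[i][j] = max(dp[i][j], dp[i - 1][j] + math.log(P_DEL))
            if j > 0:  # recognized phoneme inserted
                dp[i][j] = max(dp[i][j], dp[i][j - 1] + math.log(P_INS))
            if i > 0 and j > 0:  # match or substitution
                step = sub_logprob(spoken[i - 1], recognized[j - 1])
                dp[i][j] = max(dp[i][j], dp[i - 1][j - 1] + step)
    return dp[n][m]


def score(i, nbest):
    """Mean log-confusability of hypothesis i with all its competitors (assumed criterion)."""
    others = [h for j, h in enumerate(nbest) if j != i]
    if not others:
        return 0.0
    return sum(confusion_logprob(nbest[i], h) for h in others) / len(others)


def rerank(nbest):
    """Return the n-best list sorted by descending score."""
    order = sorted(range(len(nbest)), key=lambda i: score(i, nbest), reverse=True)
    return [nbest[i] for i in order]


nbest = [("m", "a", "p"), ("n", "a", "p"), ("n", "a", "b")]
reranked = rerank(nbest)
```

In this toy list, `("n", "a", "p")` differs from each competitor by a single confusable phoneme, so it receives the highest score and moves to the top. A real system would estimate the error model from recognizer output and, as the abstract notes, apply the re-ranking only where a classifier predicts it will help.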