Robust Speaker Identification by Fusing Classification Scores with a Neural Network

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10/10/2018 - 10/12/2018 in Oldenburg, Germany

Proceedings: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Wilkinghoff, Kevin; Baggenstoss, Paul M.; Cornaggia-Urrigshardt, Alessia; Kurth, Frank (Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE, Fraunhoferstr. 20, 53343 Wachtberg, Germany)

Abstract:
Score-based fusion of multiple independent models is widely used for speaker identification because it significantly reduces the identification error rate. This work proposes a speaker identification system for low-quality speech that has been transmitted over telephone and communication channels. The system consists of 15 models based on 5 features, together with a neural network that fuses the classification scores produced by the individual models. Its performance is evaluated in closed-set speaker identification experiments conducted on the Switchboard corpus. Furthermore, the proposed neural network architecture is compared with other fusion techniques such as taking the mean, majority voting, an evolutionary algorithm, and logistic regression.
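
As an illustration of the two simplest baselines named in the abstract, the following minimal Python sketch fuses per-model classification scores by taking the mean and by majority voting. It is not the authors' implementation: the function names `fuse_mean` and `fuse_majority`, the tie-breaking rule, and the toy score matrix (3 models, 3 speakers) are assumptions made purely for illustration, and the neural network fusion proposed in the paper is not shown.

```python
from collections import Counter


def fuse_mean(scores):
    """Average each speaker's score over all models; return the best speaker index."""
    n_models = len(scores)
    n_speakers = len(scores[0])
    mean_scores = [
        sum(model[s] for model in scores) / n_models for s in range(n_speakers)
    ]
    return max(range(n_speakers), key=lambda s: mean_scores[s])


def fuse_majority(scores):
    """Each model votes for its top-scoring speaker; ties go to the lowest index (assumed rule)."""
    votes = Counter(
        max(range(len(model)), key=lambda s: model[s]) for model in scores
    )
    best = max(votes.values())
    return min(s for s, count in votes.items() if count == best)


# Hypothetical scores: one row per model, one column per enrolled speaker.
toy_scores = [
    [0.5, 0.4, 0.1],
    [0.5, 0.4, 0.1],
    [0.0, 1.0, 0.0],
]

print("mean fusion picks speaker", fuse_mean(toy_scores))
print("majority voting picks speaker", fuse_majority(toy_scores))
```

On this toy input the two rules disagree: two models narrowly prefer speaker 0, while the third strongly prefers speaker 1, so majority voting and mean fusion select different speakers. Differences of this kind are one reason to compare several fusion strategies.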