Robust Speaker Identification by Fusing Classification Scores with a Neural Network

Conference: Speech Communication - 13. ITG-Fachtagung Sprachkommunikation
10.10.2018 - 12.10.2018 in Oldenburg, Germany

Proceedings: ITG-Fb. 282: Speech Communication

Pages: 5
Language: English
Type: PDF

Wilkinghoff, Kevin; Baggenstoss, Paul M.; Cornaggia-Urrigshardt, Alessia; Kurth, Frank (Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE, Fraunhoferstr. 20, 53343 Wachtberg, Germany)

Score-based fusion of multiple independent models is widely used for speaker identification because it significantly reduces the identification error rate. In this work, a speaker identification system is proposed for low-quality speech that has been transmitted over telephone and communication channels. The system consists of 15 models based on 5 features, together with a neural network that fuses the classification scores produced by the individual models. Its performance is evaluated in closed-set speaker identification experiments conducted on the Switchboard corpus. Furthermore, the proposed neural network architecture is compared with other fusion techniques: taking the mean, majority voting, an evolutionary algorithm, and logistic regression.
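
To illustrate what score-based fusion means, the following is a minimal sketch of the two simplest baselines named in the abstract, mean fusion and majority voting. It is not the authors' implementation: the function names `fuse_mean` and `fuse_majority`, the score matrix, and the assumption that all models output comparable (for example, normalized) scores are hypothetical and chosen only for illustration.

```python
import numpy as np


def fuse_mean(scores: np.ndarray) -> int:
    """Average the scores over all models and pick the best speaker.

    scores has shape (n_models, n_speakers); assumed comparable across models.
    """
    return int(np.argmax(scores.mean(axis=0)))


def fuse_majority(scores: np.ndarray) -> int:
    """Let each model vote for its top speaker and pick the most-voted one.

    Ties are broken in favour of the lowest speaker index (an arbitrary choice).
    """
    votes = np.argmax(scores, axis=1)  # one vote per model
    counts = np.bincount(votes, minlength=scores.shape[1])
    return int(np.argmax(counts))


# Hypothetical scores: 3 models (rows) x 3 speakers (columns).
scores = np.array([
    [0.90, 0.10, 0.00],  # model 0 is very confident in speaker 0
    [0.40, 0.60, 0.00],  # model 1 slightly prefers speaker 1
    [0.45, 0.55, 0.00],  # model 2 slightly prefers speaker 1
])

print(fuse_mean(scores))      # 0: one confident model dominates the average
print(fuse_majority(scores))  # 1: two of three models vote for speaker 1
```

The two rules disagree on this toy input because the mean keeps the magnitude of each score while majority voting discards it. A trained fusion stage, such as logistic regression or a neural network, would learn how to weight the individual models instead of using a fixed rule.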