Exploring Nonnegative Matrix Factorization for Audio Classification: Application to Speaker Recognition

Conference: Sprachkommunikation - Beiträge zur 10. ITG-Fachtagung
09/26/2012 - 09/28/2012 in Braunschweig, Germany

Proceedings: Sprachkommunikation

Pages: 4
Language: English
Type: PDF

Joder, Cyril; Schuller, Björn (Institute for Human-Machine Communication, Technical University Munich, 80333 München, Germany)

In this paper, we test the use of Nonnegative Matrix Factorization (NMF) for feature extraction in the context of audio classification. NMF computes a decomposition of the spectrogram into nonnegative factors and has been successfully applied to audio source separation. It therefore has the potential to be robust to noise when used for feature calculation. We introduce two feature sets derived directly from the NMF decomposition. Experiments on an 8-class speaker recognition task with Support Vector Machines show that the proposed representations convey information complementary to the baseline MFCC features. Indeed, using only the NMF-based descriptors leads to results similar to those of the reference features, and combining the two representations yields a significant improvement in accuracy.
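
As a rough illustration of the decomposition described in the abstract, the sketch below factors a small nonnegative matrix V (standing in for a magnitude spectrogram) into nonnegative factors W and H and derives a clip-level descriptor from the activations. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not specify the NMF cost function or the two feature sets, so the Euclidean multiplicative updates, the toy data, and the mean-activation feature are all assumptions chosen for illustration, as are the names `nmf`, `rank`, and `features`.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (frequency x time) as W @ H.

    Uses multiplicative updates for the Euclidean cost; this choice of
    cost is an assumption, the abstract does not state which one is used.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random nonnegative initialisation of spectral bases and activations.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H
        # stay nonnegative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram": 16 frequency bins x 40 frames built from 2 templates.
rng = np.random.default_rng(1)
V = rng.random((16, 2)) @ rng.random((2, 40))

W, H = nmf(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Hypothetical clip-level descriptor: mean activation of each component.
# The feature sets proposed in the paper may be defined differently.
features = H.mean(axis=1)
```

In a classification setting, a vector such as `features` could be computed per recording and passed to a classifier, alone or concatenated with MFCC-based features; how the paper combines the representations is not detailed in the abstract.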