Gender Voice Imbalance Classification Comparisons and Analysis

Conference: ICMLCA 2021 - 2nd International Conference on Machine Learning and Computer Application
12/17/2021 - 12/19/2021 in Shenyang, China

Proceedings: ICMLCA 2021

Pages: 5
Language: English
Type: PDF

Authors:
Peng, Yunfei (Lassonde School of Engineering, York University, Toronto, Canada)

Abstract:
This research aims to solve the problem of data imbalance and improve the accuracy of gender voice recognition. A binary multi-layer perceptron (BMLP) is a feedforward artificial neural network (ANN) with more layers than a binary single-layer perceptron (BSLP). A spectrogram is a representation of an audio signal that captures its time-varying frequency content. This paper describes how a BMLP can be applied to spectrograms of male and female voices, using three approaches to balance the data before recognizing the gender: copying minority-class samples, adding noise, and SMOTE. The approaches are implemented and tested in Keras. With these approaches, the neural network's classification performance improves by 5%, and a classification accuracy of over 93% is obtained in experiments on our own dataset.
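
The three balancing approaches named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy two-feature vectors standing in for flattened spectrograms, the noise level `sigma`, and the neighbour count `k` are all assumptions made for illustration, and the SMOTE-style function is a simplified interpolation rather than a full SMOTE implementation.

```python
import random


def oversample_by_copying(minority, n_new, rng):
    """Duplicate randomly chosen minority samples (the 'copying' approach)."""
    return [list(rng.choice(minority)) for _ in range(n_new)]


def oversample_with_noise(minority, n_new, rng, sigma=0.01):
    """Copy minority samples and perturb every feature with Gaussian noise."""
    return [
        [x + rng.gauss(0.0, sigma) for x in rng.choice(minority)]
        for _ in range(n_new)
    ]


def oversample_smote_like(minority, n_new, rng, k=3):
    """Interpolate between a sample and one of its k nearest minority neighbours.

    Simplified SMOTE-style sketch; needs at least two minority samples.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: sq_dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position on the segment from base to other
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (assumed): 8 majority-class vectors and 3 minority-class vectors.
rng = random.Random(0)
majority = [[rng.random(), rng.random()] for _ in range(8)]
minority = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25]]
n_new = len(majority) - len(minority)  # samples needed to balance the classes

balanced = {
    "copying": minority + oversample_by_copying(minority, n_new, rng),
    "noise": minority + oversample_with_noise(minority, n_new, rng),
    "smote_like": minority + oversample_smote_like(minority, n_new, rng),
}

for name, samples in balanced.items():
    print(name, len(samples))
```

In each case the minority class is grown to the size of the majority class before training, so the classifier sees both genders equally often. Copying adds no new information, noise adds small variations around existing samples, and the SMOTE-style interpolation creates points between existing minority samples.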