1D-DCNN: 1D Dilated CNN for Efficient Depression Diagnosis Based on Audio Signal
Conference: BIBE 2025 - The 8th International Conference on Biological Information and Biomedical Engineering
11 August 2025 to 13 August 2025 in Guiyang, China
Proceedings: BIBE 2025
Pages: 7
Language: English
Type: PDF
Authors:
Huang, Nan; Luo, Yu; Que, Lingyi; Wang, Lu; Li, Xinwei; Li, Zhangyong; Liu, Zhichao
Abstract:
Depression is a prevalent mental disorder characterized by prolonged sadness or loss of interest in activities, and it can even lead to self-harm and suicide. However, owing to low public awareness of depression, discrimination and prejudice, a complex diagnostic process, and the lack of a unified diagnostic standard, patients with depression receive far too little attention and treatment. In this study, we propose a 1D dilated convolutional neural network (1D-DCNN) for efficient depression diagnosis from human speech. After Linear Prediction Coefficients (LPC) and Mel Frequency Cepstral Coefficients (MFCC) are extracted from the preprocessed audio signals, the multimodal features are fused and processed by custom-designed 1D dilated convolutional kernels, followed by nonlinear projection layers for the final classification and diagnosis of depression. The model is evaluated on MODMA, a publicly available dataset of more than 1500 audio signals from 52 subjects. The experimental results show that the proposed model achieves state-of-the-art performance, with an accuracy of 84.27% and an F1-score of 85.58%. The proposed method can serve as an auxiliary clinical tool for screening and evaluating subjects who may have depression.
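
To illustrate the core operation named in the abstract, the following is a minimal NumPy sketch of a 1D dilated convolution applied to a fused feature vector. It is not the authors' implementation: the abstract does not specify the fusion method, kernel sizes, or dilation rates, so the concatenation-based fusion, the toy feature values, and the helper names `dilated_conv1d` and `receptive_field` are all assumptions made for illustration.

```python
import numpy as np


def dilated_conv1d(x, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (cross-correlation convention,
    as used by most deep learning libraries). Hypothetical helper."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k = kernel.shape[0]
    span = (k - 1) * dilation + 1      # input samples covered by one kernel
    out_len = x.shape[0] - span + 1
    if out_len <= 0:
        raise ValueError("input is shorter than the dilated kernel span")
    out = np.zeros(out_len)
    for i in range(k):
        # Kernel tap i reads the input at offset i * dilation.
        out += kernel[i] * x[i * dilation : i * dilation + out_len]
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 dilated conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Toy stand-ins for LPC and MFCC vectors (assumed values, not real features).
lpc = np.arange(4, dtype=float)        # [0, 1, 2, 3]
mfcc = np.arange(4, 8, dtype=float)    # [4, 5, 6, 7]

# Assumption: fusion by simple concatenation.
fused = np.concatenate([lpc, mfcc])

y = dilated_conv1d(fused, [1.0, 1.0, 1.0], dilation=2)
print(y)                               # [ 6.  9. 12. 15.]
print(receptive_field(3, [1, 2, 4]))   # 15
```

With a kernel of size 3, stacking layers with dilations 1, 2, and 4 covers 15 input positions using only three layers, which is the usual motivation for dilation: a wider context at a small parameter cost.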

