Unveiling Deep Speech Embeddings: Acoustic Insights into Hatespeech Detection
Conference: Speech Communication - 16th ITG Conference
09/24/2025 - 09/26/2025 at Berlin, Germany
Proceedings: ITG-Fb. 321: Speech Communication
Pages: 5
Language: English
Type: PDF
Authors:
Rammohan, Rathi Adarshi; Ren, Zhao; Swiderska, Aleksandra; Kuester, Dennis; Schultz, Tanja
Abstract:
Modern democratic societies face a growing challenge from hate speech, yet most research focuses on detecting hate in written content (hatetext) and overlooks valuable acoustic cues. To explore the potential of spoken content (hatespeech), this study addresses hate detection using acoustic features and self-supervised speech embeddings on the HateMM dataset. We performed a layer-wise analysis of two fine-tuned speech models, Wav2Vec2.0 and HuBERT. A linear classifier yielded F1 scores of 0.8137 and 0.8106, demonstrating the effectiveness of these models. To further interpret them, canonical correlation analysis (CCA) was applied to measure the similarity between hand-crafted acoustic features and learned speech embeddings. Our findings highlight the role of energy-related features, embedded in the layers of speech models, in distinguishing hatespeech from non-hatespeech.
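The CCA step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which CCA variant or toolkit was used, so the code below assumes plain linear CCA computed via QR decomposition and SVD, and it runs on synthetic data rather than HateMM. The function name `canonical_correlations`, the array shapes, and the "related" and "unrelated" layer stand-ins are all hypothetical.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two views with matching rows.

    X: (n_samples, p) array, e.g. hand-crafted acoustic features.
    Y: (n_samples, q) array, e.g. embeddings from one model layer.
    Assumes n_samples > max(p, q) and full column rank in both views.
    """
    # Center each view so that correlations are computed about the mean.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the two views.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    # Singular values of Qx^T Qy are the canonical correlations.
    rho = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(rho, 0.0, 1.0)


# Synthetic stand-ins (assumption: no real audio or model is involved).
rng = np.random.default_rng(0)
n_samples = 500
acoustic = rng.normal(size=(n_samples, 4))      # e.g. energy-like features
mixing = rng.normal(size=(4, 16))

# A "layer" that linearly encodes the acoustic features plus small noise.
layer_related = acoustic @ mixing + 0.1 * rng.normal(size=(n_samples, 16))
# A "layer" that carries no information about the acoustic features.
layer_unrelated = rng.normal(size=(n_samples, 16))

rho_related = canonical_correlations(acoustic, layer_related)
rho_unrelated = canonical_correlations(acoustic, layer_unrelated)

print("mean CCA, related layer:  ", rho_related.mean())
print("mean CCA, unrelated layer:", rho_unrelated.mean())
```

Averaging the canonical correlations gives one similarity score per layer, so repeating the computation for every layer yields a layer-wise profile of how strongly a given feature group is reflected in the embeddings. With high-dimensional embeddings and few samples, plain CCA tends to overestimate similarity, so a regularized or projection-based variant may be preferable in practice.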

