Unveiling Deep Speech Embeddings: Acoustic Insights into Hatespeech Detection

Conference: Speech Communication - 16th ITG Conference
09/24/2025 - 09/26/2025 at Berlin, Germany

Proceedings: ITG-Fb. 321: Speech Communication

Pages: 5
Language: English
Type: PDF

Authors:
Rammohan, Rathi Adarshi; Ren, Zhao; Swiderska, Aleksandra; Kuester, Dennis; Schultz, Tanja

Abstract:
Modern democratic societies face a growing challenge from hate speech, yet most research focuses on hatetext (written content) detection, overlooking valuable acoustic cues. To explore the potential of hatespeech (spoken content), this study focuses on hate detection using acoustic features and self-supervised speech embeddings in the HateMM dataset. We performed a layer-wise analysis of two fine-tuned speech models, Wav2Vec2.0 and HuBERT. A linear classifier yielded F1 scores of 0.8137 and 0.8106, demonstrating their effectiveness. To further interpret these models, canonical correlation analysis (CCA) was applied to measure the similarity between hand-crafted acoustic features and learned speech embeddings. Our findings highlight the role of energy-related features, embedded in layers of speech models, in distinguishing hatespeech from non-hatespeech.
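
The CCA step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the QR/SVD formulation, the use of the mean canonical correlation as a similarity score, and the synthetic data are all assumptions made for illustration. It assumes each row is one utterance, that there are more utterances than feature dimensions, and that both matrices have full column rank.

```python
import numpy as np


def canonical_correlations(features, embeddings):
    """Canonical correlations between two views of the same samples.

    Hypothetical helper: rows are samples, columns are dimensions.
    Assumes more samples than dimensions and full column rank.
    """
    # Center each view so the result is not driven by the column means.
    x = features - features.mean(axis=0)
    y = embeddings - embeddings.mean(axis=0)
    # Orthonormal bases for the column space of each centered view.
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    # The singular values of Qx^T Qy are the canonical correlations.
    rho = np.linalg.svd(qx.T @ qy, compute_uv=False)
    # Clip tiny numerical overshoots outside [0, 1].
    return np.clip(rho, 0.0, 1.0)


def mean_cca_similarity(features, embeddings):
    """Average canonical correlation, used here as one similarity score."""
    return float(canonical_correlations(features, embeddings).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins: 200 utterances, 5 hand-crafted acoustic features.
    acoustic = rng.normal(size=(200, 5))
    # A "layer" whose 3 dimensions are linear mixes of the acoustic features.
    related_layer = acoustic[:, :3] @ rng.normal(size=(3, 3))
    # A "layer" unrelated to the acoustic features.
    unrelated_layer = rng.normal(size=(200, 3))
    print("related layer:  ", mean_cca_similarity(acoustic, related_layer))
    print("unrelated layer:", mean_cca_similarity(acoustic, unrelated_layer))
```

In a layer-wise analysis, such a score would be computed once per model layer against the same acoustic feature matrix, so that layers can be compared by how strongly they encode those features.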
