Exploring In-Context Learning Capabilities of ChatGPT for Pathological Speech Detection
Conference: Speech Communication - 16th ITG Conference
24.09.2025-26.09.2025 in Berlin, Germany
Proceedings: ITG-Fb. 321: Speech Communication
Pages: 5
Language: English
Type: PDF
Authors:
Amiri, Mahdi; Shahreza, Hatef Otroshi; Kodrasi, Ina
Abstract:
Automatic pathological speech detection approaches have shown promising results, gaining attention as potential diagnostic tools alongside costly traditional methods. Recently, it has been demonstrated that large language models (LLMs) can be leveraged for downstream tasks through few-shot in-context learning. In this paper, we investigate the use of multimodal LLMs, specifically ChatGPT-4o, for automatic pathological speech detection in a few-shot in-context learning setting. Experimental results demonstrate that this approach achieves competitive performance compared to state-of-the-art methods. To further understand its effectiveness, we conduct an ablation study to analyze the impact of different factors, such as input type and system prompts, on the final results. Our findings highlight the potential of multimodal LLMs for further exploration and advancement in automatic pathological speech detection.

