Visual Modeling and Feature Adaptation in Sign Language Recognition

Conference: Sprachkommunikation 2008 - 8. ITG-Fachtagung
10/08/2008 - 10/10/2008 at Aachen, Germany

Proceedings: Sprachkommunikation 2008

Pages: 4
Language: English
Type: PDF

Dreuw, Philippe; Ney, Hermann (Human Language Technology and Pattern Recognition, RWTH Aachen University, D-52056 Aachen)

We propose a tracking adaptation that recovers from early tracking errors in sign language recognition by optimizing the obtained tracking paths with respect to the word sequences hypothesized by an automatic sign language recognition system. Hand or head tracking is usually optimized only according to a tracking criterion. As a consequence, methods that depend on accurate detection and tracking of body parts lead to recognition errors in gesture and sign language processing. Similar to speaker-dependent feature adaptation methods in automatic speech recognition, we propose an automatic visual alignment of signers for vision-based sign language recognition. Furthermore, we propose generating additional virtual training samples to alleviate the lack-of-data problem in sign language processing, which often leads to “one-shot” trained models. Most state-of-the-art systems are speaker-dependent and treat tracking as a preprocessing step of feature extraction. Experiments on a publicly available benchmark database show that the proposed methods strongly improve the recognition accuracy of the system.