Sign Language Recognition by Combining Statistical DTW and Independent Classification
Published in: | IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, no. 11, pp. 2040-2046 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: | Los Alamitos, CA; IEEE; 01-11-2008; IEEE Computer Society; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Summary: | To recognize speech, handwriting, or sign language, many hybrid approaches have been proposed that combine dynamic time warping (DTW) or hidden Markov models (HMMs) with discriminative classifiers. However, all of these methods rely directly on the likelihood models of DTW/HMM. We hypothesize that time warping and classification should be separated because they place conflicting demands on likelihood modeling. To overcome this restriction, we propose using statistical DTW (SDTW) only for time warping and classifying the warped features with a different method. Two novel statistical classifiers are proposed, combined discriminative feature detectors (CDFDs) and quadratic classification on DF Fisher mapping (Q-DFFM); both use a selection of discriminative features (DFs) and are shown to outperform HMM and SDTW. However, we have found that combining the likelihoods of multiple models in a second classification stage degrades the performance of the proposed classifiers, while improving the performance of HMM and SDTW. A proof-of-concept experiment that combines DFFM mappings of multiple SDTW models with SDTW likelihoods shows that hybrid classification can also provide significant improvement over SDTW when models are combined. Although recognition is mainly based on 3D hand motion features, these results can be expected to generalize to recognition with more detailed measurements such as hand/body pose and facial expression. |
---|---|
ISSN: | 0162-8828, 1939-3539 |
DOI: | 10.1109/TPAMI.2008.123 |
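
The separation described in the summary (use time warping only to align a sequence, then classify the aligned features with an independent method) can be illustrated with a minimal sketch. It rests on several assumptions that are not from the article: plain DTW stands in for statistical DTW, a nearest-class-mean rule stands in for the CDFD and Q-DFFM classifiers, the features are 1-D, and all names (`dtw_path`, `warp_to_reference`, `fit_class_means`, `classify`) and the toy data are hypothetical.

```python
import math


def dtw_path(ref, seq):
    """Plain DTW (assumption: not the article's SDTW); returns aligned (i, j) index pairs."""
    n, m = len(ref), len(seq)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (ref[i - 1] - seq[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def warp_to_reference(ref, seq):
    """Warp seq onto the reference time axis: one averaged value per reference frame."""
    buckets = [[] for _ in ref]
    for i, j in dtw_path(ref, seq):
        buckets[i].append(seq[j])
    return [sum(b) / len(b) for b in buckets]


def fit_class_means(ref, labelled):
    """Average the warped feature vectors per class (hypothetical stand-in classifier)."""
    sums, counts = {}, {}
    for label, seq in labelled:
        warped = warp_to_reference(ref, seq)
        prev = sums.get(label, [0.0] * len(ref))
        sums[label] = [a + b for a, b in zip(prev, warped)]
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in vec] for lab, vec in sums.items()}


def classify(ref, means, seq):
    """Classify on the warped features only; the DTW cost itself is never used."""
    warped = warp_to_reference(ref, seq)
    return min(means, key=lambda lab: sum((a - b) ** 2 for a, b in zip(warped, means[lab])))


# Toy data: both classes share the reference's timing but differ in peak height.
ref = [0.0, 1.0, 2.0, 1.0, 0.0]
train = [
    ("A", [0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0]),
    ("A", [0.0, 1.0, 1.0, 2.0, 1.0, 0.0, 0.0]),
    ("B", [0.0, 0.0, 1.0, 3.0, 1.0, 0.0]),
    ("B", [0.0, 1.0, 3.0, 3.0, 1.0, 0.0, 0.0]),
]
means = fit_class_means(ref, train)
label_a = classify(ref, means, [0.0, 1.0, 2.0, 2.0, 2.0, 1.0, 1.0, 0.0])
label_b = classify(ref, means, [0.0, 0.0, 1.0, 1.0, 3.0, 1.0, 0.0])
```

In this toy setup the warping step removes differences in timing, so the two classes differ only in the warped feature at the peak frame, and the classifier never has to consult the alignment cost. Restricting the distance to such frames would be one simple analogue of selecting discriminative features, though that is an interpretation and not the article's procedure.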