Abstract
Issues in modeling American Sign Language for computer-based recognition. Talk given at the Math & Computer Science Department, Gallaudet University, April 26, 2002.
Full-color handouts (3.8MB) - Black and white handouts (suitable for printing) (545K)
In this talk I will present the state of the art in the field of computer-based American Sign Language (ASL) recognition. ASL is a very complex language, especially on the phonological level, because it is so highly inflected. As a result, signs can appear in an incredibly large number of forms and shapes. Early approaches to ASL recognition ignored this problem: they modeled each appearance of each sign explicitly and trained the recognizer on each of these models. Explicit modeling, however, does not scale at all to large vocabularies, because it is practically impossible to capture enough training examples as the number of models grows.
More recent approaches have attempted to use results from ASL phonology to break down the complexity of the recognition task. The idea behind these approaches is to break down the signs into their constituent phonemes (cheremes), and to train the recognizer on the models for the phonemes. These approaches have much better potential to scale to large vocabularies, because the number of phonemes is limited and relatively small.
In this talk I will present our phoneme-based approach, which uses a modified version of Liddell & Johnson's Movement-Hold model. I will discuss which parts of the Movement-Hold model carry over directly to computer-based recognition, and which parts require modification and why. One particularly thorny problem is how to model simultaneously evolving processes on a computer; for example, the strong and the weak hand move simultaneously in ASL. The Movement-Hold model lumps simultaneously evolving processes together into so-called "articulatory feature bundles." Unfortunately, it is not practical to model these bundles for computer-based recognition. Doing so would require modeling all possible combinations of features explicitly, but there are simply too many of them (roughly 100 million). In a radical departure from the Movement-Hold model, our approach attempts to overcome this problem by modeling simultaneously evolving processes independently of one another.
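The scaling argument can be illustrated with a minimal Python sketch. The channel names and inventory sizes below are hypothetical, chosen only for illustration; they are not the figures behind the estimate of roughly 100 million combinations, nor the actual feature inventory of the Movement-Hold model. The sketch only shows the general effect: modeling every feature combination requires a number of models equal to the product of the channel sizes, whereas modeling each channel independently requires only their sum.

```python
from math import prod

# Hypothetical inventory sizes per articulatory channel.
# These numbers are illustrative assumptions, not values from the talk.
CHANNEL_SIZES = {
    "strong_hand_shape": 30,
    "strong_hand_orientation": 8,
    "strong_hand_location": 20,
    "strong_hand_movement": 10,
    "weak_hand_shape": 30,
    "weak_hand_location": 20,
}


def joint_model_count(sizes):
    """Models needed if every combination of features gets its own model."""
    return prod(sizes.values())


def independent_model_count(sizes):
    """Models needed if each channel is modeled on its own."""
    return sum(sizes.values())


joint = joint_model_count(CHANNEL_SIZES)
independent = independent_model_count(CHANNEL_SIZES)
print(f"joint models: {joint:,}")
print(f"independent models: {independent}")
```

Even with these modest made-up inventories, the joint count runs into the tens of millions while the independent count stays near one hundred, which is why the number of training examples needed differs so sharply between the two designs.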
This is joint work with Dimitris Metaxas (University of Pennsylvania/Rutgers University).
