Next: Mix And Match Up: Control Architecture Examples Previous: Text-To-Visual/Auditory Speech

Lip Contour Parameterization

In this control scheme, the lips of real human faces are analyzed to extract parameters that drive continuous functions fitted to the shape of the lips [6]. Video analysis then makes it possible to synchronize a lip model with the natural voice of the speaker.

A distinctive feature of this scheme is that the lip contour shapes, although highly deformable, follow very regular rules. In fact, the coefficients of the continuous functions can be predicted from three anatomical parameters measured on the speaker's face: (1) the horizontal width and (2) the vertical height of the internal lip contour, and (3) the distance between a vertical profile reference and the lip contact protrusion.
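As a rough illustration of how three measured parameters could drive a contour function, the sketch below maps width, height, and protrusion to a small set of coefficients and samples an internal lip contour from them. The functions `predict_coefficients` and `internal_contour`, the 0.4/0.6 split of the opening height, and the particular sinusoidal and polynomial forms are all assumptions made for illustration; the actual prediction rules and fitted functions of [6] are not given in the text.

```python
import math


def predict_coefficients(width, height, protrusion):
    """Map the three measured parameters to contour coefficients.

    Hypothetical rule for illustration only; width must be positive.
    """
    return {
        "half_width": width / 2.0,
        "upper_amp": 0.4 * height,  # assumed share of the opening above the midline
        "lower_amp": 0.6 * height,  # assumed share of the opening below the midline
        "depth": protrusion,        # carried along for a later 3D extension
    }


def internal_contour(coeffs, n=9):
    """Sample the upper (sinusoidal) and lower (polynomial) arcs at n points."""
    a = coeffs["half_width"]
    xs = [-a + 2.0 * a * i / (n - 1) for i in range(n)]
    # Upper arc: a cosine arch that reaches zero at the lip corners.
    upper = [(x, coeffs["upper_amp"] * math.cos(math.pi * x / (2.0 * a)))
             for x in xs]
    # Lower arc: a parabola that also reaches zero at the lip corners.
    lower = [(x, -coeffs["lower_amp"] * (1.0 - (x / a) ** 2)) for x in xs]
    return upper, lower
```

With this construction the two arcs meet at the lip corners, and the vertical distance between them at the midline equals the measured height, so a change in any measured parameter reshapes the whole contour without per-point editing.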

The process is as follows:

  1. The images of the lips of a real human face uttering coarticulated phonemes are first recorded and then geometrically analyzed.

  2. From this analysis, a set of lip-jaw shapes, representing the "labial space" of a speaker, as well as relevant control parameters, are extracted [8].

  3. The three above-mentioned control parameters predict a set of continuous functions (polynomial and sinusoidal) that best fit the frontal projection of the contours of the "viseme" set. The analysis is extended to 3D [58], where the equations of the lip contours in the coronal plane can be derived. The lip volume is created by linearly interpolating three intermediate contours between the frontal, internal, and external contours of the vermilion zone. For each of the five contours, a function approximates each horizontal projection. Two extra parameters, the distances between the vertical profile reference and (4) the lower and (5) the upper lip protrusions, are necessary to predict all the equations of the 3D model.

  4. The model is animated and synchronized with the natural voice of the speaker, whose lip gestures control the model through real-time video analysis [54].
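The linear interpolation mentioned in step 3 can be sketched as follows. This is a minimal example, assuming that the internal and external contours are sampled at the same number of 3D points in corresponding order; the function name `interpolate_contours` and the point-list representation are hypothetical and not taken from the cited work.

```python
def interpolate_contours(internal, external, n_intermediate=3):
    """Return [internal, intermediates..., external] by linear blending.

    Each contour is a list of coordinate tuples; both contours must be
    sampled at the same number of corresponding points (an assumption
    of this sketch).
    """
    if len(internal) != len(external):
        raise ValueError("contours must have the same number of samples")
    steps = n_intermediate + 1
    contours = []
    for k in range(steps + 1):
        t = k / steps  # 0.0 at the internal contour, 1.0 at the external one
        contours.append([
            tuple((1.0 - t) * p + t * q for p, q in zip(pi, pe))
            for pi, pe in zip(internal, external)
        ])
    return contours
```

With the default of three intermediate contours, the result holds five contours in total, matching the count given in step 3.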





pkitchin@graphics
Thu Nov 17 10:12:34 EST 1994