Next: List of Participants
Up: FINAL REPORT TO NSF
Previous: Final Remarks
Some of the goals achieved by the workshop are:
- It produced a vocabulary list of signals that characterize faces and
their motions.
- Because the applications are so diverse, no single standard seems
appropriate, but a minimum specification of facial geometry and
function is required.
- FACS is widely used in facial animation but needs more details
regarding lip movements and the temporal data of muscle actions.
- Validation of facial models and controls is an important
problem and different techniques were suggested.
- A videotape compiling work by the participants is being produced.
In comparing the list of phenomena to be modeled to the list of
modeling techniques, it is clear that researchers have been working
effectively to address many of the fundamental modeling problems of
facial animation. However, in spite of the array of advanced modeling
technologies that have been introduced into the field, it is still
far from trivial for an end user to create a model of a specific
person's face and to make it speak or perform other complex behaviors.
In order to enable practical applications of facial animation, five
basic research tasks should be undertaken: understand application
needs, develop a data description language, collect an extensive
database, formalize and validate modeling techniques, and perform
basic research into modeling techniques. As improved computing
technology becomes available, new applications of facial modeling are
becoming feasible and cost-effective. These research tasks are
designed to encourage efficient development of these applications.
- Understanding application needs: None of the modeling topics is
completely application independent since the application will at least
determine to what level of detail certain phenomena are to be modeled.
Researchers currently have few guidelines beyond common sense for
determining which phenomena may safely be ignored in a simulation
while still meeting the needs of the application. A detailed analysis
of the potential applications for facial modeling would be a great
benefit to researchers wishing to develop useful facial animation
techniques.
- Development of a language for describing facial data: The
facial modeling community currently lacks a standard method for
recording facial models. The FACS system has been an invaluable tool
for organizing research into facial actions, but it does not address
geometric or physical properties of the face. In addition, FACS is
not a computer standard, and implementations of FACS vary widely. It
is likely that facial modeling will be able to benefit from the more
general work in progress within the CAD and computer animation
communities by providing face-specific extensions to standard file
formats. Whatever approach is taken to defining a standard, the
standard will no doubt evolve rapidly as new modeling techniques enter
common use, so it should be built within a flexible framework. The
format should be capable of representing information collected from
living human faces in addition to purely synthetic faces in order to
facilitate both validation and performance-driven hybrid models.
- Collecting data: Several of the validation techniques described
above rely on comparing model-generated data with data collected
from living human faces. To facilitate validation and to provide
insight into the structure and behavior of the face, the research
community should be constructing a database of information about the
faces of a wide range of human subjects. This task would include
recording as much raw data as possible about the physical phenomena of
the face. This effort would no doubt overlap with the effort
described above to define a language for describing facial data, since
it is through that language that the data would be stored and
processed.
- Formalization and validation of modeling techniques: Now that
facial modeling has been successfully applied to a range of
application types, a standard and robust implementation of these
techniques should be made available over the Internet for
incorporation into new applications. This approach has proven
extremely valuable for numerical algorithms and other computing
tools. Together with the database of human subjects described above,
this network resource should encourage application developers to
incorporate sophisticated facial models in ways that might otherwise
have been rejected as too difficult.
- Further development of modeling techniques: Facial modeling is
still a very new and promising field of research. The
human face is perhaps one of the most difficult objects to model both
because of its inherent complexity and because of the special
attention human observers often pay to even the slightest details of
shape, color, and motion. Many of the modeling techniques discussed
in this section have only begun to be explored and others should be
expanded and refined to incorporate advances from pure computer
science and related engineering specialties.
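The data description and data collection tasks above can be illustrated with a small sketch. It assumes a hypothetical interchange record (the class `FaceRecord` and all of its field names are inventions for illustration, not a proposed standard) that can hold either captured or synthetic faces, leaves room for extensions, and supports one simple validation measure: the root-mean-square distance between model-generated and measured vertices.

```python
import json
import math
from dataclasses import dataclass, field, asdict


@dataclass
class FaceRecord:
    """One face in a hypothetical interchange format (field names are assumptions)."""
    subject_id: str
    source: str                                     # "captured" or "synthetic"
    vertices: list = field(default_factory=list)    # [[x, y, z], ...] surface points
    actions: dict = field(default_factory=dict)     # action name -> intensity
    extensions: dict = field(default_factory=dict)  # room for technique-specific data


def dumps(record):
    """Serialize a record to JSON text."""
    return json.dumps(asdict(record), sort_keys=True)


def loads(text):
    """Rebuild a record from JSON text produced by dumps()."""
    return FaceRecord(**json.loads(text))


def rms_error(model_vertices, measured_vertices):
    """Root-mean-square distance between corresponding 3D points."""
    if len(model_vertices) != len(measured_vertices) or not model_vertices:
        raise ValueError("vertex lists must be non-empty and of equal length")
    total = 0.0
    for p, q in zip(model_vertices, measured_vertices):
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / len(model_vertices))
```

Keeping captured and synthetic faces in the same record type is what makes the comparison in `rms_error` straightforward; a real format would also need units, coordinate conventions, and timing data, which this sketch omits.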
FACS is the premier notation scheme in use for facial animation.
However, FACS was designed as a recognition scheme: its actions were
defined as cognitively/visually distinct units rather than as
minimally generable units. In particular, (1) actions are imprecisely
defined, and (2) actions combine in an unpredictable manner.
The imprecise definition is necessary with respect to the initial
intent of FACS to notate general expression usage on human faces.
However, describing precise animations can require the manipulation of
the face at a sub-FACS level. The inability of FACS to predict exactly
the result of a sequence of action units (AUs) is irrelevant to notators. A
notator must be aware of what actions may not be currently visible;
however, an animation control system must be able to explicitly state
whether an action is visible. (Alternatively, facial region changes
must be controllable at a sub-AU level, implying multiple levels of
control even within the realm of FACS.) That FACS continues to
function so well despite these weaknesses, in a manner almost
completely converse to its initial design, testifies to its
flexibility and accuracy.
A general notation scheme for describing changes to the human face
needs to be designed. The general philosophy of FACS (i.e.,
representable actions are described in parallel with a structural
model of matching complexity) will be maintained. We will refer to
this scheme as FACS+.
Control systems can be defined at several levels. The face suggests
at least four: (1) geometric, (2) structural, (3) expressive, and (4)
conversational. Geometric facial models manipulate the object at a
purely physical level. Structural models represent the face in terms
of active regions, taking a simplistic view of the underlying
geometry. Expressive models work at a coarser feature level,
animating the face based on its most obvious features. Conversational
representation and control operate at an even higher level, dealing
with emotional intent and general facial actions.
FACS operates at a structural level. Its ability to operate at a
geometric level is limited due to its lack of definition at that low a
level. Likewise, its ability to operate at higher levels has been
well-researched, but exact mappings from intent to FACS AUs are
specified in an ad hoc manner. FACS+ will also operate at a structural
level. We assume other schemes will be used to operate on geometric and
feature-based models; FACS+ will act as an intermediary between the two.
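As a minimal sketch of the four control levels and of the intermediary role described for FACS+, the levels can be written as an ordered enumeration. The names `ControlLevel`, `adjacent_levels`, and `FACS_PLUS_LEVEL` are assumptions made for this illustration only.

```python
from enum import IntEnum


class ControlLevel(IntEnum):
    """The four control levels, ordered from lowest to highest."""
    GEOMETRIC = 1
    STRUCTURAL = 2
    EXPRESSIVE = 3
    CONVERSATIONAL = 4


def adjacent_levels(level):
    """Return the levels directly below and above `level` (None at either end)."""
    members = list(ControlLevel)
    i = members.index(level)
    below = members[i - 1] if i > 0 else None
    above = members[i + 1] if i < len(members) - 1 else None
    return below, above


# FACS and FACS+ both operate at the structural level.
FACS_PLUS_LEVEL = ControlLevel.STRUCTURAL
```

Calling `adjacent_levels(FACS_PLUS_LEVEL)` returns the geometric and expressive (feature-based) levels, the two levels that a structural scheme would need to link downward and upward to.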
Issues involved in the replacement of FACS center on its relationship to
lower and higher level schemes as well as extensions needed to more
fully represent the face. Based on this, we have identified the
following areas of concern when extending FACS to produce FACS+:
- Downward links to physical model controls: Mappings between FACS+
and lower-level controls need to be defined, as do methods of
controlling the mappings.
- Upward links to feature-based model controls: Likewise, a control
scheme operating at a gross feature level must be mappable and easily
remappable into FACS+ controls.
- Static definition of expression changes: Actions need to be defined
precisely (unambiguously) in an appropriate manner - a single
primitive action should not possess variants or options. Note that
feature-based model controls allow the creation of macro action
options.
- Expression dynamics: Hooks should be present for expressing the
dynamics of action scripts.
- Muscle intensity: Better control of action intensity needs to be
added. A simple linear scale from 0 (no action) to 1 (full intensity)
may be sufficient.
- Extensibility: The system should be extensible to allow for
alternate physiological structures. This includes the redefinition of
actions to account for atypical faces (e.g., scarring) as well as
alternate architectures (non-human faces).
- Fine/Coarse definition: Certain areas of the face need finer
definition. In particular, mouth and lip actions need improvement.
Other FACS actions based on timing (blink, wink, etc.) or intensity
(various lip pull actions) need to be redefined to eliminate temporal
and intensity-based components.
- Tongue action: Actions of the tongue need to be developed. These are
limited to gross actions (position and shape of the tongue), and do
not include interactions between the tongue and other facial
structures.
- Interactions: A method of further defining interactions between
structures needs to be developed (e.g., cheek thrust). Although these
are handled at a geometric level by simple interaction tests and
constraint propagation, a FACS+ representation will provide a
necessary link between the physically based models and high-level
feature models.
- External interactions: Likewise, a general method needs to be
defined to handle the effects on facial structures due to external
influences. These include but are not limited to gravity on slack
faces and cheek puffing.
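Several of the concerns listed above (a 0 to 1 intensity scale, downward and upward links, and hooks for dynamics) can be sketched together. This is only an illustration under stated assumptions: the action names, macro names, muscle-control names, and weights are hypothetical, and linear keyframe interpolation is just one possible way to express dynamics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A single primitive action with no variants; intensity runs from 0 to 1."""
    name: str
    intensity: float

    def __post_init__(self):
        if not 0.0 <= self.intensity <= 1.0:
            raise ValueError(f"intensity {self.intensity} outside [0, 1]")


# Downward link: action name -> {low-level control: weight}. Hypothetical values.
DOWNWARD = {
    "lip_corner_pull": {"zygomatic_major_L": 1.0, "zygomatic_major_R": 1.0},
    "brow_raise": {"frontalis_L": 0.8, "frontalis_R": 0.8},
}

# Upward link: feature-level macro -> [(action name, relative intensity)].
MACROS = {
    "smile": [("lip_corner_pull", 1.0)],
    "surprise": [("brow_raise", 1.0)],
}


def expand_macro(macro, strength):
    """Map a feature-level macro onto primitive actions (upward link)."""
    return [Action(name, rel * strength) for name, rel in MACROS[macro]]


def to_controls(actions):
    """Map primitive actions onto low-level controls, clamped to 1 (downward link)."""
    controls = {}
    for action in actions:
        for control, weight in DOWNWARD[action.name].items():
            value = controls.get(control, 0.0) + weight * action.intensity
            controls[control] = min(value, 1.0)
    return controls


def intensity_at(keyframes, t):
    """Linearly interpolate intensity from time-sorted (time, intensity) keyframes."""
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return keyframes[-1][1]
```

Because both mappings are plain tables, remapping a macro or a control amounts to editing a dictionary entry, which is one simple reading of the "easily remappable" requirement.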
Efforts are already underway to extend FACS by defining methods to
automate FACS coding and to extend and improve the modeling, especially
within the context of simulations, animations and human-machine interaction.
Ekman and Sejnowski [39] are at present developing a neural net
approach for recognizing FACS AUs. Yacoob and Davis [152] have
developed a facial expression recognition system based on FACS. Essa
and Pentland [44][42] are concentrating on extending the FACS model to
FACS+ by observing real people making expressions and extracting
spatial and temporal information from video to describe facial motion.