The last few years have seen great maturation in the computation speed and control methods needed to portray 3D humans suitable for real interactive applications. I will describe the state of the art through a discussion of the Jack software and its evolution at the University of Pennsylvania. The definition, manipulation, animation, and performance analysis of virtual human figures will be examined. From modeling reasonable body sizes and shapes, through control of the highly redundant body linkage, to simulation of plausible motions, human figures offer numerous computational challenges. Enhanced interactive control is provided by natural behaviors such as multiple constraints, looking, reaching, balancing, lifting, stepping, walking, grasping, and so on. A sense-control-act structure permits reactive behaviors that are locally adaptive to the environment. Parallel "programs" based on finite-state machines and task planners can be used to drive animated human agents through complex tasks. Example situations are drawn from human factors in engineering design, real-time agent simulation, and medical applications.
Virtual humans are here. The last few years have seen great maturation in the computation speed and control methods needed to portray 3D humans suitable for interactive and real-time applications. Current, emergent, or future major applications of virtual humans include:
Besides general industry-driven improvements in the underlying computer and graphical display technologies themselves, virtual humans will enable quantum leaps in applications requiring personal and live participation.
The University of Pennsylvania has tried to lead in research and development of human-like simulated figures. Our interest in human simulation is not unique, but the complex of activities surrounding our approach is. The framework for our research is a software system called Jack [Badl93b]. Jack is an interactive system for definition, manipulation, animation, and performance analysis of virtual human figures. Our philosophy has led to a particular realization of a virtual human model that:
Of course, there are many reasons to design specialized human models that individually optimize character, performance, intelligence, and so on. Many research and development efforts concentrate on one or two of these criteria. We are engaged in all of these issues, while building upon a common software framework.
In particular, our work coordinates and integrates several domains:
In building models of virtual humans, there are varying notions of virtual fidelity. Understandably, these are application dependent. For example, fidelity to human size, capabilities, and joint and strength limits is essential to some applications, such as design evaluation; whereas in games, training, and military simulations, temporal fidelity (real-time behavior) is essential. In our efforts we have attacked both.
Understanding that different applications require different sorts of virtual fidelity leads to the question: what makes a virtual human ``right''?
Unfortunately, the state of research in virtual humans is not so advanced that selecting the proper model is simply a matter of buying an off-the-shelf system. There are gradations of fidelity among the models: some are very advanced in a narrow area but lack other desirable features.
In a very general way, we can characterize the state of virtual human modeling along three dimensions:
The arrows and hash marks are qualitative indicators of where we think usable technology exists today. Although the arrows could actually extend an undetermined distance to the right, they nonetheless convey that we (and others) have proceeded far beyond the individual rendering of still frames as realized by traditional hand animation or even computer-assisted cartoon animation. When needed, increasingly accurate, medically-grounded human models can be invoked. We can create virtual humans with consistent shape, geometry, and limitations rather than arbitrary scaling and deformability. Virtual humans are also beginning to exhibit the early stages of intelligence: they make decisions in novel, changing environments rather than being forced into one-time movements fixed in the environment.
Virtual humans are different from simplified cartoon and game characters. What are the characteristics of this difference, and why are virtual humans more difficult to construct? After all, anyone who goes to the movies can see marvelous synthetic characters (toys, dinosaurs, etc.), but they have typically been created for one scene or one movie and are not meant to be re-used (except possibly by the animator -- and certainly not by the viewer). The difference lies in the interactivity and autonomy of virtual humans. What makes a virtual human human is not just a well-executed exterior design but movements, reactions, and decision-making that appear ``natural,'' appropriate, and contextually sensitive.
Since accurate human motion is difficult to synthesize, motion capture is a popular alternative, but one must recognize its limited adaptability and subject specificity. Although a complex motion may be used as performed, say in a CD-ROM game or as the source material for a (non-human) character animation, the motions may be best utilized if segmented into motion ``phrases'' that can be executed separately and connected via transitional (non-captured) motions with each other [Brud95,Rose96]. Several projects have used this technique to interleave ``correct'' human movements into simulations that control the order of the choices. While 2D game characters have been animated this way for years -- using pre-recorded or hand animated sequences for the source material -- recently the methods have graduated to 3D whole body controls suitable for 3D game characters, real-time avatars, and military simulations that include individual synthetic soldiers [Gran95].
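The motion-phrase idea above can be sketched in code. In this illustrative Python example (the clip data, function names, and the use of linear interpolation are all assumptions, not the cited systems' methods), captured motion is cut into reusable clips, and separately-executed clips are joined by short synthesized transitions:

```python
# Sketch of stitching motion "phrases": captured clips are concatenated,
# with synthesized transitional frames (here, simple linear interpolation)
# bridging the boundary between one phrase's last pose and the next
# phrase's first pose. Real systems use far more sophisticated blending.

def make_transition(end_pose, start_pose, frames=4):
    """Linearly interpolate joint values between two phrase boundaries."""
    return [
        [a + (b - a) * (t + 1) / (frames + 1)
         for a, b in zip(end_pose, start_pose)]
        for t in range(frames)
    ]

def play(phrases, order):
    """Concatenate the chosen phrases, inserting transitions between them."""
    out = list(phrases[order[0]])
    for name in order[1:]:
        nxt = phrases[name]
        out += make_transition(out[-1], nxt[0]) + list(nxt)
    return out

# Hypothetical one-joint "clips": each frame is a list of joint angles.
phrases = {
    "walk": [[0.0], [10.0], [20.0]],
    "reach": [[40.0], [50.0]],
}
motion = play(phrases, ["walk", "reach"])
```

Because the transitional frames are synthesized rather than captured, a controlling simulation is free to sequence the phrases in any order it chooses at run time.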
Providing a virtual human with human-like reactions and decision-making is more complicated than controlling its joint motions from captured or synthesized data. Here is where we engage the viewer with the character's personality and demonstrate its skill and intelligence in negotiating its environment, situation, and other agents. This level of performance requires significant investment in decision-making tools. We presently use a two level architecture:
The architecture is built on parallel transition networks (PaT-Nets) [Badl93b]: nodes represent executable processes; edges contain conditions which, when true, cause transitions to another node (process); and a combination of message passing and global memory provides coordination and synchronization across multiple parallel processes. Elsewhere we have shown how this architecture can be applied to the game of ``Hide and Seek'' [Badl95], to two-person animated conversation [Cass94], and to simulated emergency medical care [Chi95]. Currently we are using this architecture to study multi-agent activity scheduling for aircraft maintenance, to construct appropriate gestural responses from a synthetic agent, and to model visual attention during other task execution. A particularly interesting effort is underway to connect PaT-Nets to other high-level ``AI-like'' planning tools for improved cognitive performance of virtual humans.
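The PaT-Net scheme just described can be sketched minimally: each net is a finite-state machine whose nodes run small processes, whose edges hold transition conditions, and whose coordination happens through shared global memory (here a plain dict that also serves as a message board). The class, the scheduler, and the example nets below are illustrative assumptions, not the Jack implementation:

```python
# Minimal sketch of parallel transition networks (PaT-Nets): nodes are
# executable processes, edges carry conditions, and several nets run in
# interleaved parallel over a shared global memory.

class PaTNet:
    """One transition network; several are stepped in interleaved parallel."""

    def __init__(self, start, actions, edges):
        self.node = start
        self.actions = actions  # node name -> process: callable(memory)
        self.edges = edges      # node name -> [(condition(memory), next node)]

    def step(self, memory):
        """Run the current node's process, then take the first true edge."""
        if self.node is None:              # net has terminated
            return False
        self.actions[self.node](memory)
        for cond, nxt in self.edges.get(self.node, []):
            if cond(memory):
                self.node = nxt            # nxt may be None: terminate
                break
        return self.node is not None

def run_parallel(nets, memory, max_ticks=100):
    """Interleave the nets: every tick, each live net executes one step."""
    for _ in range(max_ticks):
        alive = [net.step(memory) for net in nets]  # step all, no short-circuit
        if not any(alive):
            break
    return memory

# Hypothetical example: a "walker" net steps an agent toward a goal and
# posts a message on arrival; a "greeter" net reacts to that message.
walker = PaTNet(
    "walk",
    actions={
        "walk": lambda m: m.__setitem__("pos", m["pos"] + 1),
        "arrive": lambda m: m["messages"].append("arrived"),
    },
    edges={
        "walk": [(lambda m: m["pos"] >= m["goal"], "arrive")],
        "arrive": [(lambda m: True, None)],
    },
)
greeter = PaTNet(
    "wait",
    actions={
        "wait": lambda m: None,
        "greet": lambda m: m.__setitem__("greeting", "hello"),
    },
    edges={
        "wait": [(lambda m: "arrived" in m["messages"], "greet")],
        "greet": [(lambda m: True, None)],
    },
)
memory = run_parallel([walker, greeter], {"pos": 0, "goal": 3, "messages": []})
```

The key property this toy preserves is that each net advances independently, reacting only to conditions it observes in the shared memory, so behaviors remain locally adaptive to a changing environment.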
The future holds great promise for the virtual humans who will populate our virtual worlds. They will provide economic benefits by helping designers early in the product design phases to produce more human-centered vehicles, equipment, assembly lines, manufacturing plants, and interactive systems. Virtual humans will enhance the presentation of information through training aids, virtual experiences, and even teaching and mentoring. And virtual humans will help save lives by providing surrogates for medical training, surgical planning, and remote telemedicine. They will be our avatars on the Internet and will portray ourselves to others, perhaps as we are or perhaps as we wish to be. They may help turn cyberspace into a real, or rather virtual, community.
This research is partially supported by DARPA DAMD17-94-J-4486; U.S. Air Force DEPTH through Hughes Missile Systems F33615-91-C-0001; U.S. Air Force through BBN F33615-91-D-0009/0008; U.S. Air Force DAAH04-95-1-0151; DMSO DAAH04-94-G-0402; ONR through Univ. of Houston K-5-55043/3916-1552793; ARO DURIP DAAH04-95-1-0023; DARPA through the Franklin Institute; Army AASERT DAAH04-94-G-0220; DARPA AASERT DAAH04-94-G-0362; NSF IRI95-04372; National Library of Medicine N01LM-43551; and JustSystem Japan.