The data-driven approach subdivides the model into independently trained regions.
Computer animators struggle to make faces do the simplest natural motions, such as flashing a wink or a smirk. Researchers at Disney Research and Carnegie Mellon University’s Robotics Institute have created computerized models derived from actors’ faces that reflect a full range of natural expressions while also giving animators the ability to manipulate facial poses.
The researchers developed a method that not only translates the motions of actors into a 3D face model, but also subdivides it into facial regions that enable animators to intuitively create the poses they need. The work, to be presented today at Siggraph in Vancouver, envisions creation of a facial model that could be used to rapidly animate any number of characters for films, video games or exhibits.
“We can build a model that is driven by data, but can still be controlled in a local manner,” said J. Rafael Tena, a Disney research scientist, who developed the interactive face models based on principal component analysis (PCA) with Iain Matthews, senior research scientist at Disney, and Fernando De la Torre, associate research professor of robotics at Carnegie Mellon.
Previous data-driven approaches have resulted in models that capture motion across the face as a whole. Tena said these are of limited use for animators because attempts to alter one part of an expression—a cocked eye, for instance—can cause unwanted motions across the entire face. Attempts to simply divide these holistic models into pieces are less effective because the resulting model isn’t tailored to the motion of each piece.
As a result, Tena said, most facial animation still depends on “blendshape” models—a set of facial poses sculpted by artists based on static images. Given the wide range of human expressions, it can be difficult to predict all of the facial poses required in a film or video game, however. Many additional poses often must be created during the course of production.
By contrast, Tena, De la Torre and Matthews created their models by recording facial motion capture data from a professional actor as he performed sentences with emotional content, localized actions and random motions. To cover the whole face, 320 markers were applied to enable the camera to capture facial motions during the performances.
The data from the actor was then analyzed using a mathematical method that divided the face into regions, based in part on distances between points and in part on correlations between points that tend to move in concert with each other. These regional sub-models are independently trained but share boundaries. In this study, the result was a model with 13 distinct regions, but Tena said more regions would be possible by using performance capture techniques that can provide a dense reconstruction of the face, rather than the sparse samples produced by traditional motion capture equipment.
Future work will include developing models based on higher-resolution motion data and developing an interface that can be readily used by computer animators.