Cubic Motion makes more than faces

With funding and ambition, Cubic Motion is working to bring consumers face to face with performance capture.

Although Cubic Motion has stayed primarily in the background, leaving the spotlight to its partners, the company has defined performance capture for the game and movie industries since its founding in 2009. Recently, the company demonstrated its prowess with digitally driven character performances such as Siren, a realtime digital human. The technology developed by the company’s founders enables compelling performance capture, with an emphasis on facial animation. The company offers technology and services to track, solve, and animate characters using head-mounted cameras. In preparation for GDC this year, the company said its computer vision technology tracks more than 200 facial features at over 90 frames per second and automatically maps this data to high-quality digital characters in realtime.
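For readers who want to picture how such a track, solve, and animate pipeline hangs together, here is a minimal conceptual sketch in Python. It is not Cubic Motion’s code; the function names, the linear retargeting model, and the rig size are invented purely for illustration.

```python
# A minimal conceptual sketch (not Cubic Motion's pipeline) of a
# track -> solve -> animate loop driven by a head-mounted camera.
# All names, sizes, and the linear retargeting model are hypothetical.
import numpy as np

NUM_FEATURES = 200      # facial feature points tracked per frame (per the article)
NUM_RIG_CONTROLS = 60   # assumed size of the character's facial rig

rng = np.random.default_rng(0)
# Assumed: a learned linear map from tracked 2D features to rig control values.
retarget_matrix = 0.01 * rng.normal(size=(NUM_RIG_CONTROLS, NUM_FEATURES * 2))

def track(frame: np.ndarray) -> np.ndarray:
    """Stand-in for computer-vision tracking: returns (x, y) per feature."""
    return rng.random((NUM_FEATURES, 2))  # placeholder detections

def solve(features: np.ndarray) -> np.ndarray:
    """Map tracked features to facial-rig control values (the 'solve')."""
    return retarget_matrix @ features.ravel()

def animate(controls: np.ndarray) -> None:
    """Stand-in for driving a realtime character rig in an engine."""
    print(f"rig update: {controls[:3].round(3)} ...")

# Process a short synthetic stream (notionally at 90+ frames per second).
for _ in range(3):
    frame = rng.random((480, 640))   # placeholder head-mounted camera image
    animate(solve(track(frame)))
```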

The content creation industry is racing to an inflection point in performance capture, digital characters, and realtime digital performance. Cubic Motion is among those running the fastest with new investment and expansion plans.

Much of the technology used in facial animation today comes from original work on “Active Appearance Models” done by Cubic Motion founder Gareth Edwards at the University of Manchester with Professor Chris Taylor and Tim Cootes. It’s an early example of a machine learning approach for imaging, and it has formed the basis of work that has gone on to power several companies, multiple superheroes, and a couple of nice young women. The approach described in Edwards’ 1998 paper was used for the Kinect.
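Active Appearance Models combine a statistical shape model with an appearance model, both built with principal component analysis over training faces. The sketch below, using synthetic data, illustrates only the shape-model half of that idea: a mean face plus a handful of learned modes of variation that can encode and reconstruct a new set of landmarks. The landmark count and number of modes here are assumptions for illustration, not values from the paper.

```python
# A minimal sketch of the statistical shape model underlying Active
# Appearance Models: PCA over landmark shapes. Data is synthetic.
import numpy as np

rng = np.random.default_rng(1)
num_shapes, num_landmarks = 100, 68      # 68 landmarks is a common convention (assumption)
base_face = rng.random((num_landmarks, 2))

# Synthetic training set: base face plus small random deformations.
shapes = base_face.ravel() + 0.05 * rng.normal(size=(num_shapes, num_landmarks * 2))

# Build the model: mean shape plus principal modes of variation.
mean_shape = shapes.mean(axis=0)
_, _, modes = np.linalg.svd(shapes - mean_shape, full_matrices=False)
num_modes = 10                            # keep the strongest modes of variation

def encode(shape: np.ndarray) -> np.ndarray:
    """Project a landmark shape onto the model's shape parameters."""
    return modes[:num_modes] @ (shape.ravel() - mean_shape)

def decode(params: np.ndarray) -> np.ndarray:
    """Reconstruct a landmark shape from its parameters."""
    return (mean_shape + modes[:num_modes].T @ params).reshape(num_landmarks, 2)

new_shape = shapes[0].reshape(num_landmarks, 2)
print("reconstruction error:", np.abs(decode(encode(new_shape)) - new_shape).max())
```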

Edwards and his colleagues are enabling realtime performance capture. One of the most stunning examples was a centerpiece of the Game Developers Conference in 2016, where Epic, Cubic Motion, Ninja Theory, and 3Lateral together demonstrated realtime performance capture for Hellblade: Senua’s Sacrifice. The team proved the possibility of realtime character performances by showing Ninja Theory’s Melina Juergens driving her character’s performance live.

It was the culmination of significant advances in facial capture, rendering, and facial animation solving. Cubic Motion has developed and advanced the capabilities of markerless capture, enabling realistic performances. An earlier company, Image Metrics, founded by Gareth Edwards, Kevin Walker, and Alan Brett in 2000, demonstrated the power of its facial capture technology at Siggraph in 2008. The Image Metrics team created a composite model of actress Emily O’Brien.

Using the Light Stage, 3D scanning technology developed by Paul Debevec’s team at USC’s Institute for Creative Technologies (ICT), they created a stunningly realistic model of O’Brien’s face for their “Meet Emily” video presentation and integrated it into a staged interview. The video effectively predicted the coming end of the uncanny valley.

Image Metrics technology has been used in a variety of films and also to create a digital version of Richard Burton in his mid-30s playing the journalist George Herbert in Jeff Wayne’s The War of the Worlds. The company had hoped to productize its technology and enable consumer applications. That idea may have been a little ahead of its time, but Image Metrics is now established in the consumer facial capture market. Meanwhile, Edwards, Steven Dorning, Doug Tate, and Mike Jones founded Cubic Motion in 2009 to further explore performance capture and to continue their R&D into performance streaming. Image Metrics spun off its facial tracking and animation technology to Faceware in 2012.

Cubic Motion has continued to develop facial animation for professional markets including movies, commercials, and games. The technology developed by Edwards and his team has been used for a variety of games including Sony’s God of War, EA’s Battlefield, Activision’s Call of Duty, and Bethesda’s Wolfenstein. In addition, Netflix has used it in a TV show called “Kiss Me First.”

The group created the Meet Mike demo, shown at Siggraph in 2017 in conjunction with Epic and FXGuide to demonstrate the potential of virtual performances in VR. FXGuide co-founder Mike Seymour, captured as a virtual character, interviewed creatives and developers in the industry.

Most recently, Cubic Motion’s technology was used in the creation of the latest Spider-Man game for the PS4.

In September 2017, Cubic Motion received a £20m investment from private equity firm NorthEdge. At that time, the company said it planned to double its head-count of around 70 people and to establish offices in Southern California.

Phase next

At GDC 2018, Cubic Motion took its first step towards commercializing its technology by licensing a platform that customers can use to do their own facial animation work. The Siren demo, created by Cubic Motion, 3Lateral, Vicon, and Tencent, was used to demonstrate the capabilities of the technology, which was shown on the GDC show floor at the Vicon booth and has also appeared at FMX 2018.

The circle is coming around again, and the Cubic Motion team is ready. The company sees an opportunity for realtime face capture, performance capture, and interactive communication for consumers via a variety of clients.

If timing is everything, it could be argued that this is a much better time for a consumer-oriented approach to facial animation and face tracking. AI, machine learning, motion capture, and ray tracing have all seen accelerating advances. The average computer has considerable compute power and often a powerful GPU as well. These advances have changed the world significantly over just the last ten years. Add to that, the rise of social media has helped build a market for interactive conversation that didn’t really exist before. Facebook is rife with concert recordings and travelogs, so why not add interactive animation? China’s giant online gaming and chat company Tencent participated in the development of Siren because it is interested in offering interactive streaming characters. Another opportunity may come from the online gaming networks that have exploded over the past two years. People are becoming Twitch stars for nothing more than being clever while playing a game, and Twitch is expanding to accommodate multiple forms of interactive performance.

The first obvious opportunity is coming with VR. At Siggraph, Vicon’s Jeff Ovadya said, “if you want to do anything with live performance capture today, it’s got to be VR.” Vicon is working with companies such as Dreamscape VR on location-based VR attractions.

We might have to wait a bit before we get to be realtime rendered avatars chatting with friends online. Given the spectacular results being achieved with realtime capture, I asked Cubic Motion’s product manager Daniel Graham why we weren’t seeing much better lip syncing in games. He pointed out that console technology is five years old, which, yeah, I should have thought of. He said, “in a game, with multiple characters and limited rendering resources (e.g., on a 5-year-old console), you simply haven’t got the GPU to be able to render everything that you would like to. So compromises are: frame rate, optional features (more complex shaders, skin texture, sub-surface scattering, etc.), and level of detail—the actual models themselves can be reduced polygon versions, textures much smaller.” He also noted that, due to the limitations of the hardware, “the facial rig itself is simplified although it accepts the same inputs, so they may reduce the facial model in-game also.”
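To make Graham’s point concrete, here is a hedged sketch of the kind of budget-driven tradeoff he describes: as more characters compete for a fixed frame budget, an engine backs off polygon counts, texture sizes, and optional features such as sub-surface scattering. The thresholds, costs, and settings below are invented for illustration and are not taken from any actual engine.

```python
# Illustrative sketch of level-of-detail selection under a GPU frame budget.
# All thresholds and quality settings are made up for the example.
from dataclasses import dataclass

@dataclass
class FaceLodSettings:
    polygon_fraction: float       # fraction of the full-resolution facial mesh kept
    subsurface_scattering: bool   # expensive skin shading, dropped under load
    texture_resolution: int       # per-character face texture size in pixels

def choose_face_lod(frame_budget_ms: float, characters_on_screen: int) -> FaceLodSettings:
    per_face_budget = frame_budget_ms / max(characters_on_screen, 1)
    if per_face_budget > 4.0:      # plenty of headroom: full-quality digital human
        return FaceLodSettings(1.0, True, 4096)
    if per_face_budget > 1.5:      # crowded scene: simplified rig, smaller textures
        return FaceLodSettings(0.5, True, 2048)
    return FaceLodSettings(0.25, False, 1024)   # heavy load: drop optional features

# Example: a 33 ms frame (30 fps) with eight characters in view.
print(choose_face_lod(frame_budget_ms=33.3, characters_on_screen=8))
```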

Big changes are all but here, says Graham: “It’s worth pointing out that Siren, Meet Mike, etc., are rendered on a reasonable machine. By no means exceptional, but it does have two decent Nvidia graphics cards in it.” His point is that the hardware is evolving rapidly and the animation quality is scaling along with it.

Alexa Lee, playing Siren, “a digital human.”

“Personally, I’m looking forward to the next console generation. I think these consoles will be able to render a great deal more than the current console offerings. And the fidelity of characters, as far as we can determine from game production projects we are currently working on, is being pushed in this direction. So I’d expect a lot more realism and retention of the rendering details.”

This year at Siggraph 2018, Nvidia’s Jensen Huang introduced the company’s new Turing processor and declared this new generation of GPU ready for realtime interactive rendering. It doesn’t take long for new, high-end killer graphics to make their way into mid-range platforms, and Intel and AMD are both racing to keep up.

Realtime performance streaming is coming. At the moment, it’s not at all clear how it will actually be used by consumers, but we can expect to see quite a bit of it in commercials, movies, live presentations, and VR experiences. It will take time for the quality we’re seeing in demos to become a reality for consumers, and in that time the usage models will change.

So, in the meantime, think about how cool it would be to be in a 3D world and have a live chat with famous people, animated characters, and 3D rendered friends. There is a lot of computer power and money going into making this a reality.