Digital Domain’s Digital Humans Group introduces the latest developments in real-time autonomous digital humans.
Zoey is not really who you think she is. That is, if you believe her to be an actual human. And why wouldn’t you? She looks real. She engages in natural conversation. She can even cop a bit of an attitude at times.
Recently introduced at FMX 2022, Zoey is the latest iteration of the R&D into advanced real-time autonomous digital humans by Digital Domain’s Digital Humans Group (DHG). She combines many advanced technologies: artificial intelligence, machine learning, complex facial animation, text-to-speech tech, real-time rendering, and more.
“Zoey takes the concepts of virtual assistants like Alexa and Siri several steps further, creating a digital helper you can truly interact with,” says Daniel Seah, global CEO of Digital Domain. Zoey can engage in online face-to-face conversations and carry on discussions with multiple people at a time. She can answer questions accurately and without lag, and she can learn and remember details about the people she speaks with from previous encounters. She can also adjust her mood and the tone of her conversation.
A few years ago, Digital Domain formed the Digital Humans Group, initially headed by two of the studio’s veterans, Darren Hendler and Doug Roble, to oversee the studio’s work on digital humans and CG creatures.
Last year, DHG introduced Douglas, a proof of concept. Like Zoey, who is based on actress Zoey Moses (Reflection, Yellowstone), Douglas is based on a real person: Roble, one of his makers. DHG has since built on that work to create Zoey, while continuing to develop and refine Douglas in parallel.
Creating model citizens
Both the Douglas and Zoey models were built in the classical way, which entailed high-res facial and body scanning. The group uses the in-house Charlatan facial animation tool to create a flexible digital face that can react in real time, based on the captured footage. A neural renderer is later applied to add more realism to the model before it appears in the application.
Many of the tech updates were made in the areas of language and voice. DHG employed AI text-to-speech technology from WellSaid Labs, which gave Zoey a vast vocabulary. The tone she takes can be controlled with DHG’s AI, which passes the information along to the voice system. In contrast, Douglas’ original voice, generated in-house at Digital Domain, was one-dimensional. “We wanted to get Douglas’ voice from Roble himself, but the result was a little more synthetic and not as naturalistic as it is from WellSaid,” says Matthias Wittmann, visual effects supervisor at Digital Domain within the Digital Humans Group.
Three chatbots simultaneously generate three different types of answers for the character (conversational, scripted, and informational), and AI selects the most appropriate one.
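To make the idea concrete, here is a minimal Python sketch of that pattern: three responders run at the same time, each returns a candidate answer, and a selector keeps the best one. It is an illustration only, not DHG's implementation. The bot functions, the canned replies, and the numeric confidence scores are all assumptions invented for the example; the article does not say how DHG's AI actually ranks the answers.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Candidate:
    kind: str     # "conversational", "scripted", or "informational"
    text: str
    score: float  # hypothetical confidence in [0, 1]; an assumption of this sketch


# Hypothetical stand-ins for the three chatbots. Real systems would call
# language models or knowledge bases here.
def conversational_bot(utterance: str) -> Candidate:
    # Always has something to say, but with modest confidence.
    return Candidate("conversational", "That's interesting, tell me more.", 0.4)


SCRIPT = {"who are you?": "I'm a digital human."}  # placeholder scripted lines


def scripted_bot(utterance: str) -> Candidate:
    # High confidence only on an exact match with a scripted prompt.
    reply = SCRIPT.get(utterance.strip().lower())
    return Candidate("scripted", reply or "", 0.9 if reply else 0.0)


FACTS = {"unreal": "Unreal Engine is a real-time 3D engine made by Epic Games."}


def informational_bot(utterance: str) -> Candidate:
    # Keyword lookup standing in for a real question-answering system.
    for key, fact in FACTS.items():
        if key in utterance.lower():
            return Candidate("informational", fact, 0.7)
    return Candidate("informational", "", 0.0)


BOTS = [conversational_bot, scripted_bot, informational_bot]


def respond(utterance: str) -> Candidate:
    # Run all three bots concurrently, then keep the highest-scoring answer.
    with ThreadPoolExecutor(max_workers=len(BOTS)) as pool:
        candidates = list(pool.map(lambda bot: bot(utterance), BOTS))
    return max(candidates, key=lambda c: c.score)


if __name__ == "__main__":
    for line in ["Who are you?", "What is Unreal?", "Nice weather today"]:
        best = respond(line)
        print(f"{line!r} -> [{best.kind}] {best.text}")
```

In this sketch the selector is a simple maximum over self-reported scores. A production system would more plausibly use a separate model to judge the candidates against the conversation so far.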
Enabling realistic human characters to respond appropriately to a person through words and actions in real time is truly amazing, but a lot has to occur behind the curtain first—all in the blink of an eye. Once all the individual AI, machine learning, and other systems complete their processes, everything is then packaged up in Epic’s Unreal Engine and neural rendered. In the coming months, DHG is looking into making the application more scalable—currently it involves a very high-end system with multiple computers and AIBs.
By the end of the year, Zoey likely will be available for licensing from Digital Domain.