Nvidia’s Omniverse Avatar Cloud Engine makes it easier to create realistic and interactive digital humans.
The metaverse is coming, and many are eagerly awaiting its arrival. According to Rev Lebaredian, vice president of Omniverse and simulation technology at Nvidia, three big technologies are needed to build the metaverse: neural graphics, an investment in Universal Scene Description (USD), and lifelike virtual avatars.
Nvidia is addressing the first requirement through cutting-edge R&D, presented at Siggraph 2022 in the areas of 3D content creation, animation, and AI experiences. Applying modern AI techniques and neural networks to traditional 3D computer graphics pipelines is essential for scaling content creation and simulation to the orders of magnitude needed to populate the metaverse, said Lebaredian. The second is being addressed by Nvidia and its partners, who are committed to a long-term vision of accelerating open-source development and widespread adoption of USD within and beyond the media and entertainment (M&E) industry (see "Going All In on Building the Metaverse With USD"). For the third element, Nvidia has announced its Omniverse Avatar Cloud Engine (ACE), a new cloud engine that makes it easier to build and customize lifelike digital humans.
ACE contains a suite of cloud-native AI models and tools that enable anyone to build and deploy interactive avatars tailored to their industry needs. ACE is graphics engine agnostic, meaning it can connect to virtually any engine, and it contains all the core technologies needed to create AI-driven avatars that can converse, perceive, and behave realistically inside the virtual world.
“Avatars are essential and necessary for us as we create virtual worlds that become indistinguishable from the real one,” said Lebaredian. “The metaverse without representations of real humans inside it, and without humanlike representations of our artificial intelligence inside it, will be a very dull and sad place.”
ACE is built on top of Nvidia's Unified Compute Framework, providing access to the tools and APIs for achieving realistic and fully interactive avatars. These include Riva for developing speech AI applications; Metropolis for computer vision and intelligent video analytics; Merlin for high-performance recommender systems; NeMo Megatron for large language models with natural language understanding; and Omniverse for AI-enabled animation.
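The article describes these services composing into a pipeline: speech comes in, a language model generates a response, and an animation system drives the avatar. As a minimal sketch of that flow, the following Python uses hypothetical stub functions (none of these are real Nvidia or ACE APIs; the function names, signatures, and return values are invented purely for illustration):

```python
# Illustrative sketch only. Each function is a hypothetical stand-in for the
# kind of cloud service named above (speech recognition like Riva, a language
# model like NeMo Megatron, animation like Omniverse). Not real Nvidia APIs.

def speech_to_text(audio: bytes) -> str:
    """Stand-in for a speech-recognition service (e.g., Riva)."""
    return "hello avatar"  # pretend transcription of the audio

def generate_reply(prompt: str) -> str:
    """Stand-in for a large language model (e.g., NeMo Megatron)."""
    return f"You said: {prompt}"

def animate(text: str) -> dict:
    """Stand-in for a text-driven avatar animation service (e.g., Omniverse)."""
    # A real service would emit facial animation; here we just count "visemes".
    return {"utterance": text, "visemes": len(text.split())}

def avatar_pipeline(audio: bytes) -> dict:
    """Chain the services: user speech in, animated spoken reply out."""
    transcript = speech_to_text(audio)
    reply = generate_reply(transcript)
    return animate(reply)

print(avatar_pipeline(b"..."))
```

The engine-agnostic design the article mentions corresponds to the last step: the pipeline's output could be consumed by any rendering engine, since nothing upstream depends on how the avatar is ultimately drawn.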
As Lebaredian noted, many may assume that the move from 2D to 3D is only 50 percent harder. In reality, it is hundreds or thousands of times harder, and 3D has an insatiable appetite for computing power. The solution, he added, is to move as much compute as possible into the cloud.