NVIDIA Releases Digital Humans - Praxis

With NVIDIA ACE, we are closer to a future in which interacting with computers will be as natural as interacting with humans. But how safe will that be?


NVIDIA has made its ACE generative AI microservices publicly available. This suite of tools is intended to help developers create lifelike digital humans, and it is expected to revolutionise industries such as gaming, healthcare, and customer service. At the same time, the release has raised alarm in the technology industry, because Audio2Face technology could be misused by cyber criminals and other bad actors, particularly for “deepfake” attacks.

The ACE suite includes various technologies that enable the creation and animation of realistic digital humans. These technologies are now generally available for Cloud deployment and in early access for RTX AI PCs. Several companies are already integrating ACE into their operations, including Dell Technologies, ServiceNow, and Perfect World Games.

How it Works

NVIDIA Audio2Face is a generative AI-based technology that enables real-time facial animation driven solely by an audio source. It works in the following stages:

  • Audio Input: Audio2Face takes an audio file or a live audio stream as input. This can be a voice recording, music, or any other audio content.
  • AI Model: The audio input is analysed by an AI model, which is trained to recognise and interpret various audio features such as pitch, tone, and rhythm. This model is responsible for generating facial animations that match the audio input.
  • Facial Animation: The AI model generates facial animations by controlling various facial features such as the eyes, mouth, and facial muscles. This is achieved through a combination of AI-driven algorithms and real-time rendering using NVIDIA RTX technology.
  • Real-Time Rendering: Audio2Face uses real-time rendering to generate the facial animations. This allows for seamless and dynamic facial movements that are synchronised with the audio input.
  • Output: The final output is a real-time facial animation that accurately reflects the emotions and expressions conveyed in the audio input. This can be used in various applications such as video games, virtual reality, and animation production.
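The stages above can be illustrated with a deliberately simple sketch. It is not Audio2Face and does not use any NVIDIA API: where Audio2Face uses a trained AI model, this toy stand-in uses a hand-written rule that maps the loudness of each audio frame to a single "jaw open" value between 0 and 1. The function names, the frame size, and the smoothing factor are all assumptions made for illustration.

```python
import math

def frame_rms(samples, frame_size):
    """Split samples into fixed-size frames and return each frame's RMS energy."""
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    return [math.sqrt(sum(s * s for s in f) / len(f)) for f in frames if f]

def jaw_open_weights(samples, frame_size=160, smoothing=0.5):
    """Map per-frame loudness to a 0..1 'jaw open' weight, one value per frame.

    Hypothetical stand-in for the AI model: louder audio opens the mouth more.
    """
    energies = frame_rms(samples, frame_size)
    peak = max(energies, default=0.0)
    weights, previous = [], 0.0
    for energy in energies:
        target = energy / peak if peak > 0 else 0.0
        # Exponential smoothing avoids abrupt jumps between frames.
        previous = smoothing * previous + (1 - smoothing) * target
        weights.append(previous)
    return weights

# Synthetic input: 0.1 s of silence, then 0.1 s of a 220 Hz tone, sampled at 16 kHz.
rate = 16000
silence = [0.0] * 1600
tone = [math.sin(2 * math.pi * 220 * n / rate) for n in range(1600)]
weights = jaw_open_weights(silence + tone)
print(len(weights), round(weights[0], 3), round(weights[-1], 3))
```

The weight stays at zero during silence and rises towards one once the tone starts. A real system would drive many facial controls at once and would infer them from far richer audio features than loudness alone.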

The ACE microservices are designed to simplify creating, animating, and operating lifelike digital humans across customer service, telehealth, gaming, and entertainment. The technologies are expected to bring about a future of intent-driven computing, where interacting with computers is as natural as interacting with humans.

Small Language Models

Until recently, ACE was offered to developers only as NIM microservices that run in data centres. Now, NVIDIA is developing ACE PC NIM microservices for deployment across 100 million RTX AI PCs and laptops. This includes the company’s first small language model (SLM), Nemotron-3 4.5B, which has been purpose-built to run on-device with levels of precision and accuracy similar to those of large language models (LLMs) running in the Cloud. SLMs have significantly fewer parameters than LLMs, typically a few billion or fewer (about 4.5 billion in this case, as the model name indicates) rather than the tens or hundreds of billions found in the largest LLMs. This reduced size makes them feasible to deploy on resource-constrained devices such as laptops and smartphones.
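A quick back-of-the-envelope calculation shows why parameter count matters for on-device deployment. The sketch below assumes that "4.5B" means roughly 4.5 billion parameters and estimates the storage needed for the weights alone at several common numeric precisions. The precisions are illustrative; the article does not say which one NVIDIA uses, and real memory use is higher once activations and runtime overhead are included.

```python
def weights_size_gb(num_parameters, bits_per_parameter):
    """Approximate storage for model weights alone, in gigabytes (10^9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9

params = 4.5e9  # assumption: "4.5B" in the model name means ~4.5 billion parameters
for label, bits in [("FP32", 32), ("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: {weights_size_gb(params, bits):.2f} GB")
```

At 16 bits per parameter the weights come to about 9 GB, and at 4 bits to about 2.25 GB, which is within reach of the memory on a consumer GPU.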

The new NVIDIA AI Inference Manager software development kit simplifies the deployment of ACE to PCs, preconfiguring the PC with necessary AI models, engines, and dependencies while orchestrating AI inference seamlessly across PCs and the Cloud.
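The article does not describe how the AI Inference Manager decides where a request runs, so the following is only a hypothetical sketch of one plausible policy: run locally when the model fits in the device's memory with some headroom, and fall back to the Cloud otherwise. None of these names come from the NVIDIA SDK; `Backend`, `choose_backend`, and the headroom factor are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    available_memory_gb: float  # memory the backend can devote to the model

def choose_backend(model_size_gb, local, cloud, headroom=1.2):
    """Prefer the local device when the model fits with headroom, else use the cloud.

    Hypothetical policy, not the AI Inference Manager's actual logic.
    """
    if model_size_gb * headroom <= local.available_memory_gb:
        return local
    return cloud

local_gpu = Backend("local-rtx", available_memory_gb=8.0)
cloud = Backend("cloud", available_memory_gb=float("inf"))

print(choose_backend(2.25, local_gpu, cloud).name)  # small model: stays on the PC
print(choose_backend(9.0, local_gpu, cloud).name)   # larger model: goes to the cloud
```

A production orchestrator would likely weigh more than memory, for example latency, battery state, and network availability.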

Companies such as Aww Inc., Dell Technologies, Inventec, Perfect World Games, and ServiceNow are already using ACE technologies to enhance their offerings. For example, Aww Inc. plans to use ACE Audio2Face microservices for real-time animation that allows highly interactive communication with its users. Perfect World Games is adopting ACE in its new mythological wilderness tech demo, Legends, in which players can interact with a realistic, multilingual AI NPC in both English and Mandarin.

Deepfake Fears

These advances in generative AI are expected to shape the future of digital humans, and they bring new risks along with the benefits.

While Audio2Face itself is a legitimate technology with many beneficial applications, it’s important to be aware of potential misuse by bad actors. Safeguards like digital watermarking and authentication tools are being developed to help detect deepfakes, but the technology is advancing rapidly. Individuals and organisations need to be vigilant about the authenticity of audio and video content, especially if it appears to be from a trusted source. Maintaining a healthy scepticism, being cautious about sharing personal information, and staying informed about the latest deepfake threats are key to protecting against this emerging risk.
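To make the idea of an authentication tool concrete, here is a minimal sketch of one simple form of it: a publisher attaches a keyed tag (an HMAC) to a media file, and a recipient who holds the same key checks that the file has not been altered. This is not watermarking, it does not detect deepfakes by itself, and it is not something the article attributes to NVIDIA. The key and the byte strings are placeholders, and real provenance systems typically use public-key signatures so that the verifier never needs the secret.

```python
import hashlib
import hmac

def sign_media(media_bytes, secret_key):
    """Return a hex HMAC-SHA256 tag binding the media bytes to the secret key."""
    return hmac.new(secret_key, media_bytes, hashlib.sha256).hexdigest()

def is_authentic(media_bytes, signature, secret_key):
    """Check the tag in constant time; any change to the bytes makes it fail."""
    expected = sign_media(media_bytes, secret_key)
    return hmac.compare_digest(expected, signature)

key = b"shared-secret-for-demo-only"      # placeholder key, for illustration only
original = b"\x00\x01example-video-bytes"  # stands in for a real media file
tag = sign_media(original, key)

print(is_authentic(original, tag, key))
print(is_authentic(original + b"tampered", tag, key))
```

The first check succeeds and the second fails, because even a small change to the content produces a different tag.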


Reference: NVIDIA Newsroom

