AI Disruption

ByteDance Releases OmniHuman: Generate Videos from a Single Image and Audio

OmniHuman by ByteDance revolutionizes portrait video generation with multimodal AI. Create vivid, high-quality videos from a single image and audio.

Meng Li
Feb 05, 2025

Do you remember Loopy, the audio-driven portrait animation technology that sparked heated discussion on X half a year ago?

An upgraded solution has arrived. The ByteDance Digital Human team has launched OmniHuman, a new multimodal digital human framework. It generates a video from a single image, of any size and with the person filling any proportion of the frame, combined with an input audio clip. The resulting video is vivid and looks highly natural.

For example, given the image below:

Image

The characters generated by OmniHuman move naturally in the video:
