Alibaba's EMO: AI Brings Photos to Life as Singing, Talking Avatars
-
Alibaba researchers developed EMO, an AI video generator that turns a photo into an animated avatar that sings or talks in sync with an audio input.
-
EMO syncs lip movements to input songs or speech in different languages and works with source images in various styles, such as photos, paintings, and cartoons.
-
Theoretically, the audio input doesn't have to be a real recording; it could be computer-generated as well.
-
The output has some flaws, such as over-softened skin and unnatural mouth movements, but the accuracy of the lip syncing is impressive.
-
The research is published on GitHub and arXiv, with demo videos compiled by the RINKI YouTube channel.