Microsoft recently revealed VASA-1, an impressive generative AI model that can turn a single still photo into a believable talking video. That's fucking scary. Here's a simple explanation of how it works: VASA-1 takes a still image and generates a video in which the person's lips and facial expressions move in sync with an accompanying audio track. The model is trained on thousands of images capturing a wide range of facial expressions, which teaches it how different expressions correspond to various sounds and words in the audio. The animation covers not just the lips but also other subtle facial movements, and even head motions, that bring the video to life. Watch the demo here: https://twitter.com/minchoi/status/1780792793079632130?t=1gAF5ob7Gp6H_QCMNsIMlw&s=19 And share your views on the security implications, where this technology is headed, and how many animators might lose their jobs.
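To make the data flow concrete, here is a minimal sketch of what an image-plus-audio animation pipeline like this could look like. All function names and structures are hypothetical placeholders for illustration, not Microsoft's actual VASA-1 code or API; the real model works in a learned face latent space, which is mocked here with simple dictionaries.

```python
# Hypothetical sketch of a VASA-1-style pipeline (illustrative names only,
# NOT the real implementation): a fixed appearance code from the photo is
# combined with per-audio-frame motion codes to produce video frames.

def extract_appearance(image_path):
    """Encode the still photo into an identity/appearance code (mocked)."""
    return {"identity": hash(image_path) % 1000}

def audio_to_motion(audio_frames):
    """Map each audio frame to lip/expression/head-pose codes (mocked)."""
    return [{"lip": a, "head_pose": a * 0.1} for a in audio_frames]

def render_video(appearance, motion_codes):
    """Combine the fixed appearance with per-frame motion into frames."""
    return [{"frame": i, **appearance, **m}
            for i, m in enumerate(motion_codes)]

appearance = extract_appearance("photo.jpg")
motion = audio_to_motion([0.2, 0.5, 0.9])   # 3 mock audio frames
video = render_video(appearance, motion)
print(len(video))  # prints 3: one output frame per audio frame
```

The key idea the sketch captures is the separation of concerns: the person's appearance is extracted once from the single photo, while the audio alone drives the frame-by-frame motion, which is why one still image is enough.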