Google’s DeepMind 'V2A' AI technology can create soundtracks for videos based on both their pixels and your text prompts

[Image: Google DeepMind V2A. Image credit: Google DeepMind]

It’s one thing to have AI that can create videos for you, but what if you want them to have sound, too? Google’s DeepMind team now says that it has developed video-to-audio (V2A) technology that can generate soundtracks - music, sound effects and speech - using both the video’s pixels and text prompts.

This is the kind of news that might have soundtrack composers shuffling awkwardly in their seats - all the more so because, as well as working with automatic video generation services, V2A can also be applied to existing footage such as archive material and silent movies.

The text prompt aspect is interesting because, as well as being able to input ‘positive prompts’ that will guide the audio in the direction you want, you can also add ‘negative prompts’ which tell the AI to avoid certain things. This means that you can generate a potentially infinite number of different soundtracks for any one piece of video.

This clip was generated using the prompt "A drummer on a stage at a concert surrounded by flashing lights and a cheering crowd".

[Video: V2A Drums (YouTube)]

The system is also capable of creating audio using just video pixels, so no text prompts are required if you don’t want to use them.

Google DeepMind admits that V2A currently has some limitations - the quality of the audio depends on the quality of the video, and lip synchronisation when generating speech isn’t perfect - but says that it’s doing further research in a bid to address these.

Find out more and check out further examples on the Google DeepMind website.

Ben Rogerson
Deputy Editor

I’m the Deputy Editor of MusicRadar, having worked on the site since its launch in 2007. I previously spent eight years working on our sister magazine, Computer Music. I’ve been playing the piano, gigging in bands and failing to finish tracks at home for more than 30 years, 24 of which I’ve also spent writing about music and the ever-changing technology used to make it.