Riffusion is a free web app that uses AI to create music from your text prompts
If you can type it, the robot can play it
You may well be aware of Stable Diffusion, the much-discussed open-source AI model that can generate images from text. Well, as a “hobby project”, a couple of developers - Seth Forsgren and Hayk Martiros - have now created Riffusion, which uses the same model to turn text into music.
Riffusion works by generating images of spectrograms, which are visual representations of a sound's frequency content over time, and then converting those images into audio clips. We’re told that it can generate infinite variations of a text prompt by varying the ‘seed’.
Riffusion’s creators explain that a spectrogram can be computed from audio using what’s known as the Short-time Fourier transform (STFT), which approximates the audio as a combination of sine waves of varying amplitudes and phases.
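For the curious, the idea can be sketched in a few lines of Python. This is a minimal illustration using NumPy, not Riffusion's own code: the `stft` helper, the 512-sample window, the 128-sample hop and the 8kHz test tone are all assumptions chosen to keep the example small.

```python
import numpy as np

def stft(signal, n_fft=512, hop=128):
    """Short-time Fourier transform: one complex spectrum per windowed frame."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Each row holds the sine-wave components of one frame.
    return np.fft.rfft(frames, axis=1)

# Hypothetical test signal: one second of a 440Hz tone sampled at 8kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spectrum = stft(tone)
amplitude = np.abs(spectrum)   # what a spectrogram image shows
phase = np.angle(spectrum)     # what a spectrogram image leaves out

# The loudest frequency bin should sit close to 440Hz.
peak_bin = int(amplitude.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 512
```

The `amplitude` array is the part that can be drawn as a picture, with time along one axis and frequency along the other.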
However, in the case of Riffusion, the STFT is inverted so that the audio can be reconstructed from a spectrogram. Here, the images from the AI model only contain the amplitude of the sine waves and not the phases - these are approximated by something called the Griffin-Lim algorithm when reconstructing the audio clip.
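The phase-guessing step can be sketched as follows. This is a bare-bones take on the Griffin-Lim idea in NumPy, assuming the same illustrative window and hop sizes as any small demo might use; the helper names (`stft`, `istft`, `griffin_lim`) and all parameter values are hypothetical and are not taken from Riffusion itself.

```python
import numpy as np

N_FFT, HOP = 512, 128          # illustrative sizes, not Riffusion's settings
WINDOW = np.hanning(N_FFT)

def stft(x):
    """Forward transform: windowed frames -> complex spectra."""
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP : i * HOP + N_FFT] * WINDOW
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec):
    """Inverse transform by windowed overlap-add."""
    n_frames = spec.shape[0]
    length = N_FFT + HOP * (n_frames - 1)
    frames = np.fft.irfft(spec, n=N_FFT, axis=1) * WINDOW
    out = np.zeros(length)
    norm = np.zeros(length)
    for i in range(n_frames):
        out[i * HOP : i * HOP + N_FFT] += frames[i]
        norm[i * HOP : i * HOP + N_FFT] += WINDOW ** 2
    # Normalise by the summed squared window; leave barely covered edge
    # samples un-normalised to avoid dividing by values close to zero.
    return out / np.where(norm > 0.1, norm, 1.0)

def griffin_lim(amplitude, n_iter=32, seed=0):
    """Estimate audio from amplitudes alone by iteratively refining the phase."""
    rng = np.random.default_rng(seed)
    # Start from random phases, since the spectrogram image has none.
    phase = np.exp(2j * np.pi * rng.random(amplitude.shape))
    for _ in range(n_iter):
        audio = istft(amplitude * phase)
        # Keep the known amplitudes, adopt the phases of the re-analysed audio.
        phase = np.exp(1j * np.angle(stft(audio)))
    return istft(amplitude * phase)

# Hypothetical demo: discard the phase of a 440Hz tone, then rebuild the audio.
sample_rate = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
amplitude = np.abs(stft(tone))
rebuilt = griffin_lim(amplitude)
```

Each pass nudges the guessed phases towards a set that is consistent with the known amplitudes, which is why the result is an approximation rather than an exact copy of the original audio.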
As well as short loops, Riffusion is also capable of creating longer jams, which are based on subtle variations of one image.
The web app lets you type in prompts and will keep generating interpolated content in real time for as long as you let it, while giving you a 3D visual representation of the spectrogram. You can also skip immediately to the next prompt; if there isn’t one, Riffusion will interpolate between different seeds of the same prompt.
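One common way to move smoothly between two seeds in diffusion models is spherical linear interpolation ("slerp") between the random noise each seed produces. The sketch below shows that general technique under stated assumptions: the `slerp` and `noise_for_seed` helpers, the vector size and the seed values are hypothetical, and the article does not confirm that Riffusion does exactly this.

```python
import numpy as np

def slerp(a, b, t):
    """Spherical linear interpolation between vectors a and b, with t in [0, 1]."""
    a_unit = a / np.linalg.norm(a)
    b_unit = b / np.linalg.norm(b)
    # Angle between the two vectors.
    omega = np.arccos(np.clip(np.dot(a_unit, b_unit), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Nearly parallel vectors: plain linear interpolation is fine.
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

def noise_for_seed(seed, size=1024):
    """Hypothetical stand-in for the starting noise a given seed produces."""
    return np.random.default_rng(seed).standard_normal(size)

start = noise_for_seed(1)
end = noise_for_seed(2)

# Five evenly spaced steps from one seed's noise to the other's.
steps = [slerp(start, end, t) for t in np.linspace(0.0, 1.0, 5)]
```

In a real system each intermediate vector would be fed to the image model to produce one spectrogram in the sequence, so neighbouring clips differ only slightly.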
We can’t pretend to understand exactly how it all works but Riffusion is impressive and terrifying in equal measure. This kind of technology is in its infancy but it’s not hard to imagine how capable it will become in the future.
See and hear for yourself on the Riffusion website.
I’m the Deputy Editor of MusicRadar, having worked on the site since its launch in 2007. I previously spent eight years working on our sister magazine, Computer Music. I’ve been playing the piano, gigging in bands and failing to finish tracks at home for more than 30 years, 24 of which I’ve also spent writing about music and the ever-changing technology used to make it.