Rabban
I gave it a shot. Using the MusicGen large model I generated a bunch of desert trance clips/songs. The results are very interesting, and the words you use in the prompt can produce wildly different output.
I went for 3.5 minutes of output in each clip, which was perhaps a bit much; I’m interested to see how the Octatrack slices these up, and shorter clips might have been easier to work with. The large model peaked at around 9 GB of GPU memory, which really isn’t much compared to big LLMs or image/video generation.
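For anyone who wants to try the same thing, here’s a minimal sketch using Meta’s audiocraft library. The prompt text and filenames are just placeholders, and anything past the model’s native ~30-second window is produced by windowed continuation, so 3.5 minutes takes a while to render.

```python
# Minimal sketch: long-form clips with MusicGen large via audiocraft.
# Prompts and filenames are placeholders; tweak duration/params to taste.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained("facebook/musicgen-large")

# 210 s = 3.5 minutes; beyond ~30 s the model extends by windowed continuation.
model.set_generation_params(duration=210)

prompts = [
    "hypnotic desert trance, droning synths, hand percussion, slow build",
    "dusty psychedelic groove, tribal drums, modal lead line",
]

wavs = model.generate(prompts, progress=True)  # tensor: [batch, channels, samples]

for i, wav in enumerate(wavs):
    # Writes clip_0.wav, clip_1.wav, ... loudness-normalized.
    audio_write(f"clip_{i}", wav.cpu(), model.sample_rate, strategy="loudness")
```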
I see a big future for this type of thing and I’m surprised Ableton is not leading the way. Look at Adobe with Photoshop: you take a picture of a lake, draw a circle in the middle, say “add some ducks”, and it gives you a bunch of realistic options to paint in. The same thing should be seamless in Ableton, where you highlight an area, say “give me a syncopated drum fill here”, and it inserts a context-aware clip, i.e. not some random drum sample but one tailored to what’s going on in the timeline. Maybe eventually.