I’m not convinced you can’t already do that.
AIs might make it happen faster, and/or let you request imaginary signal flows that are currently out of your reach.
But… the thing is… at the moment, these models are trained on existing music. So, at best, they’ll give you the same as what you can get now by working with existing gear and people. Not “copies of” but “inspired by”. Humans are still, for now, better at making up genuinely new, useful stuff.
I think it’ll get more powerful when AIs are able to write novel algorithms and engines in realtime in response to suggestions, rather than generating “inspired by” sounds carved out of white noise like they do today. We’re a long way off doing unmonitored full application development. Most of the code AI generates now needs manual reshaping to be useful.
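(To make the “carved out of white noise” bit concrete: most of today’s audio generators are diffusion models. They start from pure random noise and repeatedly denoise it until it lands near their training data, which is exactly why the output sounds “inspired by” what already exists. A toy sketch of that sampling loop, with a made-up stand-in for the trained denoiser, so don’t read it as any real system’s code:)

```python
import numpy as np

rng = np.random.default_rng(0)
STEPS = 50

def toy_denoiser(x, t):
    # Hypothetical stand-in for a trained neural net. A real one has
    # learned to predict the noise in x from existing music, which is
    # why its output drifts toward the training distribution.
    return x * (t / STEPS)  # fake "noise estimate", illustration only

x = rng.standard_normal(44100)  # 1 second of pure white noise at 44.1 kHz

for t in range(STEPS, 0, -1):
    x = x - toy_denoiser(x, t) / STEPS  # nudge the sample toward "music"
    if t > 1:
        x += 0.01 * rng.standard_normal(x.shape)  # small re-noise, DDPM-style

# x is now "generated audio": noise progressively reshaped toward
# whatever the model was trained on -- inspired by, not invented.
```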
I mean… I’ve heard a few full song generators that are amazing in terms of audio quality, but… they sound like people doing stuff you’ve heard before. And I’ve heard stem generators that are basically churning out the equivalent of sample CDs. I don’t research this too deeply because the whole field scares me (what do I do with my gear once it’s redundant?), so feel free to post better suggestions.