To me this sounds like perhaps the main issue. For me at least, immediacy is critical for making music. It’s great when you have everything set up just right and can start making sound within 0-20 seconds. It means you can get ideas and feelings down and evolving as soon as inspiration takes you, and it’s nice to be able to walk away for a meal or something, then come back and get sound again immediately by hitting play or pushing up the faders.
I think having a spatial ‘room scale’ element to your interaction with sound and music is really important too. In my experience, having controls physically separated and spaced out means you can more easily utilise your muscle memory and work intuitively. You can definitely get the same kind of experience with a minimal setup like a synth/sequencer or two and some effects on a table; you might just miss the feeling of ‘piloting the spaceship’ that you had with a room full of cool controls.
Another way to think about a bigger hardware-based setup is that you basically end up needing to dance with your whole body. In that sense you could think of your body as an integral part of what ‘computes’ the music, which is probably why the result sounds more groovy and creative.
The biggest issue with ITB is that the screen/Kb+M interface is a kind of bottleneck for your interaction with the system, and it’s not really that congruent with the natural abilities for dealing with the world that we evolved over hundreds of thousands of years. In general, the more we can get away from staring at a screen and moving icons around with keyboard + mouse, and instead interact with computers in a way that uses our spatial and haptic senses, the better. If you can break out control of the ITB stuff to hardware so that you don’t really need to look at the screen much, that helps a lot I think.
A bit off topic, but this is a really good articulation of what is wrong with modern human/computer interaction, and I highly recommend watching it if you’re interested in this kind of thing -