GPU Audio

This is very interesting stuff, and I hope future DAWs support GPU compute.
I’ve been predicting it would happen for a long time, and I’m a bit surprised it’s not a mainstream reality yet.

7 Likes

I’m 99% sure the reason GPU audio isn’t widespread yet is high latency. It’s probably doable on modern hardware though.

I’ve considered the NVIDIA Jetson products, along with the JetPack SDK, for standalone electronic music products.

Price performance can’t be beat, for any compute intensive application.

I’ve been wondering whether the GPU Audio SDK could be shoehorned into a standalone, or if that is even necessary.

1 Like

The issue historically has been that GPU pipelines are built for 120 frames a second, not 48,000. A friend was looking at the AMD Ryzen APU architecture as a pretty likely option for this, but I’m guessing that as Nvidia has genericized their architecture for ML compute tasks, stuff on that side will become more viable too.

5 Likes

I don’t know anything about any of that, Dymaxion.

On using GPUs to generate audio :

There is an open source application, originating from IRCAM, called the ACIDS ( Artificial Creative Intelligence and Data Science ) Neurorack. It’s on GitHub, and is a Eurorack front end to the inexpensive NVIDIA Jetson Nano GPU.

ACIDS Neurorack
Neurorack interface on right, Jetson Nano on left.

The system uses PyTorch, an open source Python machine learning framework, which here runs on the 128 core Jetson Nano GPU. ( For reference the Nano is the baby of the family. The biggest Jetson GPU product has 2048 cores. )

A couple of good places to read about this are :

Go listen to the “Audio example” of “impacts” created by this system in the second link here.

This is not using the GPU Audio product specifically, that hasn’t been around long enough, but is an example of a system using a GPU based neural network system for audio processing.

6 Likes

That’s exactly the kind of genericized system I’m talking about. My friend would have been looking at this five or six years ago, when GPU hardware was still more narrowly optimized for video. That looks properly slick!

To be clear, said friend was looking at building very traditional audio pipelines in the digital mixer space, not generative/ML stuff at all, just using the GPU as a SIMD pipeline for a bunch of channels.

1 Like

Curious enough that I signed up for the beta. Before signing up, I saw the listed requirements were Windows 10 and an Nvidia 10-series or greater GPU, which is a fairly recent minimum requirement. My first thought, like @skinpop, was that latency was the issue, and that is becoming less and less of one in newer GPUs: Microsoft DirectStorage will see data being pushed into GPUs much faster even on the same old PCI data bus, and Nvidia and AMD (the two current main GPU makers) have both bought high speed interconnect specialists to help cut latency in supercomputing applications where hundreds of these cards need to swap data. But the 10-series isn’t benefiting from any of that; it’s 5 or 6 years old now.

My guess is that the challenge is two-fold:

  1. CPUs are pretty fast: I can spend the same money I did 5 years ago and get 5-10x the performance. So if it’s been a few years and you need more performance than your current PC can give you, it is out there.

  2. As @Dymaxion touched on, GPUs aren’t really designed for audio. They’re designed for matrix multiplication, and audio isn’t a matrix. We’re after a very high speed solution to a single problem (what is the value of the next sample in 1/48000 of a second?) and GPUs are more designed to find 1920*1080 values for the next samples in 1/60 of a second. Those are just two fundamentally different problems.
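To put rough numbers on the two workloads in point 2, here is a back-of-envelope sketch. It assumes a single audio channel at 48 kHz and a 1920x1080 frame at 60 Hz, as in the post; the function and variable names are made up for illustration.

```python
# Back-of-envelope comparison: per-sample audio deadline vs per-frame video deadline.
SAMPLE_RATE_HZ = 48_000          # audio sample rate from the post
FRAME_RATE_HZ = 60               # video frame rate from the post
PIXELS_PER_FRAME = 1920 * 1080   # values computed per video frame


def deadline_us(rate_hz):
    """Time available per unit of work, in microseconds."""
    return 1_000_000 / rate_hz


audio_deadline_us = deadline_us(SAMPLE_RATE_HZ)   # time to produce one sample
video_deadline_us = deadline_us(FRAME_RATE_HZ)    # time to produce one frame

# Throughput in values per second (one audio channel vs one video stream).
audio_values_per_second = SAMPLE_RATE_HZ
video_values_per_second = PIXELS_PER_FRAME * FRAME_RATE_HZ

print(f"audio: {audio_deadline_us:.2f} us per sample, "
      f"{audio_values_per_second} values/s")
print(f"video: {video_deadline_us:.2f} us per frame, "
      f"{video_values_per_second} values/s")
```

The video workload moves far more values per second, but it gets a deadline hundreds of times longer to do it in, which is the mismatch being described.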

Personally, my guess is that as fast as CPU power has grown the last few generations, GPUs have just grown so much faster that it’s finally worthwhile to figure out how to reconcile the differences in “skillset” between GPU and audio and unlock that performance. GPU compute has been a thing for over a decade at this point, so there could be maturity in other software I’m not aware of that’s making this more feasible now as well.

2 Likes

I’m not an audio guy but to me the most compelling use case for GPU audio would be to render real time raytraced 3D/spatial audio.

1 Like

Taking into account a buffer/vector size of, say, 256, that gives 48000/256 = 187.5 “frames”/sec, which is well within what a GPU could do. Not that I know anything about GPU audio.
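The same arithmetic for a few common buffer sizes, as a small sketch. The 48 kHz rate and the 256 case are from the post; the other buffer sizes and the `block_stats` helper are just illustrative choices.

```python
# Blocks per second and block duration for a given buffer size at 48 kHz.
SAMPLE_RATE_HZ = 48_000


def block_stats(buffer_size, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (blocks per second, duration of one block in milliseconds)."""
    blocks_per_second = sample_rate_hz / buffer_size
    block_duration_ms = 1000 * buffer_size / sample_rate_hz
    return blocks_per_second, block_duration_ms


for size in (64, 128, 256, 512):
    rate, ms = block_stats(size)
    print(f"{size:4d} samples: {rate:7.1f} blocks/s, {ms:5.2f} ms per block")
```

The block duration is also the minimum latency the buffer itself adds, before any transfer time to and from the GPU is counted.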

2 Likes

The math doesn’t work that way, because the way you treat the next sample will depend on the previous sample.
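A minimal sketch of the kind of dependency meant here, using a one-pole lowpass filter as a stand-in for any recursive effect. The filter choice, the coefficient, and the function names are assumptions for illustration, not anything taken from the GPU Audio product.

```python
# Each output sample depends on the previous output, so samples within one
# channel cannot simply be computed independently of each other. Separate
# channels have no such dependency between them.


def one_pole_lowpass(samples, coeff=0.5, state=0.0):
    """y[n] = coeff * x[n] + (1 - coeff) * y[n-1]; returns (outputs, final state)."""
    out = []
    for x in samples:
        state = coeff * x + (1.0 - coeff) * state  # needs the previous output
        out.append(state)
    return out, state


def process_block(channels, coeff=0.5):
    """Filter every channel; each channel is an independent unit of work."""
    return [one_pole_lowpass(ch, coeff)[0] for ch in channels]


# Processing one buffer in two halves only matches if the state is carried over.
whole, _ = one_pole_lowpass([1.0, 1.0, 1.0, 1.0])
first, carried = one_pole_lowpass([1.0, 1.0])
second, _ = one_pole_lowpass([1.0, 1.0], state=carried)
print(whole == first + second)
```

So buffering by itself does not remove the sample-to-sample chain inside a channel; the parallelism has to come from somewhere else, such as many channels or many independent voices.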

Isn’t the problem that today’s CPUs are more than powerful enough and cheaper than GPUs?

2 Likes

I had the same thought, as CPUs have really jumped in capability in the last 5 years.
ARM CPUs in particular are now very cost effective.

For GPU compute to take off, it would need a compelling reason. It’s not compelling to say “instead of being limited to running 100’s of plug-ins, you can now run 1000’s!”.

It’s like saying, “Hey, would you like 1 or 2 slices of pizza for lunch? How about 100 slices? How about 10,000 slices!”

There’s a point where it doesn’t matter :slight_smile:

Still I do believe that GPU compute could open new DSP possibilities. Just not sure how it will play out.

1 Like

Sounds like we got a pizza party ! Yeah !

The issue is always to match the problem to be solved to the architecture and the tools.

GPUs have been useful for general purpose computing, solving non-graphical problems for a very long time. They now are not limited to SIMD, and do fit well with streaming data that can be handled with parallel algorithms, and many other compute bound tasks as well.

( You can see lists of those sorts of things that have already been implemented on the “General-purpose computing on graphics processing units” Wikipedia page. Scroll down about halfway. )

Regular processors can handle those sorts of things too, with threads and/or task switching. So the pizza party machine only makes sense for problems that are more complicated and compute intensive.

So take for instance a 32 channel digital audio mixer. That’s inside what a normal processor can do well. But what if that 32 channel digital audio mixer was made to run in the frequency domain, with all the capabilities that would bring ? For that you need more horsepower, and parallel algorithms fit well.
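A toy sketch of what "mixing in the frequency domain" could look like, assuming a naive DFT on tiny blocks and a per-channel, per-bin gain (a crude EQ). All names here are hypothetical. A real mixer would use an FFT with overlap-add, since multiplying bins block by block like this amounts to circular convolution.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a real or complex block."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, keeping only the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def mix_in_frequency_domain(channels, bin_gains):
    """Sum equal-length channel blocks after applying per-bin gains to each."""
    n = len(channels[0])
    mixed = [0j] * n
    for block, gains in zip(channels, bin_gains):
        spectrum = dft(block)
        for k in range(n):
            # Every (channel, bin) product is independent of all the others.
            mixed[k] += gains[k] * spectrum[k]
    return idft(mixed)


channels = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
unity = [[1.0] * 4, [1.0] * 4]
print(mix_in_frequency_domain(channels, unity))
```

With unity gains the result matches a plain time-domain sum of the channels. The point is that the work per channel and per bin has no ordering constraint, which is the shape of problem that maps well onto many parallel lanes.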

=-=-=-=-==-=

The details on this GPU Audio product interface and tools need to be fully disclosed. I’m not joining their beta, just to find out.

2 Likes

Sonic State talks to Jonathan Rowden of GPU Audio at NAMM 2022 about the current status, and what they are doing to develop the use of GPU hardware for audio processing.

The video is mostly talk, with Jonathan explaining a lot of the specific applications, and how using a GPU for audio processing can improve on things that until now were being done on a standard CPU, or have been beyond the capability of such technology.

1 Like

Thanks for sharing. Once I have a home again (living out of hotels until I find something sweet in Chicago), I’m going to see how quickly I can get data in and out of my RTX 2060.

Music tech is almost the last business I’d ever want to be in, but some hobby projects here could be fun.

Fun Fact: vector machines were among the first supercomputers. The Cray-1 is a particularly iconic example. While I’d love to have a Cray-1 as living room furniture one day, it’s pretty cool to have a little-yet-vastly-more-powerful Cray sitting in my Linux box.

It’s also full circle: early supercomputers were primarily used for military research. One branch of that research was training simulations. Which became video games. Which are now the primary driver of GPU design (don’t mention the C-word).

Sure, GPUs are not limited to SIMD, but not using SIMD is throwing away anywhere from 90% to 99% of the work. SIMD and the whole pipeline that feeds those SIMD units is where GPUs get their performance from. People talk about GPU cores as if they are like simpler versions of CPU cores, but that’s not the case: a GPU “core” is basically a single lane on a SIMD unit.

1 Like

Back in the day, I had an account on the largest unclassified C90, with a whopping 16 GB of RAM in 1995, IIRC. It’s still amazing to me how fast stuff has evolved. I’m very curious to see when/if the GPUs in mobile SoCs, like the ARM boards folks like Mod Devices are building on, start becoming functional for audio work.

1 Like