There's also cuda-gdb[1], NVIDIA's first-party GDB for CUDA. I've found it to be pretty good. Since CUDA uses a threading model, it maps well onto GDB's thread ergonomics (though IIRC you can only single-step at warp granularity, by the nature of SM execution).
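To make the warp-granularity point concrete, here's a rough sketch in plain Python (not CUDA, and not anything cuda-gdb itself exposes) of how a thread's index within a block maps to a warp and a lane. It assumes a warp size of 32 and the usual x-fastest linearization of thread indices; the function names are made up for illustration.

```python
WARP_SIZE = 32  # assumption: warp size on current NVIDIA GPUs


def linear_thread_id(tx, ty, tz, block_dim):
    """Flatten a 3D thread index into a linear id within its block (x varies fastest)."""
    bx, by, _bz = block_dim
    return tx + ty * bx + tz * bx * by


def warp_and_lane(tx, ty, tz, block_dim):
    """Return (warp index within the block, lane index within the warp)."""
    tid = linear_thread_id(tx, ty, tz, block_dim)
    return tid // WARP_SIZE, tid % WARP_SIZE


def threads_stepped_together(tx, ty, tz, block_dim):
    """Linear ids of all threads in the same warp as the given thread.

    If stepping happens per warp, these are the threads that advance together.
    """
    bx, by, bz = block_dim
    total = bx * by * bz
    warp, _lane = warp_and_lane(tx, ty, tz, block_dim)
    start = warp * WARP_SIZE
    # The last warp of a block may be only partially filled.
    return list(range(start, min(start + WARP_SIZE, total)))


if __name__ == "__main__":
    print(warp_and_lane(33, 0, 0, (64, 1, 1)))
    print(threads_stepped_together(33, 0, 0, (64, 1, 1)))
```

Under those assumptions, focusing on thread (33, 0, 0) of a 64-wide block and stepping would also advance the other threads with linear ids 32 through 63, since they share its warp.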
Tangent: is anyone using a 7900 XTX for local inference/diffusion? I finally installed Linux on my gaming PC, and about 95% of the time it just sits there powered off, collecting dust. I would love to put this card to work in some capacity.
For NVIDIA cards, you can use Nsight. There's also RenderDoc, which works on a wide range of GPUs.
Is there not an official tool from AMD?
GDB supports it: https://sourceware.org/gdb/current/onlinedocs/gdb.html/AMD-G...
You also get UMR from AMD: https://gitlab.freedesktop.org/tomstdenis/umr
There are also a bunch of other tools provided: https://gpuopen.com/radeon-gpu-detective/ https://gpuopen.com/news/introducing-radeon-developer-tool-s...
> After searching for solutions, I came across rocgdb, a debugger for AMD’s ROCm environment.
It's like the third sentence in the blog post...
To be fair, it wasn't clear that it was an official AMD debugger, and besides, it's only for debugging ROCm applications.
To be fair, typing two words into any search engine to verify would be better than asking in a comment on HN.