• vblanco 2 hours ago

    A truly incredible profiler for the great price of free. Nothing else comes close to this level of features and performance, even among paid software. Tracy could cost thousands of dollars a year and would still be the best profiler.

    Tracy requires you to add macros to your codebase to log functions/scopes, so it's not an automatic sampling profiler like Superluminal, Very Sleepy, the VS profiler, or others. Each of those macros has around 50 nanoseconds of overhead, so you can liberally use them in the millions. In the UI, it has a stats window that records the average, deviation, and min/max of those profiler zones, which can be used to profile functions at the level of single nanoseconds.
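
    For anyone who hasn't used an instrumentation profiler, here is a rough sketch of the idea of a scoped zone plus per-zone statistics. It is not Tracy's implementation: `ZoneStats`, `ScopedZone`, `SKETCH_ZONE`, and `sum_to` are hypothetical names made up for illustration, and it aggregates in-process with `std::chrono::steady_clock` for simplicity. With Tracy itself you would include `tracy/Tracy.hpp` and put `ZoneScoped;` at the top of the scope instead.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <chrono>
    #include <cmath>
    #include <cstdint>
    #include <limits>

    // Running statistics for one zone, updated with Welford's algorithm.
    // Hypothetical stand-in for the numbers a stats window shows; not Tracy code.
    struct ZoneStats {
        std::uint64_t count = 0;
        double mean_ns = 0.0;
        double m2 = 0.0;  // running sum of squared deviations from the mean
        double min_ns = std::numeric_limits<double>::infinity();
        double max_ns = 0.0;

        void add(double ns) {
            assert(ns >= 0.0);  // durations from a monotonic clock are never negative
            ++count;
            const double delta = ns - mean_ns;
            mean_ns += delta / static_cast<double>(count);
            m2 += delta * (ns - mean_ns);
            min_ns = std::min(min_ns, ns);
            max_ns = std::max(max_ns, ns);
        }

        // Sample standard deviation; 0 until there are at least two samples.
        double stddev_ns() const {
            return count > 1 ? std::sqrt(m2 / static_cast<double>(count - 1)) : 0.0;
        }
    };

    // RAII guard: timestamps on construction, records the elapsed time on scope exit.
    class ScopedZone {
        using Clock = std::chrono::steady_clock;

    public:
        explicit ScopedZone(ZoneStats& stats) : stats_(stats), start_(Clock::now()) {}
        ~ScopedZone() {
            const std::chrono::duration<double, std::nano> elapsed = Clock::now() - start_;
            stats_.add(elapsed.count());
        }
        ScopedZone(const ScopedZone&) = delete;
        ScopedZone& operator=(const ScopedZone&) = delete;

    private:
        ZoneStats& stats_;
        Clock::time_point start_;
    };

    // Hypothetical macro, named so it cannot clash with Tracy's real ones.
    #define SKETCH_ZONE(stats) ScopedZone sketch_zone_guard_(stats)

    // Example instrumented function: sums 1..n and records one zone sample per call.
    std::uint64_t sum_to(std::uint64_t n, ZoneStats& stats) {
        SKETCH_ZONE(stats);
        std::uint64_t total = 0;
        for (std::uint64_t i = 1; i <= n; ++i) total += i;
        return total;
    }
    ```

    Call `sum_to` a few thousand times with the same `ZoneStats` and read `mean_ns`, `stddev_ns()`, `min_ns`, and `max_ns` afterwards. Averaging over many calls is what lets you compare functions that differ by only a few nanoseconds, even though a single sample is noisier than that.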

    It's the main thing I use for all my profiling and optimization work. I combine it with Superluminal (a sampling profiler) to get a high-level overview of the program, then I put Tracy zones in the important places to get the detailed information.

    • eagle2com 18 minutes ago

      Doesn't Tracy have the capability to do sampling as well? I remember using it at some point, even if it was finicky to set up because of Windows.

      • Flex247A 2 hours ago

        Hello! Going through your tutorial and it's been a great ride!

        Thanks for the good work.

      • Flex247A 2 hours ago

        I am a beginner in graphics programming, and I came across this amazing frame profiler.

        Web demo of Tracy: https://tracy.nereid.pl/

        This blows my mind. I never expected a WebAssembly application to be this fast and responsive!

        • cwbaker400 an hour ago

          Tracy is brilliant. @wolfpld I hope you're enjoying reading this and all of the other great comments in this thread. Great work and thank you very very much!

          • drpossum an hour ago

            Can someone explain how this achieves nanosecond resolution? That's an extremely difficult target to reach on computing hardware due to inherent clock resolutions and interrupt timing.

            • simonask 29 minutes ago

              There are several sources of timing information, and I think in this context "nanosecond precision" just means that Tracy is able to accurately represent and handle input in nanoseconds.

              The resolution of the actual measurements depends on the kind of measurement:

              1. If the measurement is based on high resolution timers on the CPU, the resolution depends on the hardware and the OS. On Windows, `QueryPerformanceFrequency()` returns the counter frequency in ticks per second, whose reciprocal is the resolution, and I believe it is often on the order of 10s or 100s of nanoseconds.

              2. If the measurement is based on GPU-side performance counters, it depends on the driver and the hardware. Graphics APIs allow you to query the "time-per-tick" value to translate from performance counters to nanoseconds. Performance counters can be down to "number of instructions executed", and since a single instruction can be on the order of 1-2 nanoseconds in some cases, translating a performance counter value to a time period requires nanosecond precision.

              3. Modern GPUs also include their own high-precision timers for profiling things that are not necessarily easy to capture with performance counters (like barriers, contention, and complicated cache interactions).
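
              To make the tick arithmetic in the list above concrete, here is a minimal sketch of converting raw counter values to nanoseconds. The function names are hypothetical, the 10 MHz figure in the comments is just a commonly seen value for the Windows performance counter, and the GPU case assumes the API hands you a "nanoseconds per tick" factor (Vulkan's `timestampPeriod` is one example).

              ```cpp
              #include <cassert>
              #include <chrono>
              #include <cstdint>

              // Nanoseconds per tick for a counter running at frequency_hz.
              // For example, a 10 MHz counter gives 100 ns per tick.
              inline double ns_per_tick(std::uint64_t frequency_hz) {
                  assert(frequency_hz != 0);
                  return 1e9 / static_cast<double>(frequency_hz);
              }

              // Convert a tick delta to whole nanoseconds using integer math.
              // Splitting into seconds and remainder avoids overflowing ticks * 1e9;
              // assumes frequency_hz is below roughly 18 GHz.
              inline std::uint64_t ticks_to_ns(std::uint64_t ticks, std::uint64_t frequency_hz) {
                  assert(frequency_hz != 0);
                  const std::uint64_t whole_seconds = ticks / frequency_hz;
                  const std::uint64_t remainder = ticks % frequency_hz;
                  return whole_seconds * 1000000000ULL + remainder * 1000000000ULL / frequency_hz;
              }

              // GPU-style conversion: the driver reports nanoseconds per tick directly.
              inline double gpu_ticks_to_ns(std::uint64_t ticks, double period_ns) {
                  return static_cast<double>(ticks) * period_ns;
              }

              // Nominal tick of std::chrono::steady_clock in nanoseconds. This is the
              // unit the clock reports in, not a guarantee of its real resolution.
              inline double steady_clock_tick_ns() {
                  using Period = std::chrono::steady_clock::period;
                  return 1e9 * static_cast<double>(Period::num) / static_cast<double>(Period::den);
              }
              ```

              The last function shows the distinction from the first paragraph: a clock can report values in nanosecond units while the hardware counter behind it only advances every 10s or 100s of nanoseconds.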

              • drpossum 18 minutes ago

                Yes, that's my understanding and why I asked. I disagree about "in this context", though, since the context here is a pitch. If I were going to buy hardware that claimed ns resolution for something I was building, I would expect 1ns resolution, not "something around a few ns" and not a qualified "only on particular hardware". If such a product were presenting itself in a straightforward way, to be compared to similar products and respecting the potential user, it would say "resolutions down to a few ns" or something more specific but accurate.

                There was even a discussion not long ago on how to market to technical folks and what not to do (this is one of the things not to do):

                https://www.bly.com/Pages/documents/STIKFS.html

                https://news.ycombinator.com/item?id=41368583

              • Galanwe 16 minutes ago

                It reaches nanosecond resolution only in the sense that its sampling profiler can report nanosecond resolution. I've tried the event profiler for microsecond-sensitive projects though, and it blows up the timings and latency even at low event frequency.

                • vardump 41 minutes ago