• ismailmaj 2 days ago

    What would be the benefit of using ZML instead of relying on StableHLO/PJRT? Because the cost of porting models is for sure high.

    • gwenzek a day ago

      ZML the Zig library is mostly a wrapper around StableHLO/PJRT. But it's a high-quality wrapper, and the tagged tensor syntax is really helpful for writing complex ops like dot or gather.

      And ZML the framework also resolves issues with the complex dependency graph of StableHLO/PJRT.
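
      For readers unfamiliar with tagged tensors: a rough analogy (in numpy, not ZML syntax) is einsum, where letters act as ad-hoc axis names instead of bare positions, which is what makes ops like dot easier to get right:

      ```python
      import numpy as np

      # Positional shapes force the reader to remember which axis is which.
      q = np.ones((2, 5, 8))   # (batch, query_len, dim)
      k = np.ones((2, 7, 8))   # (batch, key_len, dim)

      # einsum letters behave like axis tags: contract over the shared "d"
      # axis, keep "b", "q", and "k" in the output.
      scores = np.einsum("bqd,bkd->bqk", q, k)
      print(scores.shape)  # prints (2, 5, 7)
      ```

      Tagged-tensor APIs take this idea further by attaching the names to the tensors themselves rather than to a one-off format string.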

    • hsjdhdvsk 3 days ago

      Hi ya! Want to say this looks awesome :) really interested in the sharded inference demo!!! You said it was experimental; is it in the examples folder at all?? (On phone atm, so apologies for not investigating further)

      • onurcel 3 days ago

        First of all, great job! I think the inference will become more and more important.

        That being said, I have a question regarding ease of use. How difficult is it for someone with a Python/C++ background to get used to Zig and (re)write a model to use with ZML?

        • gwenzek 3 days ago

          Hi, co-author here. Zig is way simpler than C++. Simple enough that in an afternoon I was able to onboard into the language, rewrite the core of a C++ algorithm, and see speed gains (fastBPE, for reference).

          Coming from Python, the hardest part is learning memory management. What helps with ZML is that the model code is mostly metaprogramming, so we can be a bit flexible there.

          We have a high-level API that should feel familiar to PyTorch users (like myself), but improves on it in a few ways.

          • steeve 3 days ago

            pretty easy, usually the hardest part is figuring out what the Python code is doing

          • Palmik 2 days ago

            Given that the focus is performance, do you have any benchmarks to compare against the likes of TensorRT-LLM?

            • gwenzek 2 days ago

              It's a bit early to compare directly to TensorRT-LLM because we don't have a full-blown equivalent.

              Note that our focus is being platform agnostic, easy to deploy/integrate, good performance all-around, and ease of tweaking. We are using the same compiler as JAX, so our performance is on par. But generally we believe we can gain on overall "tok/s/$" by having shorter startup time, choosing the most efficient hardware available, and easily implementing new tricks like multi-token prediction.

              • koe123 2 days ago

                I second this, it would help to justify the time investment into a framework if it's clear how it stacks up!

              • montyanderson 3 days ago

                my dreams have come true. hardware-agnostic ml primitives in a typed, compiled language.

                my only question is: is zig stable enough to base such a project on?

                • gwenzek 2 days ago

                  The core Zig language has been relatively stable for the past few years. What has changed the most is the `build.zig` build system (which we aren't using).

                  We are also looking ahead at the Zig roadmap, trying to anticipate upcoming breaking changes, and isolating our users from them.

                  • dartos 3 days ago

                    Stable as in unchanging, no.

                    Stable as in reliable enough, I’d say so.