It would be nice to have a more in-depth discussion of the issues that have been found with compile-time programming, rather than uncritical acclaim. Staged programming is not new, and people have run into many issues and design tradeoffs in that time. (E.g. the same stuff has been done in Lisps for decades, though most Lisps don't have a type system, which makes things a bit more complicated.)
Some of the issues that come to mind:
* Implementing generics in this way breaks parametricity. Simply put, parametricity means being able to reason about functions just from their type signature. You can't do this when the function can do arbitrary computation based on the concrete type a generic type is instantiated with.
* It's not clear to me how Zig handles recursive generic types. Generally, type systems are lazy to allow recursion. So I can write something like
type Example = Something[Example]
(Yes, this is useful.)
* Type checking and compile-time computation can interact in interesting ways. Does type checking take place before compile-time code runs, after it runs, or can they be interleaved? Different choices give different trade-offs. It's not clear to me what Zig does and hence what tradeoffs it makes.
* The article suggests that compile-time code can generate code (not just values) but doesn't discuss hygiene.
There is a good discussion of some issues here: https://typesanitizer.com/blog/zig-generics.html
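For the recursive-type bullet, C++ has a well-known instance of the shape `type Example = Something[Example]`: the curiously recurring template pattern, where a type names itself as the argument of its own base. The sketch below is only an illustration in C++17; `Ordered` and `Meters` are hypothetical names, and it says nothing about how Zig handles the same shape.

```cpp
#include <cassert>

// Ordered<Derived> derives the remaining comparisons from Derived's operator<.
template <typename Derived>
struct Ordered {
    friend bool operator>(const Derived& a, const Derived& b) { return b < a; }
    friend bool operator<=(const Derived& a, const Derived& b) { return !(b < a); }
    friend bool operator>=(const Derived& a, const Derived& b) { return !(a < b); }
};

// The recursive shape: Meters appears inside the definition of its own base type.
struct Meters : Ordered<Meters> {
    double value;
    explicit Meters(double v) : value(v) {}
    friend bool operator<(const Meters& a, const Meters& b) { return a.value < b.value; }
};

// Usage: only operator< was written by hand, the rest comes from Ordered<Meters>.
inline void example_ordered() {
    assert(Meters(2.0) > Meters(1.0));
    assert(Meters(1.0) <= Meters(1.0));
    assert(!(Meters(1.0) >= Meters(2.0)));
}
```

This works because the base template only needs the name `Meters` when it is instantiated, not the finished type, which is the laziness the bullet refers to.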
I'm a pretty big fan of Zig--I've been following it and writing it on-and-off for a couple of years. I think that comptime has a couple of use-cases where it is very cool. Generics, initializing complex data-structures at compile-time, and target-specific code-generation are the big three where comptime shines.
However, in other situations seeing "comptime" in Zig code makes me go "oh no" because, like Lisp macros, it's very easy to use comptime to avoid a problem that doesn't exist or wouldn't exist if you structured other parts of your code better. For example, the OP's example of iterating the fields of a struct to sum the values is unfortunately characteristic of how people use comptime in the wild--when they would often be better served by using a data-structure that is actually iterable (e.g. std.enums.EnumArray).
This feels like it's a constant problem with all more advanced language features. I've had the same reaction to uses of Lisp macros, C-style macros, Java compiler extensions, Ruby's method_missing, Python monkey patching, JavaScript prototype inheritance, monads, inheritance...
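To make the "use something iterable" alternative concrete, here is a rough C++17 analogue of an enum-indexed array. It is not Zig's std.enums.EnumArray, just the same idea sketched in another language; `Stat`, `StatArray`, `total`, and `example_stats` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

// Named slots instead of struct fields; Count is a sentinel giving the slot count.
enum class Stat : std::size_t { Strength, Agility, Wisdom, Count };

// A fixed-size array addressed by enum value, so the data stays iterable.
struct StatArray {
    std::array<int, static_cast<std::size_t>(Stat::Count)> values{};

    int& operator[](Stat s) {
        assert(s != Stat::Count);  // Count is not a real slot
        return values[static_cast<std::size_t>(s)];
    }
    int operator[](Stat s) const {
        assert(s != Stat::Count);
        return values[static_cast<std::size_t>(s)];
    }
};

// Summing needs no reflection over fields, only ordinary iteration.
inline int total(const StatArray& stats) {
    return std::accumulate(stats.values.begin(), stats.values.end(), 0);
}

// Usage: fill the slots by name, then treat them as a plain sequence.
inline StatArray example_stats() {
    StatArray stats;
    stats[Stat::Strength] = 3;
    stats[Stat::Agility] = 4;
    stats[Stat::Wisdom] = 5;
    return stats;
}
```

With this layout, `total(example_stats())` is an ordinary loop over three integers and no compile-time field iteration is involved.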
Maybe the real WTF is the friends we made along the way. <3 <3 <3
That’s because it’s a human problem not a technology one.
Can only be fixed by fixing humans.
this is why Go is so great....
Go is not immune. See: interface pollution.
And why it draws so much dislike: It doesn't offer many ways to paper over poor design decisions.
If I had a dollar for every Go code generator
Lately I read Graham's On Lisp and at first felt it was one of the greatest programming books I'd ever read, so close to perfect that little things, like him making me look "nconc" up in the CL manual (until then he'd introduced everything he talked about), made me want to go through and do just a little editing. And his explanation of how continuations work isn't very clear to me, which is a problem because I can't find a better one online (the only way I think I'll understand continuations is if I write the explanation I want to read).
Then I start thinking things like: "if he was using Clojure he wouldn't be having the problems with nconc that he talks about" and "I can work most of the examples in Python because the magic is mostly in functions, not in the macros" and "I'm disappointed that he doesn't do anything that really transforms the tree"
(It's still a great book that's worth reading but anything about Lisp has to be seen in the context that the world has moved on... Almost every example in https://www.amazon.com/Paradigms-Artificial-Intelligence-Pro... can be easily coded up in Python because it was the garbage collection, hashtables at your fingertips, and first class functions that changed the world, not the parens)
Lately I've been thinking about the gradient from the various tricks such as internal DSLs and simple forms of metaprogramming which are weak beer compared to what you can do if you know how compilers work.
> if he was using Clojure he wouldn't be having the problems with nconc that he talks about
Yeah, one would write the implementation in Java.
Common Lisp (and Lisp in general) often aspires to be written in itself, efficiently. Thus it has all the operations, which a hosted language may get from the imperative/mutable/object-oriented language underneath. That's why CL implementations may have type declarations, type inference, various optimizations, stack allocation, TCO and other features - directly in the language implementation. See for example the SBCL manual. https://sbcl.org/manual/index.html
For example the SBCL implementation is largely written in itself, whereas Clojure runs on top of a virtual machine written in a few zillion lines of C/C++ and Java. Even the core compiler is written in 10KLOC of Java code. https://github.com/clojure/clojure/blob/master/src/jvm/cloju...
In contrast, the SBCL compiler is largely written in Common Lisp, incl. the machine code backends for various platforms. https://github.com/sbcl/sbcl/tree/master/src/compiler
The original Clojure developer made the conscious decision to inherit the JIT compiler from the JVM, write the Clojure compiler in Java and reuse the JVM in general -> this reuses a lot of technology maintained by others and makes integration into the Java ecosystem easier.
The language implementations differ: Lots of CL + C and Assembler compared to a much smaller amount of Clojure with lots of Java and C/C++.
CL has a lot of low-level, mutable and imperative features for a reason. It was designed that way so that people could write efficient software largely in Lisp itself.
Interesting points.
> Implementing generics in this way breaks parametricity. Simply put, parametricity means being able to reason about functions just from their type signature. You can't do this when the function can do arbitrary computation based on the concrete type a generic type is instantiated with.
Do you mean reasoning about a function in the sense of just understanding what a function does (or can do), i.e. in the view of the practical programmer, or reasoning about the function in a typed theoretical system (e.g. typed lambda calculus or maybe even more exotic)? Or maybe a bit of both? There is certainly a concern from the theoretical viewpoint but how important is that for a practical programming language?
For example, I believe C++ template programming also breaks "parametricity" by supporting template specialisation. While there are many mundane issues with C++ templates, breaking parametricity is not a very big deal in practice. In contrast, it enables optimisations that are not otherwise possible (for templates). Consider for example std::vector<bool>: implementations can be made that actually store a single bit per vector element (instead of how a bool is normally represented, using an int or char). Maybe this is even required by the standard, I don't recall. My point is that it makes sense for C++ to allow this, I think.
In terms of implementation, you can view parametricity as meaning that within the body of a function with a generic type, the only operations that can be applied to values of that type are also arguments to that function.
This means you cannot write
fn sort<A>(elts: Vec<A>): Vec<A>
because you cannot compare values of type A within the implementation of sort with this definition. You can write
fn sort<A>(elts: Vec<A>, lessThan: (A, A) -> Bool): Vec<A>
because a comparison function is now a parameter to sort.
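A minimal C++17 sketch of the second signature, for readers who want to see it run. Note the assumption: C++ does not enforce parametricity, so the body below merely follows the discipline of touching elements only through the `less_than` argument (plus moves and swaps). `sort_by` is a hypothetical name and the insertion sort is chosen only for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Insertion sort that inspects elements only through the less_than parameter.
template <typename A, typename Less>
std::vector<A> sort_by(std::vector<A> elts, Less less_than) {
    for (std::size_t i = 1; i < elts.size(); ++i) {
        // Walk the new element left until its left neighbour is not greater.
        for (std::size_t j = i; j > 0 && less_than(elts[j], elts[j - 1]); --j) {
            std::swap(elts[j], elts[j - 1]);
        }
    }
    return elts;
}

// Usage: the caller decides what "less than" means for the element type.
inline void example_sort_by() {
    const std::vector<int> ascending =
        sort_by(std::vector<int>{3, 1, 2}, [](int a, int b) { return a < b; });
    assert((ascending == std::vector<int>{1, 2, 3}));
}
```

Everything the function needs from `A` is visible in the parameter list, which is the modularity property described here.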
This helps both the programmer and the compiler. The practical upshot is that functions are modular: they specify everything they require. It follows from this that if you can compile a call to a function there is a subset of errors that cannot occur.
In a language without parametricity, functions can work with only a subset of possible calls. If we take the first definition of sort, it means a call to sort could fail at compile-time, or worse, at run-time, because the body of the function doesn't have a case that knows how to compare elements of that particular type. This leads to a language that is full of special cases and arbitrary decisions.
Javascript / Typescript is an example of a language without parametricity. sort in Javascript has what are, to me, insane semantics: converting values to strings and comparing them lexicographically. (See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...) This in turn can lead to amazing bugs, which are only prevented by the programmer remembering to do the right thing. Remembering to do the right thing is fine in the small but it doesn't scale.
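The surprise is easy to reproduce outside JavaScript. The C++17 sketch below only emulates the behaviour described above, for non-negative integers, by comparing decimal string forms; it is not the JavaScript algorithm itself, and `sort_as_strings` and `sort_as_numbers` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Emulates the default ordering described above: compare by decimal string form.
inline std::vector<int> sort_as_strings(std::vector<int> values) {
    std::sort(values.begin(), values.end(), [](int a, int b) {
        return std::to_string(a) < std::to_string(b);
    });
    return values;
}

// The numeric ordering most callers expect.
inline std::vector<int> sort_as_numbers(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}

// Usage: same input, two different results.
inline void example_string_sort() {
    assert((sort_as_strings({10, 2, 21, 1}) == std::vector<int>{1, 10, 2, 21}));
    assert((sort_as_numbers({10, 2, 21, 1}) == std::vector<int>{1, 2, 10, 21}));
}
```

The string ordering puts 10 before 2 because "10" sorts before "2" character by character.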
Breaking parametricity definitely has uses. The question becomes one about the tradeoffs one makes. That's why I'd rather have a discussion about those tradeoffs than just "comptime good" or "parametricity good". Better yet are neat ideas that capture the good parts of both. (E.g. type classes / implicit parameters reduce the notational overhead of calling functions with constrained generic types, but these bring their own tradeoffs around modularity and coherence.)
Do you have a blog or other site where you post your writing? Your explanations are quite good and easy to follow for someone like me, an interested/curious onlooker.
Thanks! I appreciate that. A few things:
- https://scalawithcats.com/ is a book I'm writing. There is an associated newsletter to which I post blog articles and the like.
- https://noelwelsh.com/ is my personal site, which hosts my blog.
As a former user of Scala and at times cats, I look forward to your work.
Functions can crash anyway. I don't see how what you describe is different from a function on integers that errors on inputs that are too big. The programmer has to actively choose to make a function break parametricity, and they can equally choose not to do that.
In an ideal world, a function that only works for small integers would bake that into its type system. Ie, rather than accepting “any integer”, the function would accept a u8, or a value compile-time guaranteed in the range of 0..10 or something.
Your point still stands though. Modern programming languages don’t constrain you much at all with their type systems.
I spent a little time in Haskell a few years ago and the kind of reasoning you can do about functions is wild. Eg, if a function has the type signature of A -> A, we know the function has to be the identity function because that’s the only function that matches the type signature. (Well that or the “bottom function”, which loops forever). Lots of functions are like that - where the type definitions alone are enough to reason about a lot of code.
> In an ideal world, a function that only works for small integers would bake that into its type system. Ie, rather than accepting “any integer”, the function would accept a u8, or a value compile-time guaranteed in the range of 0..10 or something.
Ada takes this approach.
Haskell has quite a few partial functions in the standard library if I recall. Bog standard (!!), head, and tail for lists; fromJust to unwrap Maybes; and even some more interesting examples like "all" which mightn't work on infinite lists. Indeed any Turing-complete language includes partial functions of that final variety.
This is _also_ doable with the ability to constrain generics.
sort<A> where A implements Comparable
Simpler explanation IMO.
Not really an "also", "implements" is just syntax sugar for what the GP is saying.
Sure, but it's the same thing in 10x fewer words. Having parametric generic types that accept constraints allows your functions to have the best of all worlds.
So a language _with_ that is superior. All zig needs to do is add some way to allow for constraints.
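As a rough illustration of a stated constraint, here is a C++17 sketch. The assumptions: `is_less_comparable`, `sorted`, and `Opaque` are hypothetical names, and the requirement is checked with a detection trait plus `static_assert`, which reports a clear error but, unlike `where A implements Comparable`, does not appear in the signature itself.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: assume A has no usable operator<.
template <typename A, typename = void>
struct is_less_comparable : std::false_type {};

// Chosen only when `a < b` compiles and converts to bool.
template <typename A>
struct is_less_comparable<
    A, std::enable_if_t<std::is_convertible_v<
           decltype(std::declval<const A&>() < std::declval<const A&>()), bool>>>
    : std::true_type {};

// The requirement is checked up front, with a message that names it.
template <typename A>
std::vector<A> sorted(std::vector<A> elts) {
    static_assert(is_less_comparable<A>::value,
                  "sorted<A> requires A to support operator<");
    std::sort(elts.begin(), elts.end());
    return elts;
}

struct Opaque {};  // hypothetical type with no operator<

static_assert(is_less_comparable<int>::value);
static_assert(!is_less_comparable<Opaque>::value);

// Usage: sorted(std::vector<Opaque>{}) would stop at the static_assert.
inline void example_sorted() {
    assert((sorted(std::vector<int>{3, 1, 2}) == std::vector<int>{1, 2, 3}));
}
```

The check is still written by hand in the body, so this is closer to "the function polices its own inputs" than to a constraint the compiler enforces on the declaration.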
Fair point about parametricity. A language could, during macro expansion, do the equivalent of a Scala implicit lookup for a sorting function for the type and return an error at macro expansion time if it can't find one. That avoids the "doing the right thing requires discipline" problem, but I agree it is still less clear from the type signature alone what the type requirements are.
> For example, I believe C++ template programming also breaks "parametricity" by supporting template specialisation.
C++ breaks parametricity even with normal templates, since you can e.g. call a method that exists/is valid only on some instantiations of the template.
The issue is that the compiler can't help you check whether your template type checks or not, you will only figure out when you instantiate it with a concrete type. Things get worse when you call a templated function from within another templated function, since the error can then be arbitrarily many levels deep.
> My point is that it makes sense for C++ to allow this, I think.
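A small C++17 sketch of that failure mode; `element_count` and `is_empty` are hypothetical names. Both templates compile on their own, and a bad type argument is only noticed when someone instantiates the outer one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Nothing in the signature says T must have size(); the body is only
// checked once a concrete T is supplied.
template <typename T>
std::size_t element_count(const T& container) {
    return container.size();
}

// A second layer: any error from element_count surfaces through this call too.
template <typename T>
bool is_empty(const T& container) {
    return element_count(container) == 0;
}

// Usage: fine for types that happen to have size().
inline void example_element_count() {
    assert(element_count(std::string("abc")) == 3);
    assert(is_empty(std::vector<int>{}));
    // is_empty(42);  // would fail to compile, with the error reported inside element_count
}
```

The commented-out call is the interesting one: the mistake is at the call site, but the diagnostic points into a function the caller never mentioned.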
Whether it makes sense or not it's a big pain point and some are trying to move away from it (see e.g. Carbon's approach to generics)
> C++ breaks parametricity even with normal templates
I might be wrong here, but as I understand it "parametricity" means loosely that all instantiations use the same function body. To quote wikipedia:
"parametricity is an abstract uniformity property enjoyed by parametrically polymorphic functions, which captures the intuition that all instances of a polymorphic function act the same way"
In this view, C++ does not break parametricity with "normal" (i.e. non-specialised) templates. Of course, C++ does not type check a template body against its parameters (unless concepts/traits are used), leading to the problems you describe, but it's a different thing as far as I understand.
To be parametric it needs to be the same body semantically, not just textually. Particularly in C++ with its heavy operator overloading and limited safety, you can very easily write a template whose body will do the right thing for some types and be undefined behaviour for others (e.g. if your template has comparisons in it and then you instantiate it with a pointer or something).
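A C++17 sketch of "same text, different meaning", using pointers as suggested. `larger` is a hypothetical name. To keep the example well-defined, both strings are placed in one array, so comparing their addresses is legal; with unrelated pointers the result of `<` would not be something to rely on.

```cpp
#include <cassert>
#include <string>

// Textually one body; what `<` means depends entirely on T.
template <typename T>
T larger(T a, T b) {
    return a < b ? b : a;
}

inline void example_larger() {
    // Both words live in one array so that comparing their addresses is well-defined.
    static const char words[] = "zebra\0apple";
    const char* zebra = words;      // lower address
    const char* apple = words + 6;  // higher address

    // With std::string, `<` compares contents: "zebra" is the larger word.
    assert(larger(std::string(zebra), std::string(apple)) == "zebra");
    // With const char*, `<` compares addresses: the later pointer wins.
    assert(std::string(larger(zebra, apple)) == "apple");
}
```

Neither instantiation is rejected by the compiler, which is why this kind of non-uniformity tends to show up as a wrong answer rather than an error.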
Hm, wouldn't any use of if constexpr break that definition?
e.g.
template<typename T>
int f() {
    if constexpr (std::is_same_v<T, int>) { return 0; }
    else ...
}
Parametricity is about behavior, not code. A function parametric in a variable should behave identically for all values of the variable. If one instance of a C++ template fails to compile and another instance of the same template does compile it is a stretch to say they behave identically, even though they use the same code.
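A complete C++17 sketch of the `if constexpr` case, to show behaviour diverging while every instance still compiles. `pass_through` is a hypothetical name; the signature reads like an identity function, but the body singles out one type.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// The signature T -> T suggests "returns its argument"; the body special-cases int.
template <typename T>
T pass_through(T value) {
    if constexpr (std::is_same_v<T, int>) {
        return value + 1;  // only taken when T is exactly int
    } else {
        return value;
    }
}

// Usage: identical call syntax, different behaviour depending on T.
inline void example_pass_through() {
    assert(pass_through(1) == 2);     // T = int: not the identity
    assert(pass_through(1L) == 1L);   // T = long: identity
    assert(pass_through(std::string("a")) == "a");
}
```

A caller reading only the declaration has no way to tell that `int` is treated differently from `long`.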
one thing you can reason about a function is: does it exist at all? if you don't have parametricity, you can't even be sure about that. in Rust, as long as your type satisfies a generic function's bounds, you can be sure instantiating that function with this type will compile; in C++ you don't have that luxury.
> Consider for example std::vector<bool>: implementations can be made that actually store a single bit per vector element (instead of how a bool normally is represented using an int or char).
Your example is considered a misfeature and demonstrates why breaking parametricity is a problem: the specialized vector<bool> is not a standard STL container even though vector<anythingelse> is. That's at best confusing -- and can lead to very confusing problems in generic code. (In this specific case, C++11's "auto" and AAA lessens some of the issues, but even then it can cause hard-to-diagnose performance problems even when the code compiles)
See https://stackoverflow.com/a/17797560 for more details.
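One concrete way generic code trips over the specialization, sketched in C++17. `snapshot_sees_later_write` is a hypothetical name. The function takes what looks like a copy of the first element, writes to the container, and then checks whether the "copy" changed.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// For ordinary element types, operator[] hands back a real reference...
static_assert(std::is_same_v<std::vector<int>::reference, int&>);
// ...but the bool specialization hands back a proxy object instead.
static_assert(!std::is_same_v<std::vector<bool>::reference, bool&>);

// Returns true if a value taken with `auto` still tracks the container.
template <typename T>
bool snapshot_sees_later_write(std::vector<T> v, T replacement) {
    assert(!v.empty());    // the sketch assumes at least one element
    auto snapshot = v[0];  // looks like a copy of the first element
    v[0] = replacement;    // write to the container afterwards
    return static_cast<T>(snapshot) == replacement;
}

// Usage: the same generic code gives different answers for int and bool.
inline void example_vector_bool() {
    assert(!snapshot_sees_later_write(std::vector<int>{1}, 2));
    assert(snapshot_sees_later_write(std::vector<bool>{false}, true));
}
```

For `int`, `auto` deduces a plain value; for `bool`, it deduces the proxy, which keeps pointing at the stored bit.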
The C++ vector<bool> specialization is bad because breaking many important implicit contracts about taking the address of vector<> elements makes it practically unusable if a normal vector<> is expected, but it isn't specialized incorrectly in a formally meaningful sense: all errors are outside the class (unsatisfied expectations from client code) and implicit (particularly for C++ as it was at the time).
You're not wrong, but it is at the very least weird that a specialization doesn't conform to the concept that the general template does. Something which proper parametricity would have avoided -- if it were available.
(The Hysterical Raisins are understandable, esp. given that it wasn't even possible to specify Concepts in C++ until 2020...)
The point is exactly that the "concept" of what the template should do is informal and a careful, detailed specification in a far more expressive language than vintage C++ would be needed to elicit good compilation errors from something like vector<bool>.
Proper parametricity is only a starting point: types that specify alignment, equality and lifetimes would be needed to make it useful.
Vector bool may not have to store a range of space optimized bool values but the interface is still different enough and guarantees different enough that it is largely thought of as a mistake. For one, the const reference type is bool and not bool const &. Plus other members like flip… mostly the issue is in generic code expecting a normal vector.
Hi, article author here. I was motivated to write this post after having trouble articulating some of its points while at a meetup, so that's why the goal of this post was focused on explaining things, and not being critical.
So, to at least address your points here:
* I do agree this is a direct trade-off with Zig style comptime, versus more statically defined function signatures. I don't think this affects all code, only code which does such reasoning with types, so it's a trade-off between reasoning and expressivity that you can make depending on your needs. On the other hand, per the post's view 0, I have found that just going in and reading the source code easily answers the questions I have when the type signature doesn't. I don't think I've ever been confused about how to use something for more than the time it takes to read a few dozen lines of code.
* Your specific example for recursive generic types poses a problem because a name being used in the declaration causes a "dependency loop detected" error. There are ways around this. The generics example in the post for example references itself. If you had a concrete example showing a case where this does something, I could perhaps show you the zig code that does it.
* Type checking happens during comptime. Eg, this code:
    pub fn main() void {
        @compileLog("Hi");
        const a: u32 = "42";
        _ = a;
        @compileLog("Bye");
    }
Gives this error:
    when_typecheck.zig:3:17: error: expected type 'u32', found '*const [2:0]u8'
        const a: u32 = "42";
                       ^~~~
    Compile Log Output:
    @as(*const [2:0]u8, "Hi")
So the first @compileLog statement was run by comptime, but then the type check error stopped it from continuing to the second @compileLog statement. If you dig into the Zig issues, there are some subtle ways the type checking between comptime and runtime can cause problems. However it takes some pretty esoteric code to hit them, and they're easily resolved. Also, they're well known by the core team and I expect them to be addressed before 1.0.
* I'm not sure what you mean by hygiene, can you elaborate?
“Hygiene” in the context of macro systems refers to the user’s code and the macro’s inserted code being unable to capture each other’s variables (either at all or without explicit action on part of the macro author). If, say, you’re writing a macro and your generated code declares a variable called ‘x’ for its own purposes, you most probably don’t want that variable to interfere with a chunk of user’s code you received that uses an ‘x’ from an enclosing scope, even if naïvely the user’s ‘x’ is shadowed by the macro’s ‘x’ at the insertion point of the chunk.
It’s possible but tedious and error-prone to avoid this problem by hand by generating unique identifier names for all macro-defined runtime variables (this usually goes by the Lisp name GENSYM). But what you actually want, arguably, is an extended notion of lexical scope where it also applies to the macro’s text and macro user’s program as written instead of the macroexpanded output, so the macro’s and user’s variables can’t interfere with each other simply because they appear in completely different places of the program—again, as written, not as macroexpanded. That’s possible to implement, and many Scheme implementations do it for example, but it’s tricky. And it becomes less clear-cut what this even means when the macro is allowed to peer into the user’s code and change pieces inside.
(Sorry for the lack of examples; I don’t know enough to write one in Zig, and I’m not sure giving one in Scheme would be helpful.)
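Since an example was asked for, here is a hedged one using the C preprocessor from C++17. The preprocessor is a much cruder tool than Lisp or Scheme macros, but it shows the same capture problem; `DOUBLE_OF`, `double_of`, `Results`, and `compare` are hypothetical names.

```cpp
#include <cassert>

// Unhygienic: the macro's own `x` is visible to the caller's expression.
#define DOUBLE_OF(expr) ([&] { int x = 2; return x * (expr); }())

// Lexically scoped alternative: the caller's expression arrives as a closure,
// so its names keep resolving in the caller's scope.
template <typename F>
int double_of(F expr) {
    int x = 2;
    return x * expr();
}

struct Results {
    int intended;
    int via_macro;
    int via_function;
};

// The caller also has a variable named x and wants 2 * (x + 1).
inline Results compare(int x) {
    return Results{2 * (x + 1), DOUBLE_OF(x + 1), double_of([&] { return x + 1; })};
}

// Usage: with x = 10 the macro silently computes 2 * (2 + 1) instead.
inline void example_capture() {
    const Results r = compare(10);
    assert(r.intended == 22);
    assert(r.via_macro == 6);
    assert(r.via_function == 22);
}
```

Renaming the macro's variable to something unlikely is the manual GENSYM approach; the closure version gets the same protection from ordinary lexical scope.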
zig comptime is not a macro system and you can't really generate code in a way that makes hygiene a thing to worry about (there is no ast manipulation, you can't "create variables"). the only sort of codegen you can do is via explicit conditionals (switch, if) or loops conditioned on compile time accessible values.
that's still powerful, you could probably build a compile time ABNF parser, for example.
Surely there's a way to generate code by manipulating an AST structure? Is there some reason this can't be done in Zig or is it just that no one has bothered?
Doing it this way is more verbose but sidesteps all hygiene issues.
Zig lets you inspect type info (including, eg, function signatures), but it doesn't give you raw access to the AST. There's no way to access the ast of the body of the function. As highlighted by view 0 in my article, I consider this a good thing. Zig code can be read without consideration for which pieces are comptime or not, something that heavy AST manipulation loses.
Though, if you really wanted to do stupid things, you could use @embedFile to load a Zig source file, then use the Zig compiler's tokenizer/ast parser (which are in the standard library) to parse that file into an AST. Don't do that, but you could.
This reminds me of comptime brainfuck interpreter https://github.com/edqx/ccb
Generating code during comptime is explicitly forbidden by the author. You can still generate code during build.zig of course.
Zig disallows ALL shadowing (basically variable name collisions where in the absence of the second variable declaration the first declaration would be reachable by the same identifier name).
Generating a text file via a writer with the intent to compile it as source code is no worse in Zig than it is in any other language out there. If that's what you want to do with your life, go ahead.
> being able to reason about functions just from their type signature.
This has nothing to do with compile-time execution, though. You can reason about a function from its declaration if it has a clear logical purpose, is well named, and has well named parameters. You can consider any part of a parameter the programmer can specify as part of the name, including label, type name, etc.
> There is a good discussion of some issues here: https://typesanitizer.com/blog/zig-generics.html
That's actually not a great article. While I agree with the conclusion stated in the title, it's a kind of "debate team" approach to argumentation which tries to win points rather than make meaningful arguments.
The better way to frame the debate is flexibility vs complexity. A fixed function generics system in a language is simpler (if well designed) than a programmable one, but less flexible. The more flexibility you give a generics system, the more complex it becomes, and the closer it becomes to a programming language in its own right. The nice thing about zig's approach is that the meta-programming language is practically the same thing as the regular programming language (which, itself, is a simple language). That minimizes the incremental complexity cost.
It does introduce an extra complexity though: it's harder for the programmer to keep straight what code is executing at compile time vs runtime because the code is interleaved and the context clues are minimal. I wonder if a "comptime shader" could be added to the language server/editor plugin that puts a different background color on comptime code.
>You can _reason_ about a function from its declaration if it has a clear logical purpose, is well named, and has well named parameters.
I think "reason" in gp's context is "compile-time reasoning" as in the compiler's deterministic algorithm to parse the code and assign properties etc. This has downstream effects with generating compiler errors, etc.
It's not about the human programmer's ability to reason so any "improved naming" of function names or parameters still won't help the compiler out because it's still just an arbitrary "symbol" in the eyes of the parser.
Downstream effects with generating compiler errors is still about the human programmer's ability to reason about the code, and error messages can only reference the identifier names provided.
The compiler doesn't do anything you, the programmer, don't tell it to do. You tell it what to do by writing code using a certain syntax, connecting identifiers, keywords, and symbols. That's it. If the meaning isn't in the identifiers you provide and how you connect them together with keywords and symbols, it isn't in there at all. The compiler doesn't care what identifier names you use, but that's true whether the identifier is for a parameter label, type name, function name or any other kind of name. The programmer gives those meaning to human readers by choosing meaningful names.
Anyway, zig's compile errors seem OK to me so far.
Actually, the zig comptime programmer can do better than a non-programmable compiler when it comes to error messages. You can detect arbitrary logical errors and provide your own compiler error messages.
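The idea of programmer-written compile errors can be sketched in C++17 with `static_assert`, which is a far narrower tool than Zig's comptime but makes the same point. `RingBuffer` is a hypothetical class; the rule it checks (capacity must be a power of two) is a logical condition the type system alone would not express.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Index wrapping uses a bit mask, which is only correct when the capacity
// is a power of two. The check and its message are ordinary code.
template <typename T, std::size_t Capacity>
class RingBuffer {
    static_assert(Capacity > 0 && (Capacity & (Capacity - 1)) == 0,
                  "RingBuffer: Capacity must be a power of two");

public:
    // Overwrites the oldest slot once the buffer has wrapped around.
    void push(const T& value) {
        slots_[head_ & (Capacity - 1)] = value;
        ++head_;
    }

    // Most recently pushed value; assumes at least one push has happened.
    const T& latest() const {
        assert(head_ > 0);
        return slots_[(head_ - 1) & (Capacity - 1)];
    }

private:
    std::array<T, Capacity> slots_{};
    std::size_t head_ = 0;
};

// RingBuffer<int, 3> bad;  // would fail to compile with the message above
```

Declaring `RingBuffer<int, 4>` compiles and behaves as a normal buffer, while the commented-out line would be rejected with the author's own wording rather than a generic type error.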
I elaborated on parametricity in this comment: https://news.ycombinator.com/item?id=42621239
There are many ways one can reason about functions, and I think all of us use multiple methods. Parametricity provides one way to do so. One nice feature is that it's supported by the programming language, unlike, say, names.
I saw that. But I don't think it has bearing on zig comptime.
zig generates a compile error when you try to pass a non-conforming type to a generic function that places conditions/restrictions on that type (such as by calling a certain predicate on instances of that type).
It's probably important to note that parametricity is a property of specific solution spaces, and not really in the ultimate problem domain (writing correct and reliable software for specific contexts), so isn't necessarily meaningful here.
> zig generates a compile error when you try to pass a non-conforming type to a generic function that places conditions/restrictions on that type (such as by calling a certain predicate on instances of that type).
Sure, but only after it's fully expanded, which is much harder to debug. And if a generic function doesn't fail to compile but rather silently behaves differently (e.g. if it calls a function that behaves unexpectedly, but still exists, on the type in question) then you don't get an error at all.
> parametricity is a property of specific solution spaces, and not really in the ultimate problem domain (writing correct and reliable software for specific contexts)
Nonsense. Without parametricity your software is not compositional and it becomes impossible to write correct software to solve complex problems.
> Sure, but only after it's fully expanded,
Code goes into the compiler. Either compiled code or errors come out. There's no partial expansion step to cause confusion.
You're probably referring to something about the flexibility zig's comptime allows, but it's important to note a zig programmer can be as picky as they want about what types a generic function will accept. People are really just talking about what the syntax for expressing type restrictions is.
> Without parametricity your software is not compositional and it becomes impossible to write correct software to solve complex problems.
You can hold that opinion, but it's not a fact. The overall question isn't binary. It's one of balancing complexity and flexibility. A fixed system for specifying type restrictions is simpler and provides fewer opportunities for mistakes (assuming it's well designed), and may have parametricity. However, the lack of flexibility can just push the complexity elsewhere, e.g., leading to convoluted usage patterns, which could lead to more mistakes. A programmable system for specifying type restrictions offers more flexibility at the cost of more up-front complexity, but in a well-designed system the flexibility could lead to less overall complexity, and fewer mistakes. A nice thing about zig's approach is that the generics metaprogramming language is pretty much the same as the regular language, which mitigates the increase in complexity. I actually think it should be possible to create some kind of generics system that could credibly be said to be programmable and have parametricity, though I don't think there's any point to doing so.
> Code goes into the compiler. Either compiled code or errors come out. There's no partial expansion step to cause confusion.
You could make the same argument against having a separate compilation step at all - code goes into the language, it gets executed, any other step would just add confusion. But most of us tend to find that having a compilation step that catches errors earlier is helpful and makes it easier to produce correct code. Similarly, being able to build and check generic code as-is (in the simplest case, because generic code really is just parametric code and isn't getting monomorphised) is a lot nicer than only being able to build and check individual expansions of it.
> the lack of flexibility can just push the complexity elsewhere, e.g., leading to convoluted usage patterns, which could lead to more mistakes. A programmable system for specifying type restrictions offers more flexibility at the cost of more up-front complexity, but in a well-designed system the flexibility could lead to less overall complexity, and fewer mistakes
Some way of doing ad-hoc polymorphism is probably desirable, but only if it's set up so that people don't default to it. Generic things should be parametric most of the time, and non-parametricity should be visible, so that people don't do it accidentally. It's similar to e.g. nullability - yes, you probably do want to have some way to represent absent/missing/error states, but if you just make every value nullable and say that any function that wants non-nullable inputs can check itself, that "flexibility" ends up being mostly ways to shoot yourself in the foot.
> Generic things should be parametric most of the time, and non-parametricity should be visible, so that people don't do it accidentally.
Economy of mechanism is powerful though, it's one of the reasons C is still so popular. The comptime approach that provides both parametric and ad-hoc polymorphism using a single mechanism seems to fit Zig quite well. I'm still a bit of a typaholic, but I've really come to appreciate economy of mechanism instead of deeply inscrutable types.
I think a good language would take something like Zig's approach to comptime, where the template/macro language is the same as the value language, with a deep consideration of TURNSTILE from "Type Systems as Macros":
https://www.khoury.northeastern.edu/home/stchang/pubs/ckg-po...
You can even get to dependent type systems as macros:
https://www.williamjbowman.com/resources/wjb2019-depmacros.p...
I think a good design would involve stage polymorphism, but it does need to be the kind of polymorphism that preserves the distinction rather than just smooshing stages together or having ad-hoc special cases. (I've gone through this in the past, with Java 1.2 style types vs untyped vs typed with polymorphism, or more recently with the "function colouring" debate; Rust-style linearity is heading in the same direction too).
As far as I can see from a quick skim your links are one meta level up, about using macros to implement a type system for a DSL (which may have polymorphism and the like within that DSL) rather than using them to implement typing itself. There's still a distinction between types and macros, it's just that macros are being used to process types.
> Nonsense. Without parametricity your software is not compositional and it becomes impossible to write correct software to solve complex problems.
In theory there is no difference between theory and practice. In practice there is.
> type Example = Something[Example]
You can't use the binding early like this, but inside of the type definition you can use the @This() builtin to get a value that's the type you're in, and you can presumably do whatever you like with it.
The type system barely does anything, so it's not very interesting when type checking runs. comptime code is type checked and executed. Normal code is typechecked and not executed.
comptime is not a macro system. It doesn't have the ability to be unhygienic. It can cleverly monomorphize code, or it can unroll code, or it can omit code, but I don't think it can generate code.
Until version 0.12.0 (April 2024), you could make arbitrary syscalls, allowing you to generate code at comptime, and promote vars between comptime and runtime. [0] Before then, you could do some rather funky things with pointers and memory, and it was very much not hygienic.
[0] https://ziglang.org/download/0.12.0/release-notes.html#Compt...
And I would add:
* Documentation. In a sufficiently-powerful comptime system, you can write a function that takes in a path to a .proto file and returns the types defined in that file. How should this function be documented? What happens when you click a reference to such a generated type in the documentation viewer?
* IDE autocompletions, go to definition, type hinting etc. A similar problem, especially when you're working on some half-written code and actual compilation isn't possible yet.
Also: security. Does this feature imply that merely building someone else’s program executes their code on your machine?
Considering that pretty much every non-toy project isn't built by directly calling the compiler but through build tools like make, cmake, autotools, etc., or even scripts like `build.sh` that can call arbitrary commands, and that even IDEs have had functionality to let you call arbitrary commands before and after builds (since the 90s at least), I do not see this as a realistic concern worth limiting a language's functionality for.
There is a difference between actually building a project and merely opening a file in an IDE though. See for example the recent emacs completion vulnerability [1], which AFAIK is still open.
Syscalls aren't available to comptime code
I think the reason people are so in love with zig comptime is because even rust lovers realize that rust macros are a pile of poo poo
Scheme has a "hygienic" macro system that allows you to do arbitrary computation and code alteration at compile time.
The language doesn't see wide adoption in industry, so maybe its most important lessons have yet to be learned, but one problem with meta-programming is that it turns part of your program into a compiler.
This happens to an extent in every language. When you're writing a library, you're solving the problem "I want users to be able to write THIS and have it be the same as if they had written THAT." A compiler. Meta-programming facilities just expand how different THIS and THAT can be.
Understanding compilers is hard. So, that's at least one potential issue with compile-time programming.
By your definition practically any code is a compiler unless you literally typed out every individual thing the machine should do, one by one.
"Understanding compilers is hard."
I think this is just unnecessarily pessimistic or embracing incompetence as the norm. It's really not hard to understand the concept of an "inline" loop. And so what if I do write a compiler so that when I do `print("%d", x)` it just gives me a piece of code that converts `x` to a "digit" number and doesn't include float handling? That's not hard to understand.
C++ has so many complexities like SFINAE that I wouldn't say you are "able to reason about functions just from their type signature".
C++ templates are not parametric by design.
Parametricity is a neat trick for mathematicians, it's not really worth it in a low-level language (not least because the type system is badly unsound anyway).
I think most of those points one only stumbles over after a few thousand lines of Zig and going really deep into the comptime features.
And some features in your list are of questionable value IMHO (e.g. the "reasoning over a function type signature" - Rust could be a much more ergonomic language if the compiler wouldn't have to rely on function signatures alone but instead could peek into called function bodies).
There are definitely some tradeoffs in Zig's comptime system, but I think the more important point is that nothing about it is surprising when working with it, it's only when coming from languages like Rust or C++ where Zig's comptime, generics and reflection might look 'weird'.
> Rust could be a much more ergonomic language if the compiler wouldn't have to rely on function signatures alone but instead could peek into called function bodies
This path leads to unbounded runtime for the type checker/borrow checker. We’re not happy about build times as is.
It would also lead to subtle errors for users of crates. Semver is hard enough to do when the compiler only deals with signatures, but if function bodies now contain invariants that callers can rely on, it becomes intractable.
> most of those points one only stumbles over after a few thousand lines of Zig and going really deep into the comptime features.
> nothing about it is surprising when working with it
I think there's a contradiction here - when you get deep into using this kind of feature in a complex way is precisely when you most need it to behave consistently, and tends to be where this kind of ad-hoc approach breaks down.
If you've never gone deep into Zig comptime... really nothing is surprising, except possibly when you can't do things, and usually after a bit of thinking about it you understand why you can't.
You often want parametricity, until you want type-specific behaviour, which is very, very common. Haskell uses this to great effect with type classes.
> parametricity
That feels like the wrong word for the thing you're describing. Linguistic arguments aside, yes, you're absolutely right.
In Zig though, that issue is completely orthogonal to generics. The first implementation `foo` is the "only" option available for "truly arbitrary" `T` if you don't magic up some extra information from somewhere. The second implementation `bar` uses an extra language feature unrelated to generics to return a different valid value (it's valid so long as the result of `bar(T, x)` is never accessed). The third option `baz` works on any type with non-zero width and just clobbers some data for fun (you could golf it some more, but I think the 5-line implementation makes it easier to read for non-Zig programmers).
Notice that we haven't performed a computation with `T` and were still able to do things that particular definition of parametricity would not approve of.
fn foo(comptime T: type, x: T) T {
    return x;
}
fn bar(comptime T: type, x: T) T {
    _ = x;
    return undefined;
}
fn baz(comptime T: type, x: T) T {
    var result: T = x;
    const result_ptr: *T = &result;
    const dangerous_shenanigans_ptr: *u8 = @ptrCast(result_ptr);
    dangerous_shenanigans_ptr.* = 42;
    return result;
}
Zig does give up that particular property (being able to rely on just a type signature to understand what's going on). Its model is closer to "compile-time duck-typing." The constraints on `T` aren't an explicitly enumerated list of constraints; they're an in vivo set of properties the code using `T` actually requires. That fact is extremely annoying from time to time (e.g., for one or two major releases the reference Reader/Writer didn't include the full set of methods, but all functions using readers and writers just took in an `anytype`, so implementers either had to read a lot of source or play a game of whack-a-mole with the compiler errors to find the true interface), but for most code it's really not hard to handle.
E.g., if you've seen the `Iterator` pattern once, the following isn't all that hard to understand. Your constraints on `It` are that it tell you what the return type is, that return type ought to be some sort of non-comptime numeric, and it should have a `fn next(self: *It) ?T` method whose return values after the first `null` you're allowed to ignore. If you violate any of those constraints (except, perhaps, the last one -- maybe your iterator chooses to return null and then a few more values) then the code will fail at comptime. If you're afraid of excessive compiler error message lengths, you can use `@compileError()` to create a friendlier message documenting your constraints.
It's a different pattern from what you're describing, but it's absolutely not hard to use correctly.
fn sum(comptime It: type, it: *It) It.T {
    var total: It.T = 0;
    while (it.next()) |item|
        total += item;
    return total;
}
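For contrast, here is a rough sketch of the same function in Rust, where the constraints the Zig version discovers by use are written out in the signature. The bounds chosen here (`Default` standing in for "has a zero", `AddAssign` for `+=`) are one plausible encoding, not the only one.

```rust
use std::ops::AddAssign;

// Every requirement on `I` is listed up front: it must be an iterator, and
// its items must have a default (zero-like) value and support `+=`.
fn sum<I>(it: &mut I) -> I::Item
where
    I: Iterator,
    I::Item: Default + AddAssign,
{
    let mut total: I::Item = Default::default();
    for item in it {
        total += item;
    }
    total
}

fn main() {
    let mut numbers = vec![1u32, 2, 3, 4].into_iter();
    assert_eq!(sum(&mut numbers), 10);
}
```

The trade-off is the one being argued in this thread: the signature is longer, but a type that does not fit is rejected at the call site rather than somewhere inside the body.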
> recursive generics
A decent mental model (most of which follows from "view 4" in TFA, where the runtime code is the residue after the interpreter resolves everything it can at comptime) is treating types as immutable and treating comptime evaluation like an interpreted language.
With that view, `type Example = Something[Example]` can't work because `Example` must be fully defined before you can pass it into `Something`. The laziness you see in ordinary non-generic type instantiations doesn't cross function boundaries. I'm not sure if there's a feature request for that (nothing obvious is standing out), but I'd be a fan @AndyKelley if you're interested.
In terms of that causing problems IRL, it's only been annoying a few times in the last few years for me. The most recent one involved some comptime parser combinators, and there was a recursive message structure I wanted to handle. I worked around it by creating a concrete `FooParser` type with its associated manually implemented `parse` function (which itself was able to mostly call into rather than re-implement other parsers) instead of building up `FooParser` using combinators, so that the normal type instantiation laziness would work without issues.
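For anyone who hasn't met the `type Example = Something[Example]` shape, here is a small sketch of it in Rust (not Zig), with `Vec` playing the role of `Something`. The `Tree` type and `depth` function are made up for the example. It works there because the type checker resolves the self-reference lazily and `Vec` keeps its elements behind a pointer.

```rust
// `Tree` appears as the argument of the generic `Vec` inside its own
// definition, i.e. Example = Something[Example] with Something = Vec.
struct Tree {
    children: Vec<Tree>,
}

// Number of levels in the tree; a lone node has depth 1.
fn depth(t: &Tree) -> usize {
    1 + t.children.iter().map(depth).max().unwrap_or(0)
}

fn main() {
    let leaf = Tree { children: Vec::new() };
    let root = Tree { children: vec![leaf] };
    assert_eq!(depth(&root), 2);
}
```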
> when does type checking run
Type inference is simplistic enough that this is almost a non-issue in Zig, aside from the normal tradeoffs from limited type inference (last I checked, they plan to keep it that way because it's not very important to them, it actively hinders the goal of being able to understand code by looking at a local snapshot, and that sort of complexity and constraint might keep the project from hitting more important goals like incremental compilation and binary editing). They are interleaved though (at least in the observable behavior, if you treat comptime execution as an interpreter).
100%. So tiring that the discourse around this is based on 15 minute demos and not actual understandings of the trade offs. Varun Gandhi's post that you link to is great.
Based on my experience with Rust, a lot of what people want to do with its "constant generics" probably would be easier to do with a feature like comptime. Letting you do math on constant generics while maintaining parametricity is hard to implement, and when all you really want is "a trait for a hash function with an output size of N," probably giving up parametricity for that purpose and generating the trait from N as an earlier codegen step is fine for you, but Rust's macros are too flexible and annoying for doing it that way. But as soon as you replace parametric polymorphism with a naive code generation feature, you're in for a world of hurt.
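A sketch of the "hash function with an output size of N" case using Rust's const generics. The trait `FixedHash` and the toy `XorFold` type are invented for illustration, and the "hash" is deliberately not cryptographic.

```rust
// Hypothetical trait: a hash whose output length N is part of the type.
trait FixedHash<const N: usize> {
    fn digest(data: &[u8]) -> [u8; N];
}

// Toy, non-cryptographic "hash" that XOR-folds the input into N bytes.
struct XorFold;

impl<const N: usize> FixedHash<N> for XorFold {
    // Assumes N > 0; with N == 0 and non-empty input, `i % N` would panic.
    fn digest(data: &[u8]) -> [u8; N] {
        let mut out = [0u8; N];
        for (i, byte) in data.iter().enumerate() {
            out[i % N] ^= *byte;
        }
        out
    }
}

// Arithmetic on N in a type, such as a return type of `[u8; N * 2]`, is the
// part that (as far as I know) still needs an unstable compiler feature.

fn main() {
    let d = <XorFold as FixedHash<4>>::digest(b"abcdefgh");
    assert_eq!(d, [4, 4, 4, 12]);
}
```

Using N as-is stays parametric and works fine. It is the "do math on N" step where people start wishing for something like comptime.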
> Lisps don't have a type system, which makes things a bit more complicated
SBCL, which is a very popular Common Lisp implementation, is indeed strongly typed. Coalton, which is an addon, is even statically typed
medo-bear, you might want to know that you appear to be shadowbanned -- all your comments for the last ~10 days are dead. (I have some plausible guesses at what provoked this, though I have to say that I didn't see anything that looks to me like sufficient justification.)
(I "vouched" for your comment to which this is a reply, and that seems to have been sufficient to un-dead it. Unless the system's set up so that vouching for a shadowbanned comment un-deads it only for the person who does the vouching, I guess...)
Thanks
D had it 17 years ago! D features steadily move into other languages.
> Here the comptime keyword indicates that the block it precedes will run during the compile.
D doesn't use a keyword to trigger it. What triggers it is being a "const expression". Naturally, const expressions must be evaluatable at compile time. For example:
int sum(int a, int b) => a + b;
void test()
{
    int s = sum(3, 4); // runs at run time
    enum e = sum(3, 4); // runs at compile time
}
By avoiding use of non-constant globals, I/O and calling system functions like malloc(), quite a large percentage of functions can be run at compile time without any changes. Even memory can be allocated with it (using D's automatic memory management).
Here's one of my favorite uses for it. I used to write a separate program to generate static tables. With compile time function execution, this was no longer necessary. Here's an example:
__gshared uint[256] tytab = tytab_init;
extern (D) private enum tytab_init =
    () {
        uint[256] tab;
        foreach (i; TXptr) { tab[i] |= TYFLptr; }
        foreach (i; TXptr_nflat) { tab[i] |= TYFLptr; }
        foreach (i; TXreal) { tab[i] |= TYFLreal; }
        /* more lines removed for brevity */
        return tab;
    } ();
The initializer for the array `tytab` is returned by a lambda that computes the array and then returns it.
A link to the full glory of it:
https://github.com/dlang/dmd/blob/master/compiler/src/dmd/ba...
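The same "no separate table generator" trick can be sketched in other languages with compile-time function evaluation. Here is a rough Rust analogue using a `const fn`. The flag values and index ranges are invented for the sketch and have nothing to do with the real dmd table.

```rust
// Hypothetical flag bits, invented for this sketch.
const FLAG_PTR: u32 = 0x1;
const FLAG_REAL: u32 = 0x2;

const fn build_table() -> [u32; 256] {
    let mut tab = [0u32; 256];
    let mut i = 0;
    // `for` loops are not allowed in a `const fn`, so use `while`.
    while i < 256 {
        if i < 16 {
            tab[i] |= FLAG_PTR;
        }
        if i >= 32 && i < 40 {
            tab[i] |= FLAG_REAL;
        }
        i += 1;
    }
    tab
}

// The initializer of a `static` is evaluated by the compiler, so the
// finished array is baked into the binary.
static TABLE: [u32; 256] = build_table();

fn main() {
    assert_eq!(TABLE[0], FLAG_PTR);
    assert_eq!(TABLE[35], FLAG_REAL);
    assert_eq!(TABLE[100], 0);
}
```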
Another common use for CTFE is to use it to create a DSL.
Walter, I'll take any chance I can get to say: thank you for creating D! One thing I was wondering about is the limits of compile time execution.
How does the D compiler ensure correctness if the machine the compiler runs on is different from the machine the program will execute on?
For example, how does the compiler know that "int s = sum(100000, 1000000)" is the same value on every x86 machine?
I'm thinking there could be subtle differences between generations of CPU, how can a compiler guarantee that a computation on the host machine will result in the same value on the target machine in practice, or is it assuming that host and target are sufficiently similar, as long as the architecture matches? (which is fine, I'm wondering as to what approaches exist)
> thank you for creating D!
My pleasure!
> is the same value on every x86 machine?
It's the same value on all machines, because integer types are fixed size (not implementation dependent) and 2's complement arithmetic is mandated.
Floating point results can vary, however, due to different orders in which constants are evaluated. The x87, for example, evaluates to a higher precision and then rounds it only when writing to memory.
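The integer half of that answer isn't specific to D. As an illustration only, here is a Rust sketch in which the same function is evaluated by the compiler and again at run time. Because `i32` has a fixed width and `wrapping_add` is defined as two's complement wraparound, the two results have to agree regardless of host.

```rust
const fn sum(a: i32, b: i32) -> i32 {
    // `wrapping_add` makes the overflow behaviour explicit:
    // two's complement wraparound on a fixed 32-bit width.
    a.wrapping_add(b)
}

// Evaluated by the compiler.
const AT_COMPILE_TIME: i32 = sum(100_000, 1_000_000);
const WRAPPED: i32 = sum(i32::MAX, 1);

fn main() {
    // Evaluated (at least notionally) on the target machine.
    let at_run_time = sum(100_000, 1_000_000);
    assert_eq!(AT_COMPILE_TIME, at_run_time);
    assert_eq!(AT_COMPILE_TIME, 1_100_000);
    assert_eq!(WRAPPED, i32::MIN);
}
```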
I'll second thanking you for making D. I still haven't found a language with more compile time capabilities that I can/would actually use. So I'm still using D.
Any thoughts on adding something like Zig's setFloatMode(strict)? I have a project idea or 2 where for some of the computation I need determinism then performance. But very much still need the performance floating point can provide.
Thanks for the kind words!
Your best bet for floating point determinism is to stick with doubles. Then, in 64 bit code, the double math will be done with the XMM registers and instructions, which will stick with 64 bit arithmetic.
D's ImportC also can do CTFE with C code!
int sum(int a, int b) { return a + b; }
_Static_assert(sum(3, 4) == 7, "look ma, check at compile time!");
Why doesn't the C Standard add this? It works great!
Tbf, Zig allows that too (calling the same function in a runtime and comptime context):
fn square(num: i32) i32 {
    return num * num;
}
pub fn main() void {
    _ = square(2);
    _ = comptime square(3);
}
...and the comptime invocation will produce a compile error if anything isn't comptime-compatible (which IMHO is an important feature, because it rings the alarm bells if code that's expected to run at comptime accidentally moves into runtime because some input args have changed from comptime- to runtime-evaluated).
Zig looks interesting, I just wish it had operator overloading. I don't really buy most of the arguments against operator overloading. A common argument is that with operator overloading you don't know what actually happens under the hood. That doesn't hold up, because you might as well create a function named "add" which does multiplication. Another argument is iostreams in C++ or boost::spirit as examples of operator overloading abuse. But I haven't really seen that happen in other languages that have operator overloading; it seems to be C++ specific.
You don't know the amount of magic that goes on behind the scenes in Python and PHP with the __ functions. I think Zig's approach is refreshing. Being able to follow the code trumps the seconds wasted typing the extra code.
Depends on domain I think. In some cases it can be very beneficial to keep the code close to the source, say math equations, to ensure they've been correctly implemented.
In this case the operators should be unsurprising, so they do what one would expect based on the source domain. Multiplying a vector and a scalar for example should return the scaled vector, but one should most likely not implement multiplication between vectors as that would likely cause confusion.
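A small sketch of that rule of thumb, in Rust since Zig has no operator overloading. `Vec2` is a made-up type. It gets `+` between vectors and `*` by a scalar, and deliberately nothing for vector times vector.

```rust
use std::ops::{Add, Mul};

#[derive(Debug, Clone, Copy, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

// Vector + vector: component-wise, as in the maths.
impl Add for Vec2 {
    type Output = Vec2;
    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

// Vector * scalar: returns the scaled vector.
impl Mul<f64> for Vec2 {
    type Output = Vec2;
    fn mul(self, k: f64) -> Vec2 {
        Vec2 { x: self.x * k, y: self.y * k }
    }
}

// Deliberately no `impl Mul<Vec2> for Vec2`, so `a * b` between two
// vectors does not compile and nobody has to guess dot vs. cross.

fn main() {
    let a = Vec2 { x: 1.0, y: 2.0 };
    let b = Vec2 { x: 0.5, y: 0.5 };
    assert_eq!(a * 3.0 + b, Vec2 { x: 3.5, y: 6.5 });
}
```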
I don't know about PHP, what amount of magic goes in behind Python's dunder methods? You can open it and see
There are many gotchas to Python dunder methods. An example is there is a bunch of functions that can be called when you do something like 's.d' where s is an object. Does it call "getattr" on the object, getattr on the class or get a property, or execute a descriptor? It is very hard to tell unless you're an expert
In my humble opinion, a lot of the dislike of operator overloading is related to unexpected runtime performance.
My ideal solution would be for the language to introduce custom operators that clearly indicate an overload. Something like a prefix/postfix (e.g. `let c = a |+| b`). That way it is clear to the person viewing the code that the |+| operation is actually a function call.
This is still open to abuse but I think it at least removes one of the major concerns.
I feel like the OCaml solution would fit Zig's use case well.
In ocaml you can redefine operators... but only in the context of another module.
So if I re-define + in some module Vec3, I can do:
    Vec3.(a + b + c + d)
Or even:
    let open Vec3 in
    a + b + c + d
So there you go, no "where did the + operator come from?" questions when reading the source, and still much nicer than:
    a.add(b).add(c).add(d)
I doubt Zig will change though. The language is starting to crystallize and anything that solved this challenge would be massive.
Maybe such operators for basic linear algebra (for arrays of numbers) should just be built into the language instead of overloading operations. I'm not sure whether such a proposal already exists.
There is a specialized `@Vector` builtin for SIMD operations like this.
Yeah I never got the aversion to operator overloading either.
"+ can do anything!" As you said, so can plus().
"Hidden function calls?" Have they never programmed a soft float or microcontroller without a div instruction? Function calls for every floating point op.
The problem is not that + calls a function. The problem is that + could call one of many different functions, i.e. it is overloaded. Zig does not allow overloading plus() based on the argument types. When you see plus(), you know there is exactly one function named “plus” in scope and it calls that.
operator + is overloaded even in plain C: it will generate different instructions for pointers, floats, integers, _Complex, _Atomic and the quasi standard __float128. Sometimes it will even generate function calls.
I suspect zig might be similar.
Not if `plus` is a pointer. Then `plus()` is a conditional branch where the condition can be arbitrarily far away in space (dynamically scoped) and time. That's why I think invisible indirection is a mistake. (C used to require `(*plus)()`.)
or 'plus' could be a macro...
Ah ‘fieldNames’, looks very similar to Nim’s ‘fieldPairs’. It’s an incredibly handy construct! It makes doing efficient serialization a breeze. I recently implemented a compile time check for thread safety checks on types using ‘fieldPairs’ in about 20 lines.
This needs to become a standard feature of programming languages IMHO.
It’s actually one of the biggest things I find lacking in Rust which is limited to non-typed macros (last I tried). It’s so limiting not to have it. You just have to hope serde is implemented on the structs in a crate. You can’t even make your own structs with the same fields in Rust programmatically.
At some point there was a discussion about compile time reflection, which I guess could include functionality like that, but I think the topic died along with some kind of drama around it. Quite a bummer, because things like serde would have been so much easier to implement with compile time reflection.
Another example applying compile-time reflection is something like https://github.com/c-blake/cligen { but it helps if your host prog.lang has named parameters like Python's foo(a=1, b=2) }.
With comp-time reflection you can build frameworks like ORMs or web frameworks. The only trade-off is that you have to include such a library in the form of source code.
After having written a somewhat complete C parser library I don't really get the big deal about needing meta programming in the language itself. If I want to generate structs, serialization, properties, instrumentation, etc, I just write a regular C program that processes some source files and outputs source files and run that first in my build script.
How do you people debug and test these meta programs? Mine are just regular C programs that use the exact same debuggers and tools as anything else.
>I don't really get the big deal about needing meta programming in the language itself. If I want to generate structs, serialization, properties, instrumentation, etc, I just write a regular C program that processes some source files and outputs source files and run that first in my build script.
This describes exactly what people don't want to do.
But exactly why?
If you just walked up to me out of the blue and asked "what computer language do you know is the worst for processing strings?", well, technically I might answer "assembler", but if you excluded that, my next answer would be C.
Furthermore, you want some sort of AST representation, at one level of convenience or another (I include this compgen-style "being 'in' the AST" to be included in that, even if it doesn't necessarily directly manipulate AST nodes), and C isn't particularly great at manipulating those, either, in a lot of different ways.
A consequence of C being the definitive language that pretty much every other language has had to react to, one way or another through however many layers of indirection, for the past 40+ years, is that pretty much every language created since then is better than C at these things. C's pretty long in the tooth now, even with the various polishings it has received over the years.
Because after enough hands have touched a codegen script, debugging it becomes impossible.
In jai you use the same language for programming and metaprogramming. The compiler knows how to execute the bytecode it generates. The compiler also has a builtin debugger for the bytecode.
C# (strictly, Roslyn/dotnet) provides this in a pretty nice way: because the compiler is itself written in the language, you can just drop in plugins which have (readonly!) access to the AST and emit C# source.
Debugging .. well, you have to do a bit more work to set up a nice test framework, but you can then run the compiler with your plugin from inside your standard unit test framework, inside the interactive debugger.
Yes, this is the same approach Ryan Fleury and others advocate, and it's perfectly good:
> Arbitrary compile-time execution in C:
> cl /nologo /Zi metaprogram.c && metaprogram.exe
> cl /nologo /Zi program.c
> Compile-time code runs at native speed, can be debugged, and is completely procedural & arbitrary
> You do not need your compiler to execute code for you
The only benefit that some (certainly more rare) compilers can provide is type metadata/compile-time reflection. Otherwise, totally.
MS DOS choice of / for commandline arguments and \ for paths always hurts my eyes
I don't know about Zig, but the power of Lisp is that you're manipulating the s-expressions or, to put it another way, you're manipulating the AST. To do that in C you'd need to write a full C parser for your C program that processes source files.
I used to do that in Python with the numba jit. Write Python code that generates a Python code that then gets compiled.
It's a fragile horrible mess, and the need to do this was a major reason for me to switch away from Python. It's a bit like asking why we don't just pass all arguments to functions as strings. Yeah, people write stringly typed code, but it should rarely be necessary, and your language should provide means to avoid it.
Whether you consider it a big deal or not is up to you, but with zig's approach you don't have to write/maintain a separate parser, nor worry about whether it's complete enough to process your source files.
I don't know a lot about debugging zig comptime, though. I use printf-style debugging and the built-in unit test blocks. That's all I've needed so far. (Perhaps that's all there is.)
Well put. I always have the feeling that any language which has an `eval` function or an invokable compiler can do meta program. That said, I think the "big deal" is in UX/DX. It's really nice to have meta programming support built-in to the language when you need it.
> How do you people debug and test these meta programs?
I couldn't find any other answer than using @compileLog to print-debug [1]. In Lisp, apparently some implementations allow you to trace macros [2]. Couldn't find anything about Nim's macro debugging capabilities.
This whole thing looks like a severe limitation that is not balanced by the benefit of having all code in the same place. Do you know other languages that provide sensible meta-programming facilities?
[1] https://www.reddit.com/r/Zig/comments/jkol30/is_there_a_way_... [2] https://stackoverflow.com/questions/44872280/macros-and-how-...
In lisp, macros are just ordinary functions whose input and output is an AST. So you can debug them as you would any other function, by tracing, print debugging, unit tests or even stepping through them in a debugger.
To debug macros in Nim, you'll likely need to print arguments and expansions at compile-time, inspect the output, change things to see what happens, repeat...
https://nim-lang.org/docs/macros.html#toStrLit%2CNimNode
https://nim-lang.org/docs/macros.html#astGenRepr%2CNimNode
https://nim-lang.org/docs/macros.html#dumpAstGen.m%2Cuntyped
https://nim-lang.org/docs/macros.html#treeRepr%2CNimNode
> I just write a regular C program that processes some source files and outputs source files and run that first in my build script.
Cool, you now invented your own DSL and half-baked meta programming macro language for something that shall have been in the language to begin with.
In addition, any complex interaction between your "own made template engine" and the native code is now a pile of hack. E.g write a generic function: Good luck to interpret any error based on the typing.
Code generation is almost consistently the worst solution to a meta-programming problem.
> Cool, you now invented your own DSL and half-baked meta programming macro language
I'm not sure I'm following your statement. What he said was to use a C program to parse C code and emit additional C code. There is no mention of DSL.
> for something that shall have been in the language to begin with.
The whole point of this discussion is to debate on that.
I have no strong opinion (yet) but the meta program looks easy to understand compared to the Pandora's box of metaprogramming within the language (since it requires standardization, limitations etc.).
> What he said was to use a C program to parse C code and emit additional C code. There is no mention of DSL
Because it is always the same story:
- You start by writing a little meta-compiler to solve one specific codegen problem in a specific portion of your code.
- Then you realize there are many slight variations of this problem in other areas of your project or other projects... because it is exactly what *Genericity* is all about. And we have known that since literally the 1970s and freaking LISP.
- To keep your meta-compiler from becoming a Frankenstein of options with endless hardcoded logic, you make it interpret some annotations in C comments, some preprocessor directives or some template files somewhere.
-> Congratulations: you invented your own half baked DSL.
I have seen that many time, in many places. Again and again. Often because there is a category of C programmers that would prefer to swim in their own shit instead of using few C++ templates.
Codegen is consistently a terrible solution to a well studied problem: meta-programming. The fear of feature creep (templates, macros) shall never be a justification for creating some half baked complexity monster that will always end up worse than the problem it tries to avoid.
If you doubt about that: Just use a lexer or parser generator. Or better, the quintessence of codegen: Autotools. They are a perfect illustration of how terrible and how fucking unmaintainable Codegen is.
I can think of a single use case where a meta-compiler and a DSL are appropriate solutions: *Serialization* (Protobuf, Thrift, CapNProto, ...). Because in this precise case, you actually do want a language neutral way to express your interface: you want an IDL.
Currently here, Zig does the right thing: Comptime execution for meta-programming is one order of magnitude better than anything available in C or C++ before C++20
I think the parsed c program is the DSL. You have to write a parser and compiler for the c-like source to the actual c code.
The notion of "something that shall have been in the language" is not properly defined, as you can basically start from that and arrive at C++. So it is perfectly fine to assume that compile-time programming does not belong into the language and write your own processor.
As for DSL, any DSL is a separate language that needs to be learned, so it is not very different from creating your own processor.
Because it goes hand-in-hand with your code self-describing with static reflection.
Another interesting pattern is the ability to generate structs at compile time.
I've run experiments where a neural net is implemented by creating a JSON file from PyTorch, reading it in using @embedFile, and then generating a struct with a specific "run" method.
This in theory allows the compiler to optimize the neural network directly (I haven't proven a great benefit from this though). Also the whole network lived on the stack, which means not having any dynamic allocation (not sure if this is good?).
I've done this sort of thing by writing a code generator in python instead of using comptime. I'm not confident that comptime zig is particularly fast, and I don't want to run the json parser that generates the struct all the time.
Another thing I tried as an alternative is using ZON (zig object notation) instead of json. This can natively be included directly as a source file. It involved writing a custom python exporter though (read: I gave up).
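A rough sketch of what such an exporter might look like, for anyone curious. The helper names (to_zon, zon_field, zon_string) are hypothetical and this is not an official ZON library; it assumes strings contain no unusual control characters, floats are finite, and field names are not Zig keywords.

```python
import re

# Plain identifiers can be written as .name; anything else needs .@"name".
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def zon_string(s):
    """Quote a string, escaping only the common cases (assumption: nothing exotic)."""
    escaped = (
        s.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    return '"' + escaped + '"'


def zon_field(name):
    """Render a struct field name (Zig keywords are not handled in this sketch)."""
    return "." + name if _IDENT.match(name) else ".@" + zon_string(name)


def to_zon(value):
    """Serialize None, bool, int, float, str, list/tuple and dict as ZON text."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must come before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return zon_string(value)
    if isinstance(value, (list, tuple)):
        if not value:
            return ".{}"
        return ".{ " + ", ".join(to_zon(v) for v in value) + " }"
    if isinstance(value, dict):
        if not value:
            return ".{}"
        fields = ", ".join(f"{zon_field(k)} = {to_zon(v)}" for k, v in value.items())
        return ".{ " + fields + " }"
    raise TypeError(f"cannot export {type(value).__name__} to ZON")


if __name__ == "__main__":
    print(to_zon({"name": "tiny", "layers": [{"weights": [0.5, -1.0], "bias": 0.25}]}))
```

How the Zig side consumes the result depends on the Zig version in use, so treat the output format as a starting point rather than a guarantee.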
FWIW the goal for comptime Zig execution is to be at least as fast as Python. I can’t find it now but I remember Andrew saying this in one of his talks at some point.
I believe that Zig build system can cache comptime processes, so if the JSON didn't change it doesn't run again.
I think if you integrated with the build system, yes, Zig can do things only when the file changed. But I'm not sure that Zig figured out incremental comptime yet. That's way harder to accomplish.
How does this affect the compile times?
They become quite long, but it was surprisingly tolerable. I only recall it vaguely, but a 100MB neural network was on the order of minutes with all optimizations turned on. I guess it would be fair to say it scaled more or less linearly with the file size (from what I saw). Moreover, I work in essentially a tinyml field, so my neural networks are on the order of 1 to 2 MB for the most part. For me it would've been reasonable!
I guess in theory you could compile once into a static library and just link that into a main program. Also there will be incremental compilation in zig I believe, maybe that helps? Not sure on the details there.
It's nothing like C++ templates.
While interesting, this is one of the cases, where I agree with "D did it first" kind of comments.
sure, and hygienically, it's not a preprocessor thing.
If you're surprised by Zig's comptime, you should definitely take a look at Nim which also has compile-time code evaluation, plus a full AST macro system.
Nim is a fun language but I wouldn't consider it for "serious" work. It has the same issues as other niche languages (ie. ecosystem), plus: a polarising maintainer (most core contributors don't seem to last long) and primarily funded by a crypto company (if you care about that). Then again, 10 years ago none of that would have bothered me.
All these organizations[1] using nim in production must disagree with you then.
[1]: https://github.com/nim-lang/Nim/wiki/Organizations-using-Nim
Zig has the feature of not having exceptions. I see that Nim is trying to move away from them, but exceptions color functions, which means that you have to account for them even if you don't use functions that throw them[1]. Life is too short to deal with invisible control flow.
Whether you want to handle every error is context dependent. StatusIM has long-running servers & clients as primary products and so tilts away from exceptions, but for a CLI utility you might want the convenience of a stack trace instead. I've seen this many times in Python CL apps, for example.
Alternatively, there is also a Nim effects tracking system that lets the compiler help you track the hidden control flow for you. So, at the top of a module, you can say {.push raises: [].} to make sure that you handled all exceptions somewhere. So, it may not be as "Wild West" as other exceptions systems that you are used to.
As with so many aspects, Nim is Choice. For many choice is good. For others they want language designers to constrain choice a lot (Go is probably a big recent example, because fresh out of school kids need to be kept away from sharper tools or similar rationales). A lot of these prog.lang. battles mirror bigger societal debates/divides between centralized controls and more laissez-faire arrangements. Nim is more in the Voltaire/Spiderman's Uncle Ben "With great power comes great responsibility" camp, but how much power you use is usually "up to you" (well, and things you choose to depend upon).
> {.push raises: [].}
Will this transitively enforce exception handling? i.e. if a 3rd-party dependency that I am using calls into another dependency that raises exceptions, but doesn't handle them in any way (including not using that pragma), will Nim assert that? Otherwise, that's precisely the function coloring problem I mentioned: if you can't statically assert that a callee, or its descendant callees, don't throw an exception then you have to assume that they will.
> Will this transitively enforce exception handling?
Yes. The module you put the pragma in is about the "roots" of the call graph, but it covers the whole call stack from those roots down wherever the code is defined. Sorry I didn't make that clear enough { but in my defense I was trying to address the problem you raised. :-) }
I'll be taking another look at Nim :)
Zig is overall pretty good as a language and it does what it needs to: staying in the lane of the purpose is very important. It is why I do not particularly care for some languages being used just because.
I hope we can have something that combines the meta-programming capabilities of Zig with the vast ecosystem, community and safety of Rust.
Looking at the language design, I really prefer Zig to Rust, but as an incompetent, amateur programmer, I couldn't write anything in Zig that's actually useful (or reliable), at least for now.
I agree. I briefly tried Zig and quickly gave up because, as someone used to Rust, the compiler wasn't helping me find those issues at compile time. I know that Zig doesn't make those promises, but for me, it's a deal breaker, so I suppose Zig isn't the language for me.
On the other hand, I do like the concept of comptime vs Rust macros.
Please keep the Rust community away from Zig. (I joke. Mostly...)
Is anyone here using Zig for audio plugin development? It seems like a good candidate as an alternative to C++ but lacks the ecosystem (like JUCE). Are there any ongoing efforts to bring DSP/audio plugin development to Zig?
IIRC Andrew Kelley's original goal for developing Zig was to build a DAW.
I'm using D for audio plugins and we do use CTFE extensively (named comptime in Zig). Zig might be a slightly better fit because of the easier C and C++ interop and targeting, but I'm not sure about the COM and OOP story.
Mojo's compiletime metaprogramming [1] is inspired by Zig's. Though Mojo takes things further by implementing a fully-featured generic programming system.
What can you do in Mojo that you can't do in Zig?
Pay a company for the privilege of being allowed to develop with it. It has a commercial license, where Zig is MIT’d.
I haven’t written a line of either. I could see using Zig, but there’s no plausible scenario where I’d ever write Mojo. Weird proprietary languages tend to be a career pigeonhole: “you’ve been doing what for the last 5 years?”
Weird proprietary languages _can_ also be much better for a particular task than anything else and can thus be smart business. Someone who will dismiss something they don't know on the grounds that it is weird and proprietary is not someone I'd want to work with. But of course if this is how a lot of people think then there may be no choice but for most people to try and stick with the tried and true.
Nobody is paying anybody to use Mojo, its main issue is cross-platform support, specifically lack of native Windows support.
Like I always say, most languages start off closed, incubated for some years by a tiny group, before being opened. Mojo is no different, in fact, Modular have given a pretty solid timeline about when they plan to open source the compiler - https://youtu.be/XYzp5rzlXqM?si=nmvghH3KWX6SrDzz&t=1025
Every language I've used in the last few years has been FOSS from the very beginning time that it was shared outside its original developers. A proprietary language is the odd exception, not the common case.
Looks like I was wrong about having to pay to use Mojo itself. It's their "MAX" product you have to pay for, at least today. The language is currently free of charge, although proprietary.
MAX is also free to use even for commercial purposes (as long as you're not a competitor)
GPU kernels.
The article went off the rails at partial evaluation as it doesn’t even show an example of partial evaluation. And then the section on generating code really went nowhere useful.
As a disclaimer, the last time I gave Zig a solid shot was when 0.12 released. The last time I played with comptime properly was in 0.11.
There's a heap of praise thrown at zig comptime. I can certainly see why. From a programming language perspective it's an elegant and very powerful solution. It's a shame that Rust doesn't have a similar system in place. It works wonderfully if you need to precompute something or do some light reflection work.
But, from an actual user perspective it's not very fun or easy to use as soon as you try something harder. The biggest issue I see is that there's no static trait/interface/concept in the language. Any comptime type you receive as a parameter is essentially the `any` type from TypeScript or `void*` from C/C++. If you want to do something specific with it, like call a specific method on it, you have to make sure to check that the type has it. You can of course ignore it and try to call it without checking it, but you're not going to like the errors. Of course, since there are no interfaces you have to do that manually. This is done by reading the Zig stdlib source code to figure out the type enum/structures and then pattern-matching like 6 levels deep. For every field, every method, every parameter of a method. This sucks hard. Of course, once you do check for the type you still won't get any intellisense or any help at all from your IDE/editor.
Now, there are generally two solutions to this:
One would be to add static interfaces/concepts to the language. At the time this was shot down as "unnecessary". Maybe, but it does make this feature extremely difficult to use for anyone but the absolutely most experienced programmers. Honestly, it feels very similar to how Rust proc macros are impenetrable for most people.
The second one is to take a hint from TypeScript and take their relatively complex type system and type assertions. Eg. `(a: unknown): a is number => typeof a === 'number'`. This one also seems like a bust as it seems to go against the "minimal language" mantra. Also, I don't get the feeling that the language dev team particularly cares about IDEs/LSPs as the Zig LSP server was quite bad the last time I tried it.
Now, the third solution and the one the people behind the Zig LSP server went with is to just execute your comptime functions to get the required type information. Of course, this can't really make the experience of writing comptime any easier, just makes it so that your IDE knows what the result of a comptime invocation was.
So in short it is as difficult to use as it is cool. Really, most of the language is like this. The C interop isn't that great and is severely overhyped. The docs suck. The stdlib docs are even worse. I guess I'm mostly disappointed since I was hoping Zig could be used where unsafe Rust sucks, but I walked away unsatisfied.
Regarding the 'missing interface feature', it's quite trivial to write a generic comptime function which checks whether an input value has the expected shape by checking it against a struct type which describes the interface (basically a comptime typeguard). Such a helper function could probably go into std.meta and would provide better error messages than the followup compile errors. It won't necessarily help with the LSP problem though.
Whether that's a good or bad idea remains to be seen (usually I prefer syntax sugar over such a building-block system, which reminds me too much of the C++ stdlib approach), but a surprising amount of dedicated typesystem features in other languages can be done with comptime coding in Zig.
Yeah, that should be possible and is something I considered when I was playing around with comp time. But, having a language-provided or even recommended way makes things much easier. Also, I wouldn't call it trivial. I think you need to deal with recursively comparing types and also need to handle Self and other quirks.
It kind of reminds me of the other side of static/dynamic polymorphism in that much of the language lives only on undocumented conventions.
At some point, the folks writing Zig decided that they really needed some runtime polymorphism for their stdlib implementation. Of course, this would be implemented in the language (using dyn traits in Rust, or OOP in C++) in every other language. Ok, but this is Zig, so maybe it's implemented as a std.meta helper in the stdlib? Of course not. Instead you have to figure out what the stdlib does, which is of course to implement it manually for every type that needs it. Making it worse is that at some point the way to do this changed, and so at least at the time there were actually two different ways to do dynamic dispatch in the stdlib and other people's code/tutorials/whatever.
What a mess of a language. So many good ideas hindered by baffling decisions. Writing Zig is like grinding teeth.
Tiny bug report: The second code example's output is still ">>array's<< sum is 6" (emphasis mine) even though the code snippet's printout is "struct's sum is {d}"
Fixed, thanks for the report.
A spelling error: "unnessisary" → "unnecessary".
Is there such a thing as a language that is always comptime by default? I.e. the main source code of the language runs at compile time, but emits as output another object in some data structure which then becomes the runnable program?
I believe Jonathan Blow’s Jai is trying to do a lot of this.
I see a lot of people in the comments basically saying "well X did it first" and that it's not worth talking about. This misses the point for me. Zig is an interesting one personally, and not because of the semantics or the std lib or anything really; it's just something nice to play around with so far. I think with the above attitude we probably could have stopped systems programming at C++, and that wouldn't be too fun at all. What would we all do without Java to laugh at?
Just admit that discussions about proglangs are just so delicate. They turn people mad. End of the story.
> In the beginning, computers were invented. This has made lots of people very angry and was generally considered a bad move.
With reflection and code generation coming to C++26, and the already existing constexpr/consteval machinery, C++ will be able to do all of this.
Reminds me of "Der Untergang". With the C++26 offensive, everything will be all right!
Interesting, I had the exact same image in my mind :D
LOL
Mein Führer, this feature... this feature will not be implemented by MVCC.
C++ could do generic programming long before Zig even was an idea, but writing generic code in Zig is still much more straightforward - also Zig is usable now, while C++26 features will probably land in real world C++ compilers around 2036 ;)
Bash scripts and (optionally) an assembler can do this. That's not really the point.
That's something different, code generation. The point is that C++26 will have compile-time reflection without any need for an external code generation tool, similar to Zig.
Ergonomics matters, though. In C++, this feature is bolted onto the language decades after it was originally designed. Zig is designed around this from the get go (which is why it doesn't have e.g. templates as a distinct feature).
With the main difference being that if you start Zig today, you’ll get to use that a good decade before it’s reliably available in C++.
Hmm. Actually, now that makes me want to learn Zig.
Consteval is usable today for a lot of scenarios that would require code generation tooling in years past.
How does consteval interact with thread_local? Defined (although I'm not sure what definition could make sense), error, or the dreaded "No diagnostic required"?
In the end, generics, comptime and code generation are just different steps on the 'stamping out specialized code and data' ladder though (e.g. all three are useful).
> Bash scripts and (optionally) an assembler
What do you mean by that ?
In much the same way as "being Turing complete means you can do anything", it just being possible to do compile-time execution isn't the reason Zig is exciting here.
C++ adding the same features means it'll be possible, but it has a much larger intersection of alternative features to introduce edge cases with, and _many_ implementations that will have this feature with varying quirks: all of this means C++ will have to do a _much_ better job than Zig does here to achieve nearly the same result.
Thanks for that. I loathe replies like "but lang/framework/... can do/will be able to do something similar/does something else which I like better/...". Well, it's not about that. It's about how easy it is to use, how good it is at preventing you from shooting yourself in the foot, sometimes how performant it is, ...
It's suggesting that Zig is convenient and that C/C++ won't be.
I think both will be nice.
Well, C++ has a history of not really being convenient and of making it easy to shoot yourself in the foot.
Good. Keep on adding stuff to an already bloated language.
All programming languages either die, or become bloated, as any software product.
I bet a Fortran 77 developer will think the same of Fortran 2023, a COBOL 60 developer of COBOL 2023, a K&R C developer of C23, a 1975 Scheme developer of R7RS, a Python 1.0 developer of Python 3.13,... even Go 1.0 developers of 1.24 with generics, generators,...
One trick that can help you here is Rust's Editions (and the proposed but never implemented C++ Epochs).
This lets the language throw away bad ideas, without throwing away the code people wrote in the era when we didn't realise that's a bad idea.
I am not sure I necessarily agree with that. It matters what means of extending the language itself the language provides and thereby enables the users of the language to add to it, in form of libraries, that don't have to be part of the standard distribution of the language, but can still be reached for by anyone who wants to select them.
Languages with good macro systems have the upper hand in that regard.
Compile-time reflection will remove countless lines of bloat from real C++ code bases by eliminating the need to manually write formatting, hashing and serialisation implementations for classes.
…10 years from now
If not more; the words "modern" and "C++11" are still used in conjunction despite the fact that 2014 was a long time ago.
comptime simplifies and unifies much of that machinery into a clean, conceptual framework.
The examples in this post seem very similar to how you do metaprogramming in the D language. That has existed for years, yet you rarely hear about it.
With a much better ecosystem, 40 years of IDE tooling, frameworks, OS support.
Which is what people always forget when comparing language grammars.
I'm not sure if you actually want to call "copy these header files into your local file system" an ecosystem at all. The last 40 years have brought heaps of improvements in software development ergonomics. Zig is growing just fine, as other languages like Rust or Go.
That is what script kiddies do when using compiled languages.
Rust had an almost usable implementation of affine types, plus being the second coming of Ada, to win the hearts of the industry, including all major OS vendors and hyperscalers.
Go got lucky with Docker and Kubernetes rewrites, and their adoption across the industry.
So far Zig is basically Modula-2 with C like syntax, and compile time execution, relies on the same tooling that C and C++ have had for decades for use-after-free, doesn't support a binary libraries ecosystem by design, and really Bun isn't going to be the killer project that triggers a Rewrite in Zig movement.
It remains to be seen if Zig 1.0 happens, and what its adoption story at scale will look like.
> Bun isn't going to be the killer project that triggers a Rewrite in Zig movement.
It might be enough to make the zig ecosystem viable. This along with tiger beetle (they have raised tens of millions).
I think a lot of time is spent right now on the tooling. I hope that in the near future the Zig team will be able to switch to the event loop / standard library topics, which really need love.
I agree that zig is taking too long to be finalised. And rust has made certain questionable life choices.
But people really, REALLY want to get off c and c++ for all the numerous reasons everybody knows.
They might want, yet while Khronos keeps publishing standards using C and C++, GCC/LLVM/CUDA/.NET/Java/V8/Metal make use of C++, Nintendo/PlayStation/XBox rely on C++, ... they are going to stay around no matter what.
> I agree that zig is taking too long to be finalised. And rust has made certain questionable life choices.
Any language that's older than 10 years is going to make questionable life choices. It's very easy to be Captain Hindsight, and ask why didn't you do X, 5 years ago? But adding feature X also makes another feature or property impossible, either via opportunity cost or features/properties being at odds.
That said, what do you mean by questionable life choices?
> Any language that's older than 10 years..
Yes, true. Every language has to make some early foundational choices, as few as possible, and try to carefully think about any new addition to the core because of the extra cognitive load that comes with it.
Go is an extreme example here, leaning towards the conservative side. C as well. Zig. Not a fan of Java but it also is kinda slow to add things. Python used to be very careful as well but that epoch is gone.
C++ is the opposite example. It tries to add as much as possible, and that was always the case. C compatibility! And classes! Templates! RAII! Metaprogramming! More of everything! Until it reached a point where it's unforgivingly hard to add things. Or even learn it properly.
Now, Rust feels like a C++ reimplementation, complete with a culture of adding as much as possible as quickly as possible, and ignoring the resulting cognitive load.
I mean, it's a choice. Rust definitely has some great, even amazing, ideas to it. But I am afraid of thinking what the language will feel like in 10 years.
> Now, Rust feels like a C++ reimplementation, complete with a culture of adding as much as possible as quickly as possible, and ignoring the resulting cognitive load.
Ok, but I did ask for what specifically do you mean by questionable life choices? I feel Java is moving at a fast pace (and adding everything and the kitchen sink). Hence, why I wanted specific examples. Can you separate your feelings from facts, and see from where the feelings come from? I'm not saying you're wrong, I'm saying I want to understand your basis for that.
> Go is an extreme example here, leaning towards the conservative side.
Is it? Didn't it also start adding features that it swore not to add (generics)?
I honestly don't want to go the route of arguing over language trivia. Let's say between C++ and C I always pick C. And my first serious language was C++ of the "modern cpp" flavour.
I mean, there are things I like: bits taken from the ML language family, tooling, sane approach to OOP, error handling.
But the culture of trying to pull everything in... It is a road to hell.
The difference between Rust and Go here is that Go took 10 years to come up with a generics proposal, and it does solve a massive problem with the language.
There has been an awful amount of stagnation in those 40 years though. The Visual Studio debugger is still the best overall debugger but has hardly advanced since the late 90s. While the Debug Adapter Protocol and Language Server Protocol invented by the VSCode team are not perfect, they both make debugger and 'Intellisense' support in IDEs for new languages quite trivial and easily get into the 80% 'good enough' area of what a 'proper' IDE offers.
It has advanced quite a lot, but people don't care to learn about debuggers the way they care about low-level details of their programming languages.
Also it is quite telling that outside IDE-friendly languages, debuggers are kind of stuck in the 80's, so no wonder that many think 80% of Visual Studio and friends is good enough.
Comptime to replace macros is indeed good, comptime to replace generics on the other hand isn't and that really makes me think of the “when all you have is a hammer” quote.
It's a tradeoff. Advanced generic programming as implemented in many other languages requires you to learn a completely new language. That new language is better suited for some use cases. Functions that take types and return types, on the other hand, can be more intuitive in other cases.
The “new language” is dramatically simpler than most programming languages though, because its expressive power is limited and, as you say, you only need it for “advanced generics”, which is only a small fraction of all the generic code one writes.
It's actually a DSL, tuned to the specific use-case it's doing.
Why?
The problem is that the alternatives to comptime for generics generally seem to have a hideous effect on compile times (see: C++ and Rust).
Is there a language that does generics in such a way that doesn't send compile times to the moon?
Shouldn't comptime have the same compile time implications as templates? In both cases you're essentially recompiling the code for every distinct set of comptime args/template parameters.
Zig is lazy, and C++ is eager. I can define an infinite set of mutually recursive types in Zig, and only the ones I actually use will be instantiated (not an everyday need, but occasionally interesting -- I had fun building an autodiff package that way with no virtual function overhead, and the set of type descriptors being closed under VJP meant that you could support arbitrary (still only finite) derivative-like tensors, not just first and second order).
Zig doesn't instantiate anything that doesn't get called. So, it doesn't have to generate a whole bunch of templated functions and then optimize down to the ones that actually get used.
The upside is that if you only call a generic function with a u32, you don't instantiate an f32 as well. The downside is that when you do decide to call that function with an f32, all the comptime stuff suddenly gets compiled for the f32 and might have an error.
In practice, I feel that I gain way more from the fast compile than I lose from having a path that accidentally never got compiled as my unit tests almost always force those paths to be compiled at least once.
> it doesn't have to generate a whole bunch of templated functions and then optimize down to the ones that actually get used.
It's been a long time since I've dealt with templated C++, but I thought this was how C++ does it too.
C++ will only generate functions for template parameters that are actually used, because it compiles a version of the templated function for each unique set of template parameters.
C++ is at the very least less lazy than Zig. As an example, if you write some constexpr expression that evaluates a ternary and instantiates a function differently in the two prongs, both will be instantiated, even the one that doesn't end up in the final program. Yes, there are workarounds, but I didn't end up using them. I just moved the offending assert from compile time to runtime because this particular code was not that important.
But the question is if that's actually decisive in the slow compilation problem. The solution in an eager language for evaluating too much stuff is basically "more if statements". Same thing in C++ metaprogramming, use more "if constexpr". If that's all it took to fix C++ compile times, it would have been done a decade ago. The actual problem is all the stuff that you do actually use that has to get repeatedly inlined and optimized away for zero cost abstraction to work.
No, I don't think it is a big deal for compilation speed.
C++ monomorphises generics on demand too. That's why it can have errors specific to specialization and why template error messages spam long causal chains.
C++ compile times are due to headers, which in the case of templates result in a lot of redundant work that is then deduplicated by the linker.
As proven by C++ with modules and binary libraries, compile times can be better in C++.
Rust suffers because they compile everything from source, and the frontend sends piles of unprocessed LLVM IR to the traditional slow backend.
This can be improved with better tooling, one example is the Cranelift backend, there could be an interpreter, and so on.
Examples of languages that don't send compile times to the moon with similar polymorphic power: Standard ML, OCaml, Haskell, D, Ada.
AFAIK Part of the problem with Rust is also that it compiles crates individually before linking them and because of that cannot use the upfront knowledge of what's going to be needed, and as such a generic function that crosses the crate boundary is going to be handled twice by the compiler.
This was initially done so that Rust could compile things in parallel between crates by spawning more rustc processes, which is obviously much easier than building a parallel compiler directly, but in the end it's suboptimal for performance.
comptime for generics is a superset of the things that C++ and Rust do for generics
Ocaml
OCaml doesn't monomorphize functions. Instead references to every type are the same size (either a tagged int or a pointer). This is a sweet spot for OCaml, but doesn't really work for a language that doesn't allocate everything on the heap.
Indeed. OCaml is GC'd, and that makes the implementation different. However, the question was about compile times, and I dare say OCaml is one of the fastest ones out there, even though it has a rich and expressive type system. The conclusion then needs to be that type system expressiveness (complexity?) alone does not make for slow compile times.
Rust really needs comptime, I love cargo and the ecosystem but trait level programming is weird, macros are weird, why can’t you just be normal? (rust screams)
Yeah it's good, and tech evangelism is just marketing.
Zig is alright, but Odin is amazing.
Why?
This website is not mobile ready.