Quirks of Common Lisp Types (fosskers.ca)
Submitted by todsacerdoti 3 days ago
  • alhazraed 5 hours ago

    While the post references a link to the 'Type Hierarchy of Strings'[0], here is the full type tree[1] for Common Lisp.

    [0]: https://lispcookbook.github.io/cl-cookbook/strings.html#stri...

    [1]: https://sellout.github.io/2012/03/03/common-lisp-type-hierar...

    • perihelions 4 hours ago

      It's also easy to draw type graphs like these dynamically: DO-EXTERNAL-SYMBOLS + FIND-CLASS gives you the list of types, and CLASS-DIRECT-SUPERCLASSES, from the MOP, gives you the edge relations between them. In my personal SLIME setup I render graphs like this through graphviz and into an Emacs buffer.
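
      A minimal sketch of that approach, assuming SBCL (where the MOP function lives in the SB-MOP package; the closer-mop library exports the same name portably). The helper names CL-CLASS-EDGES and PRINT-DOT are made up for illustration, and only types that are also classes show up this way.

      ```lisp
      ;; Assumption: SBCL. Elsewhere, swap SB-MOP for CLOSER-MOP.
      (defun cl-class-edges ()
        "Return (CLASS-NAME . SUPERCLASS-NAME) pairs for every external
      symbol of the COMMON-LISP package that names a class."
        (let ((edges '()))
          (do-external-symbols (sym :common-lisp)
            ;; The second argument NIL makes FIND-CLASS return NIL
            ;; instead of signaling an error when SYM names no class.
            (let ((class (find-class sym nil)))
              (when class
                (dolist (super (sb-mop:class-direct-superclasses class))
                  (push (cons sym (class-name super)) edges)))))
          edges))

      (defun print-dot (edges &optional (stream *standard-output*))
        "Write EDGES as a Graphviz digraph, one arrow per subclass/superclass pair."
        (format stream "digraph types {~%")
        (loop for (sub . super) in edges
              do (format stream "  \"~A\" -> \"~A\";~%" sub super))
        (format stream "}~%"))
      ```

      Calling (print-dot (cl-class-edges)) writes text that can be piped to graphviz's dot.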

      • selimthegrim 4 hours ago

        The type tree is screaming for the Apple Pascal poster treatment

      • kazinator 3 days ago

        > Note also that this isn't a linear hierarchy: in the String example above, simple-array and string are unrelated.

        Are you sure? If that's the case, it would have to be that the object is not a suitable argument to a method parameter of type string.

        Don't guess about whether they're considered unrelated in that implementation; use subtypep and typep.

        If (typep "X" 'string) yields true, then the type of "X", whatever it is, is related to string.

        • kazinator 3 days ago

          From article:

            (list (typep "漣" 'simple-array)
                  (typep "漣" 'string)
                  (typep "漣" 'vector)
                  (typep "漣" 'array)
                  (typep "漣" t))
          
            -> (T T T T T)
          
          There you go. The type of "漣" is a subtype of string. Since we know that the type is (SIMPLE-ARRAY CHARACTER (1)), it means that this type is a subtype of string. They are therefore related by the supertype-subtype relationship.
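
          Asking the implementation directly with subtypep might look like the sketch below. Both readings hold at once: neither SIMPLE-ARRAY nor STRING is a subtype of the other, yet the concrete type reported for the literal string is a subtype of both.

          ```lisp
          ;; Neither of the two general types contains the other.
          (subtypep 'simple-array 'string)   ; first value is NIL
          (subtypep 'string 'simple-array)   ; first value is NIL

          ;; The concrete type reported for "漣" sits under both of them.
          (subtypep '(simple-array character (1)) 'string)        ; first value is T
          (subtypep '(simple-array character (1)) 'simple-array)  ; first value is T
          ```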
      • massysett 5 hours ago

        Since the title includes the word "quirks," the observation I'll make here is entirely appropriate: the post oversimplifies this.

        It says: "So as we can see, at both run-time and compile-time, Common Lisp does typechecking to prevent silly errors."

        This is not entirely accurate, as the Common Lisp standard does not require such typechecking.

        The Common Lisp standard specifies the types of the arguments that a function expects. It generally does not specify what happens if the function receives arguments of unexpected types. It explicitly does not specify this: the standard says that if you give a function unexpected types, the consequences are undefined:

        https://www.lispworks.com/documentation/HyperSpec/Body/01_dd...

        where "undefined" means that any behavior, from an error message to a harmless failure to a catastrophic failure, can occur:

        https://www.lispworks.com/documentation/HyperSpec/Body/01_db...

        Let's take + (the addition function) as an example. It "might signal" a type error if some argument is not a number:

        https://www.lispworks.com/documentation/HyperSpec/Body/f_pl....

        where "might signal" means that the result is unpredictable but if the function does signal an error, it will be of the given type:

        https://www.lispworks.com/documentation/HyperSpec/Body/01_db...

        I can understand why the standard is written this way. Checking the types of arguments takes time. Sometimes you might not want to check the types for performance reasons.

        It seems the author is using SBCL. As a practical matter, SBCL on the default settings will check the types of arguments. SBCL's manual (along with the manual of its predecessor, CMUCL) discusses how to manipulate these settings, which the author touches on with the DECLARE form. But Common Lisp does not require this checking, so if you need your types checked, consult your implementation's manual. Indeed, SBCL allows you to change the settings for SPEED and SAFETY to specify how much type-checking you want.
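
        As a rough sketch of those settings, assuming SBCL's treatment of declarations as assertions (the function names CHECKED-ADD and TRUSTED-ADD are hypothetical):

        ```lisp
        ;; With SAFETY raised, SBCL treats the FIXNUM declarations as
        ;; assertions and checks them when the function is called.
        (defun checked-add (a b)
          (declare (type fixnum a b)
                   (optimize (safety 3)))
          (+ a b))

        ;; With SAFETY at 0, the same declarations become promises that the
        ;; compiler may trust without checking. Passing a non-fixnum here is
        ;; undefined behavior, so only ever call it with fixnums.
        (defun trusted-add (a b)
          (declare (type fixnum a b)
                   (optimize (speed 3) (safety 0)))
          (+ a b))
        ```

        Other implementations are free to interpret the same declarations differently.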

        All these caveats are also true for structures. The author says structure types are checked - again, as a practical matter, SBCL will check these unless you tell it not to. But that's not what the standard requires. Indeed, the standard explicitly states that "It is implementation-dependent whether the type is checked when initializing a slot or when assigning to it."

        https://www.lispworks.com/documentation/HyperSpec/Body/m_def...
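
        A small probe of what a given implementation does with slot types, using the hypothetical names POINT and TRY-MAKE-POINT. Since the check is implementation-dependent, the result for a bad argument is deliberately left unstated:

        ```lisp
        (defstruct point
          (x 0 :type fixnum)
          (y 0 :type fixnum))

        (defun try-make-point (x)
          "Return a POINT, or :REJECTED if the implementation signals a TYPE-ERROR."
          (handler-case (make-point :x x)
            (type-error () :rejected)))

        (try-make-point 1)      ; a POINT on any implementation
        (try-make-point "one")  ; :REJECTED only where slot types are checked at initialization
        ```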

        [edit] Relevant part of SBCL manual is here:

        https://www.sbcl.org/manual/#Declarations-as-Assertions

        • colingw 4 hours ago

          Thank you for this. There is occasionally disagreement about what "Common Lisp" even means, and the spec is often cited, but as far as all of my posts, library work, and application work are concerned, Common Lisp means "the current reality of the major compilers as implemented in 2025". This is a descriptive / bottom-up definition, and as an active author of software it is the one I'm more concerned with. For instance, `:local-nicknames` have been essentially universally implemented among the compilers, despite not being part of the spec. To me, this makes that feature "part of Common Lisp", especially since basically all CL software written today assumes its availability.

          You're right to point out too that the post is somewhat SBCL-centric - this too reflects a descriptive reality that most new CL software is written with SBCL in mind first. Despite that I'd always encourage library authors to write as compatible code as possible, since it's really not that hard, and other compilers absolutely have value (I use several).

          • pklausler 40 minutes ago

            Every programming language has a practical definition: it is the intersection of the sets of features that are accepted by the various relevant production compilers and interpreted identically enough to be portable to all of them.

            Formal language definitions, standards, and books are great, but you can't compile with them. Abstract language specs that don't have reference implementations or conformance test suites are not particularly useful to either implementors or users.

        • pfdietz 3 hours ago

          The type hierarchy in the standard has an amusing quirk.

          Arrays can have an element type specified, and this type is "upgraded" to a type that is actually stored in the array. This upgraded type is a supertype of the actual element type. Upgrading must also preserve subtypes: if T1 is a subtype of T2, then upgrade(T1) must be a subtype of upgrade(T2) (not necessarily a proper subtype, even if T1 != T2).

          Now, the standard requires that the upgrade of BIT is BIT (that is, (INTEGER 0 1)), and the upgrade of CHARACTER is CHARACTER. So, what is the upgrade of NIL, the empty type? It must be a subtype of both BIT and CHARACTER, but the only type with that property is NIL itself.

          So, the standard requires there must be a specialized array type with element type NIL. That is, it cannot store any values at all.
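
          The upgrading can be inspected with the standard function UPGRADED-ARRAY-ELEMENT-TYPE. A short sketch; the exact specifier printed may differ between implementations as long as it denotes the same type:

          ```lisp
          (upgraded-array-element-type 'bit)        ; a specifier equivalent to BIT
          (upgraded-array-element-type 'character)  ; a specifier equivalent to CHARACTER

          ;; By the argument above this should denote the empty type,
          ;; but it is worth checking on the implementation at hand.
          (upgraded-array-element-type nil)
          ```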

          • lapsed_lisper 11 minutes ago

            And then, since the type NIL is a subtype of all types, it's a subtype of CHARACTER. So because the type STRING is the union of all array types whose element type is a subtype of CHARACTER, an array that can't store any values is also a string. Oops.

            (Also, just for onlookers, in ANSI Common Lisp, but not its ancestors or its sorta-sibling Emacs Lisp, characters are disjoint from integers. That's why the intersection of BIT and CHARACTER is empty.)
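
            These claims can be probed at the REPL. The first three forms follow directly from the type definitions. The last two ask an implementation whether it really classifies such an array as a string, and the answer is left unstated here because it can depend on the implementation and version:

            ```lisp
            (subtypep nil 'character)   ; first value is T: the empty type is a subtype of every type
            (typep #\a 'integer)        ; => NIL, a character is not an integer
            (typep 97 'character)       ; => NIL, an integer is not a character

            ;; Does this implementation count a NIL-element vector as a string?
            (subtypep '(vector nil) 'string)
            (stringp (make-array 0 :element-type nil))
            ```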

            • munificent an hour ago

              > That is, it cannot store any values at all.

              NIL is a value in Lisp, no? So that means an array with element type NIL should be able to store exactly one value, whose index is NIL.

              • lapsed_lisper 18 minutes ago

                Yes, in Common Lisp, NIL is a value (it's a symbol, and by convention also the empty list).

                But when used as a type specifier, NIL denotes the empty set. So no Lisp object is of that type, and an array with that element type cannot store any object.
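
                The two roles of NIL can be told apart with TYPEP. As a type specifier, NIL has no members, so not even the object NIL belongs to it; the type that contains exactly the object NIL is NULL:

                ```lisp
                (typep nil nil)      ; => NIL, no object is of the empty type
                (typep nil 'null)    ; => T, NULL is the type whose only member is NIL
                (typep nil 'symbol)  ; => T, the object NIL is a symbol
                (typep nil 'list)    ; => T, and also the empty list
                ```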

            • shadowgovt 4 hours ago

              This reminds me of my favorite special form in common-lisp: `the`. https://www.lispworks.com/documentation/HyperSpec/Body/s_the...

              (the fixnum (compute-my-fixnum-value))

              This specifies that the value in the form in the second argument is of the type specified by the first argument. One might assume that this is what other languages call `assert`, and it can be; your compiler / interpreter can be configured to assert if it detects a mismatch between the two.

              ... but the spec actually specifies that if they don't match, the behavior is undefined. `the` is also an opportunity for the compiler to throw away some dynamic typechecking logic; essentially, it's a chance to paint racing stripes on the implementation so it goes faster at the cost of risking undefined behavior.
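
              For contrast, a sketch of `the` next to the standard macro CHECK-TYPE, which does behave like an assertion. The function names are hypothetical:

              ```lisp
              (defun trusting-double (x)
                ;; THE is a promise to the compiler, not a portable check: if X
                ;; is not a fixnum at run time, the consequences are undefined.
                (* 2 (the fixnum x)))

              (defun checking-double (x)
                ;; CHECK-TYPE signals a correctable TYPE-ERROR on a mismatch.
                (check-type x fixnum)
                (* 2 x))

              (trusting-double 21)  ; => 42
              (checking-double 21)  ; => 42
              ```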

              • dreamcompiler 3 hours ago

                Exactly. The primary purpose of type declarations in Common Lisp is to give the compiler a hint that you are taking responsibility for type management so the machine code doesn't need to check the type at runtime. But the spec does not require that the compiler obey that hint, and if the compiler does obey the hint the spec says it's not the compiler's fault if you screw up the type that you took responsibility for.

                An additional purpose for type declarations is not mentioned in the spec: Static analysis. SBCL goes the extra mile and does do some static analysis with your type declarations, which means it catches many kinds of programmer errors at compile time. Not all CL implementations do this.
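
                One way to see whether an implementation does this kind of analysis is to compile a deliberately inconsistent function and look at the flags COMPILE returns. A sketch with a hypothetical helper name; SBCL is expected to flag the conflict, but the result is left unstated because it varies by implementation:

                ```lisp
                (defun probe-static-check ()
                  "Compile a function whose declaration contradicts its body and
                return (WARNINGS-P FAILURE-P) as reported by COMPILE."
                  (multiple-value-bind (fn warnings-p failure-p)
                      (compile nil '(lambda (x)
                                      (declare (type fixnum x))
                                      ;; LENGTH wants a sequence, but X is declared FIXNUM.
                                      (length x)))
                    (declare (ignore fn))
                    (list warnings-p failure-p)))
                ```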

                If you want a Common Lisp that performs thorough static analysis, use Coalton.

                • shadowgovt 3 hours ago

                  This is one of the things I appreciate about Common LISP's approach: it allows for undefined behavior, but UB is (mostly, as far as I know, I'm sure someone will jump in and correct me ;) ) off by default.

                  And I think that's the right place for the default to be. C and C++ support UB as well and there are definitely times where you want that for the speed benefits it affords the compiler's output, but getting to UB in those languages is as easy as creating a character array and then copying a string that is too long into that character array (gcc will warn you, but about the conversion of a string literal into a char pointer, not about the buffer overrun).

                  And with the C++ spec having a page count exceeding the entire Lord of the Rings trilogy, that's a lot of language definition for undefined behavior to live in.