• CodesInChaos 12 hours ago

    I'd expect negative integer ids in an API to break even more integrations than unexpectedly large integers.

    Though I guess that likelihood is influenced by the choice of protocol. For example, when using protobuf the client code generated from the specification file will use a 32-bit integer, if that's how it was defined. While in JSON I'd generally assume it's a positive integer smaller than 2^53.
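The 2^53 bound mentioned above comes from JSON consumers that parse every number into a 64-bit IEEE 754 double, as JavaScript does. A minimal sketch in Python, where `float` stands in for such a parser and `is_double_safe` is a hypothetical helper, not part of any API discussed here:

```python
# Largest n such that every integer in [-n, n] maps to its own distinct double
# (the same value JavaScript exposes as Number.MAX_SAFE_INTEGER).
MAX_SAFE_INTEGER = 2**53 - 1


def is_double_safe(n: int) -> bool:
    """Return True if n can pass through a double-based JSON parser unchanged."""
    return abs(n) <= MAX_SAFE_INTEGER


# Above the bound, neighbouring integers collapse onto the same double.
collides = float(2**53) == float(2**53 + 1)
```

Under this assumption a 32-bit id is always safe, while a 64-bit id can silently change value in a double-based client even though it is valid JSON.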

    • swiftcoder 6 hours ago

      You don’t have to expose the negative to the customer - convert it to unsigned at the API layer, and Bob’s your uncle

      • OptionOfT 12 hours ago

        Right. You can have the best documentation:

        If they expose them as string and mention they're opaque? Then customers who parse them to uint will get bugs and be unhappy.

        Did they expose them as ints? Customers who used uints will be unhappy.

        At `jobs[-2]` the front-end parsed the ids (exposed as strings, but ints under the cover).

        The backend left them alone.

        That caused some issues when building out shared libraries.

        • Demiurge 11 hours ago

          What kind of API specifies that your number is int, uint, or bigint? According to a quick search, the formats for APIs are: JSON ~80%, XML ~15%, ~5% other.

          • lazide 11 hours ago

            Anyone storing them in a DB, or using them in internal fields will likely have a surprise on their hands. Unless they store them as opaque strings anyway, which is the saner thing to do in these situations anyway.

            • arjvik 5 hours ago

              SQL requires setting the max length of a string, and it's quite reasonable to set it to len(2147483647)=10 if you were expecting 32-bit int IDs.
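To make the width argument concrete, here is a small sketch that measures how many characters the decimal form of an id needs. `decimal_width` is a hypothetical helper, and the idea of a client sizing a text column this way is only the scenario described above:

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)
INT64_MAX = 2**63 - 1


def decimal_width(n: int) -> int:
    """Number of characters in the base-10 text form of n, sign included."""
    return len(str(n))


widths = {
    "int32 max": decimal_width(INT32_MAX),  # 2147483647
    "int32 min": decimal_width(INT32_MIN),  # -2147483648
    "int64 max": decimal_width(INT64_MAX),  # 9223372036854775807
}
```

A column sized for 10 characters holds every non-negative 32-bit id, but a negative id needs 11 and a 64-bit id up to 19, so such a client would be affected by either change.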

              • lazide 3 hours ago

                If your goal is storing opaque strings, that is a very silly thing to do.

                At that point you’re just blowing up storage for no reason. Just use an int if you’re that sure.

                Setting a string length to coincidentally the length of an int serialized to a string, while doing no other validation on it, is… just special.

          • layer8 10 hours ago

            If you expose them as strings, you might as well convert them to unsigned at the conversion point.

        • zeograd 15 hours ago

          I often see code relying on the increasing property of primary key (keeping track of processed vs unprocessed by the last processed pk only).

          This wrap into the negative domain would wreak havoc for sure.

          • CodesInChaos 12 hours ago

            You generally can't rely on strict monotonicity of primary keys, since the order in which transactions commit isn't necessarily the order in which the ids were generated. But I have relied on primary keys being "monotonic enough" to sort output by creation time for display purposes.
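A toy simulation of the commit-order problem described above, assuming a consumer that checkpoints on the last primary key it has seen. The names `poll_new_rows` and `committed` are hypothetical, and no real database is involved:

```python
def poll_new_rows(visible_ids, last_seen):
    """Hypothetical consumer: return ids above the checkpoint and the new checkpoint."""
    fresh = sorted(i for i in visible_ids if i > last_seen)
    return fresh, (fresh[-1] if fresh else last_seen)


# Two transactions are handed ids 1 and 2, but the one holding id 2 commits first.
committed = {2}
first_batch, checkpoint = poll_new_rows(committed, last_seen=0)

# The transaction holding id 1 commits afterwards.
committed.add(1)
second_batch, checkpoint = poll_new_rows(committed, checkpoint)

processed = first_batch + second_batch
missed = sorted(committed - set(processed))
```

In this run the consumer processes id 2, advances its checkpoint past id 1, and never sees the row that committed late. Sorting by id for display is unaffected, which is the "monotonic enough" case.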

            • OptionOfT 12 hours ago

              I've worked on invoicing software where we had to introduce a public, always +1 counter to ensure there are no gaps between invoices. Not +2, not +5.

              That way you couldn't make them disappear.

              • chiph 9 hours ago

                In the days when you used custom printed forms that had a number printed on them by the printer - when you loaded a new box of paper into your printer you had to input the first form number into the system so they'd match.

                If you opened boxes in "whatever" order you'd have invoice numbers that would run contiguous for 150 or so counts (the number of forms in the box), then skip to the next multiple of 150 to correspond to when the next (or previous!) box had been used.

                • stefs 11 hours ago

                  That mustn't be the primary key, though, but a serial that counts (and is unique) per-customer.

                  • OptionOfT 10 hours ago

                    This was before the SaaS days.

                    On-prem, single company who issued invoices to customers.

                    When there was an audit the government could ask to see invoices in a certain range. If some of them were missing, what does that mean? Paid under the table?

                    My wife worked at a place where they did manual PDFs, but there they had a tool to change properties of a PDF to change the creation time / last editing time, for when 'modifications' were needed.

                    And this reminds me of the other post here where some people assume cash means shady. Definitely the case there.

              • veyh 14 hours ago

                Well, I'd imagine that before returning the value through their API they could just check that if the number is negative, then add 2^32 to it, which would make it look like an unsigned 32 bit integer.

                • conradfr 8 hours ago

                  But isn't that exactly what they were trying to not do as their problem was the api users and not their internal use?

                  • veyh 8 hours ago

                    It was definitely a problem with their database but I suppose it's possible that the customers were also expecting 32 bit signed ints.

                  • layer8 10 hours ago

                    In most languages that support differently sized integer types and/or unsigned integer types, you wouldn’t have to check, but can just apply the appropriate modulo or bit operation on all values.
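Both variants from this subthread, the check-and-add and the unconditional bit operation, can be sketched as follows. Python integers are arbitrary precision, so the 32-bit mask is written out explicitly; the function names are hypothetical:

```python
def to_uint32(n: int) -> int:
    """Reinterpret a signed 32-bit value as unsigned by keeping the low 32 bits."""
    return n & 0xFFFFFFFF


def to_uint32_checked(n: int) -> int:
    """The explicit variant: add 2^32 only when the value is negative."""
    return n + 2**32 if n < 0 else n


def to_int32(u: int) -> int:
    """Inverse mapping, for turning an API-facing id back into the stored value."""
    return u - 2**32 if u >= 2**31 else u
```

For any value in the signed 32-bit range the two conversions agree, and `to_int32(to_uint32(n))` returns `n`, so the API layer could translate in both directions without touching the stored ids.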

                • Demiurge 11 hours ago

                  I don't understand, what was the issue with changing the column type from `int` to `bigint`? What does exposing the IDs have to do with how large those ints can be? This seems like a backend issue, if we're talking about HTTP/REST APIs. Now, if we're talking compiled C style APIs, then yes, obviously widening the types will cause issues. This is very important context that is missing from this article.

                  • icedchai 10 hours ago

                    The issue was probably database migration time. I was once at a startup that had over 1 billion rows in MySQL. We were on track to hit the `int` limit in another year or so. Many tables would need to be migrated due to foreign key constraints. Migrating one of the tables required significant downtime (6 to 8 hours, IIRC) due to slow spinning disks. Some servers didn't have enough space to rebuild the tables, so we'd want to add disks just in case. There were several servers.

                    A few "alter table" commands cascade into an operational PITA.

                    • yawnr 4 hours ago

                      I guess if in the API documentation you are saying the pkey is an int, then someone consuming that data and storing it in their own table would also likely make that the column type. So when it crosses that threshold, your customers’ tables will break.

                      I think he did a pretty bad job of explaining it if that’s the case though.

                    • TheCowboy 16 hours ago

                      > No-one really likes engineering war stories

                      Is that really true? I did keep reading the entire piece. I think they're often interesting and can contain nuggets of wisdom or insight. Or sometimes they're just funny. When I meet someone who worked on something interesting, I often start trying to pry stories like this post out of them.

                      • shermantanktop 15 hours ago

                        Everyone likes engineering war stories!!! Never heard of an engineer who didn’t.

                        • yk 15 hours ago

                          No, but it is an amazing first sentence. Everybody goes, this story is specifically for me, I'm very special.

                          • eCa 14 hours ago

                            I read the piece (and enjoyed it) despite the first sentence. I’ve become increasingly sensitive to this kind of fluff.

                            It’s not a hook, it’s bad read-bait.

                            • tclancy 10 hours ago

                              Well then you are very special.

                              Introverts hate this one weird trick!

                              • bobthebuilders 13 hours ago

                                Half the time I read the stories they're just a thinly disguised ad for some flavor-of-the-day SaaS, so at least in this instance the hook was somewhat useful. Now if everyone uses this to shill their SaaS, then maybe not.

                            • derekcheng08 9 hours ago

                              LOL came here to say this exactly. Everyone LOVES war stories in my experience :)

                            • dusted 12 hours ago

                              > No-one really likes engineering war stories,

                              I love engineering war stories

                              • sorrythanks 15 hours ago

                                maybe i'm too far gone, but this doesn't even feel hacky to me. the key needs to be a unique number, -1 and 1 are two different numbers.

                                • slipperybeluga 11 hours ago

                                  Yeah but how many of those customers were relying on the key not being a negative number?

                                  • NetMageSCW 5 hours ago

                                    Assuming the API was properly documented as returning signed int, that’s not my problem. Abuse of the API or misunderstanding of the API doesn’t trump running out of space.

                                    • drob518 4 hours ago

                                      Exactly. I mean, if the end solution is to convert to a big int, who’s to say that some customer didn’t assume it would always be 32 bits and blow up then, too.

                                      This does highlight the fact that 32 bit is just a small number these days. Personally, I prefer UUIDs instead of incrementing integers for primary keys since they also scale out without having to have global coordination, but at least choose a 64-bit number.

                                • sheepscreek 6 hours ago

                                  This is engineering at its finest. Working within tight constraints to find solutions that minimize impact. An equally important part of the “solution” is communication - to the leadership, departments and customers. Start early, communicate often and you will almost always come out ahead, even if mistakes are made.

                                  • thewisenerd 18 hours ago

                                    can't wait for solutions of a similar nature around 2038-01-19

                                    a free 68 more years!

                                    (hopefully nobody optimized for the 1 signed bit when allocating memory tho)
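The date and the "68 more years" figure follow from the arithmetic of a 32-bit seconds counter. A short sketch, with `last_representable` as a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(max_seconds: int) -> datetime:
    """Latest UTC instant a seconds-since-epoch counter with this maximum can hold."""
    return EPOCH + timedelta(seconds=max_seconds)


signed_limit = last_representable(2**31 - 1)    # signed 32-bit counter
unsigned_limit = last_representable(2**32 - 1)  # same 32 bits read as unsigned
extra_years = unsigned_limit.year - signed_limit.year
```

Here `signed_limit` is 2038-01-19 03:14:07 UTC and `unsigned_limit` is 2106-02-07 06:28:15 UTC, so reinterpreting the sign bit would buy 68 years, at the cost of no longer representing instants before 1970.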

                                  • 1egg0myegg0 12 hours ago

                                    Whoever gets that magical -2,147,483,648 is going to be really surprised that things keep working

                                    • IshKebab 13 hours ago

                                      Hard to believe that all their customers had written their code to work with signed IDs though.

                                      Honestly I would expect that to break more users' code (and in weirder ways) than just changing the type. It's unclear from the story how the type was exposed though.

                                      • gcanyon 12 hours ago

                                        Yeah, this was my immediate thought as well, but if the spec for the API says signed int, then at least you're defensible: you haven't broken the letter of the spec, even if you're pounding on the spirit of the spec pretty hard. You have a fairly reasonable likelihood that most/all of your customers have implemented to your spec, and therefore any negative consequences are down to secondary effects of how they handle the negative values, not directly because of failure to be able to store them.

                                        That said, to your point, there was almost certainly someone comparing IDs to determine recency, and during the transition from large-positive to large-negative, that would absolutely cause havoc.
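A sketch of how a recency comparison goes wrong at the transition. It assumes, purely for illustration, that allocation continues from the bottom of the negative range once the positive range is exhausted; the thread does not say where the negative ids actually started. `next_id` and `is_newer` are hypothetical names:

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def next_id(current: int) -> int:
    """Hypothetical allocator: count upward, then continue from the most negative value."""
    return INT32_MIN if current == INT32_MAX else current + 1


def is_newer(candidate: int, reference: int) -> bool:
    """Naive client-side recency check based on numeric order."""
    return candidate > reference


last_positive = INT32_MAX
first_after_wrap = next_id(last_positive)
looks_newer = is_newer(first_after_wrap, last_positive)
```

The first id issued after the transition is the newest row, yet `looks_newer` is false, so a client sorting or checkpointing by id would treat every new record as older than everything it already has.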

                                        I'd be curious if their API spec actually said anywhere that the IDs increased consistently.

                                        • hn92726819 11 hours ago

                                          I'd believe it. Not sure when this is, but if it's a few years old and business software, they could probably assume everyone uses Java, which doesn't even have unsigned integers.

                                          • swsieber 11 hours ago

                                            With $MY_JOB in java, that was my assumption

                                            • IshKebab 8 hours ago

                                              Right but just because it's `int id` doesn't mean all code that uses it will still work when it's negative.

                                            • crazygringo 13 hours ago

                                              Came here to say exactly this. Programming languages usually default to signed, but if you're storing these things in databases it's common to explicitly choose unsigned, since IDs are virtually always unsigned and it gives you twice the space until you run out.

                                              Like, instead of using negative primary keys, they could also have just converted to an unsigned int32. I would assume both of those would break a bunch of customer implementations though.

                                              • sgarland 13 hours ago

                                                Postgres doesn’t have unsigned column types out of the box. There’s an extension that enables it, but you’d have to know about that (which you should, if you’re managing a DB, but I digress).

                                                MySQL does have unsigned ints out of the box, FWIW.

                                                • NetMageSCW 5 hours ago

                                                  One of them would presumably break every customer if the API was properly documented.

                                              • estimator7292 5 hours ago

                                                As my last job was winding down (much to the disbelief and utter denial of the CEO) we'd run out of money for Unity licenses and run out of staff to use Unity. CEO decided that we absolutely must have a Unity demo that worked with the slightly newer generation of hardware I was wrapping up. Being the only programmer left, it was of course my problem to figure out. Oh and also this has to be ready for a show next week, so chop-chop.

                                                I ended up decompiling some android APKs our last Unity dev had built like eight months prior. I figured out how to extract our device driver library, then painstakingly rewrote the entire library to support new hardware while also maintaining a compatible ABI and stuffed it all back into the APK. I think I also had to forge some keys or something? It was a fucking mess. Anyway, that was the last work I ever did for him because he didn't pay me for about two months after that, and I quit the moment he gave me the wages he owed me.

                                                He's only got one employee and zero customers, but hey his stupid demo worked for all that mattered.

                                                • VladVladikoff 12 hours ago

                                                  I wonder how many Unix timestamps are going to wrap around to negative in 2032?

                                                  • OptionOfT 12 hours ago

                                                    None!

                                                    But 2038 is gonna be awesome!

                                                    • drob518 4 hours ago

                                                      I wonder how many airplanes are going to fall out of the sky? Or maybe we have to wait until January 1, 10000, for that.

                                                      • 6stringmerc 12 hours ago

                                                        I applaud your enthusiasm for life in the wasteland post US collapse / Balkanization and will meet you at the last functioning terminal in 2042!

                                                    • zmj 7 hours ago

                                                      If you're not doing math with it, it's a string.

                                                      • hobs 3 hours ago

                                                        I would say there are times that doing math with a primary key is a useful property (say, getting the Nth primary key (or so)) but if you are exposing it in an API I would say you would never even want a primary key projected in the first place.

                                                        A primary key is almost an implementation detail - a key that an API knows about something is one of many things that might point to this thing, might need to change, and generally might need a different representation (so don't make it your primary key.)

                                                        I also tell people to just use the bottom of any primary key space (when choosing monotonic stuff) but so many engineers just complain that they don't like the numbers (and yet many of them have had to deal with the migration a few years later so ... enjoy that I guess.)

                                                      • nikanj 15 hours ago

                                                        I wonder how many API users needed the attribute to be an integer (instead of just treating it as an opaque handle string), but didn't mind the integer turning negative

                                                        • cesaref 14 hours ago

                                                          I think the point is that if the API doesn't specify that the returned integers are positive, or are monotonically increasing, then it's fine for the service to return any unique integer.

                                                          If a client application makes an assumption about this, then their engineers will accept this as being their bad and will fix it.

                                                          I'd defend this as being pragmatic - minimising disruption to clients instead of the more 'correct' solution of changing the API. I'm hoping that they managed to roll out the new API update alongside the old one and avoid a 'big bang' API change with this. Sometimes this isn't possible, but it's great when that works out.

                                                          • CodesInChaos 12 hours ago

                                                            I'm far more likely to assume that an integer-id I get from an API is non-negative or even positive than to assume that they're always smaller than 2^31. And I'd be far more likely to blame the API provider for violating the former assumption.

                                                            • NetMageSCW 5 hours ago

                                                              That sounds like a you problem.

                                                          • Kwpolska 13 hours ago

                                                            Probably none needed it to be an integer. At the same time, if the API contract says {id: integer, name: string}, then you are likely to have developers, especially in statically-typed languages, that will create a class with an int32 field, and tell the JSON parsing library to create instances of that class when deserializing the API response.

                                                          • valicord 14 hours ago

                                                            I don't get it. How would switching to bigint break the existing integrations?

                                                            • masklinn 13 hours ago

                                                              Any call from a typed language distinguishing between 32b and 64b integers (that being most popular typed languages I reckon) would break if it had assumed / used the smaller of the two.

                                                              TBF using the negative range could also break callers distinguishing between signed and unsigned if they’d used the latter on their side depending how the API was documented.

                                                              • Kwpolska 13 hours ago

                                                                If the existing code was using int32, a switch to anything larger would cause integer overflows or JSON parsing errors in languages with strongly-typed fixed-width integer types.
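Both failure modes named above, a parsing error and a silent overflow, can be imitated in Python. Python has no fixed-width integers of its own, so `struct` and `ctypes` stand in for a statically typed client here; the payload is made up for illustration:

```python
import ctypes
import json
import struct

# Hypothetical API response whose id is one past the signed 32-bit maximum.
payload = json.loads('{"id": 2147483648, "name": "example"}')

# A strict fixed-width encoder refuses the value outright.
try:
    struct.pack("<i", payload["id"])
    strict_result = "ok"
except struct.error:
    strict_result = "rejected"

# A lenient conversion performs no range check and silently wraps the value.
wrapped = ctypes.c_int32(payload["id"]).value
```

The strict path ends up as `"rejected"`, which is the visible kind of breakage. The lenient path yields -2147483648 with no error at all, which is arguably worse because the client carries on with the wrong id.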

                                                              • faxmeyourcode 3 hours ago

                                                                > No-one really likes engineering war stories

                                                                This is so wrong. I love reading these kinds of stories