The semver trick (2019) (github.com)
Submitted by jcbhmr 2 days ago
  • tantalor 2 days ago

    Not knowing rust, this is pretty meaningless to me.

    Is this trick relevant outside of rust?

    • adastra22 2 days ago

      Rust has a nice semver-based package management tool called cargo. One thing rust/cargo does really well is packaging up different versions of the same dependency for the same build. E.g. your project can update to widget=2.0 while some of your downstream dependencies are still stuck on the 1.x release branch. Cargo's semver support means that those dependencies can be automatically upgraded to support newer 1.x releases, but they will not work with 2.x without the package maintainer's intervention. In the meantime it just compiles and links to both, while keeping them isolated.

      Usually this is great. You avoid dependency hell while only trading off some compiled binary size. Sometimes it fails when a package must have only one version, e.g. because you're passing around data types from that package to/from your dependencies which use the old version, but part of rust/cargo's trick is that types drawn from different versions of the same package are distinct types. So you get a compile-time type check error.
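      A minimal sketch of that failure mode, assuming a hypothetical `widget` crate. Two modules stand in for the two compiled versions (real crates would be separate packages), and the names `widget_v1`, `widget_v2`, `old_dependency_consume` and `downgrade` are invented for illustration:

```rust
use std::any::TypeId;

// Stand-ins for two versions of a hypothetical `widget` crate that were
// both compiled into the same build. Modules are used only so the sketch
// fits in one file.
mod widget_v1 {
    pub struct Widget {
        pub id: u32,
    }
}

mod widget_v2 {
    pub struct Widget {
        pub id: u32,
    }
}

// A dependency that is still pinned to the 1.x line.
fn old_dependency_consume(w: widget_v1::Widget) -> u32 {
    w.id
}

// The two `Widget` types are structurally identical but distinct to the
// compiler, so crossing versions needs an explicit conversion.
fn downgrade(w: widget_v2::Widget) -> widget_v1::Widget {
    widget_v1::Widget { id: w.id }
}

// Reports whether the compiler treats the two types as the same type.
fn same_type() -> bool {
    TypeId::of::<widget_v1::Widget>() == TypeId::of::<widget_v2::Widget>()
}

fn main() {
    let w = widget_v2::Widget { id: 7 };
    // old_dependency_consume(w); // would fail to compile: mismatched types
    println!("id = {}", old_dependency_consume(downgrade(w)));
    println!("same type: {}", same_type());
}
```

      The commented-out call is the compile-time type error described above; the hand-written `downgrade` is the kind of glue you would otherwise need at every boundary.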

      Solution: the widget package maintainer pushes out a final 1.x release which itself depends on widget=2, and is simply a shim that re-exports the API. This semver-violating trickery is basically a stealth upgrade that forces all upstream widget=1 pegged crates onto the new widget=2 release branch.
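      A rough sketch of the shim, again with modules standing in for crate versions of a hypothetical `widget` package. In a real project the final 1.x release would declare widget 2 as a dependency in its manifest and re-export from it; the module and function names here are assumptions:

```rust
use std::any::TypeId;

// Stand-in for the 2.x release, which owns the real definitions.
mod widget_v2 {
    pub struct Widget {
        pub id: u32,
    }
}

// Stand-in for the final 1.x release: no definitions of its own,
// only a re-export of the 2.x type.
mod widget_v1 {
    pub use super::widget_v2::Widget;
}

// A dependency that is still written against the 1.x API.
fn old_dependency_consume(w: widget_v1::Widget) -> u32 {
    w.id
}

// Reports whether both paths now name one and the same type.
fn types_unified() -> bool {
    TypeId::of::<widget_v1::Widget>() == TypeId::of::<widget_v2::Widget>()
}

fn main() {
    // A value built through the 2.x path is accepted by 1.x-era code.
    let w = widget_v2::Widget { id: 7 };
    println!("id = {}", old_dependency_consume(w));
    println!("unified: {}", types_unified());
}
```

      Because the re-export is only an alias, no conversion is needed at the boundary between old and new callers.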

      This trick is generic and relevant outside of rust, at least so far as other package managers use semver-aware dependency management.

        • wakawaka28 2 days ago

          I think I get how this works but other package managers would likely solve the problem by specifying that the v2 package "provides" the v1 interface, and then possibly deprecating or removing v1. At least I think that would work. It's not something I've sought to do. Most libraries in C or C++ do not support linkage of different versions in one app, so you must pick exactly one version to run with.

          • Tuna-Fish 2 days ago

            The problem with this is that v2 by definition does not provide the v1 interface, or there would be no reason to bump up the major version number.

            Using the trick, the last v1 package is a facade that constructs the old interface using the new package. To provide equivalent functionality, you basically have to do the same thing.

            • wakawaka28 2 days ago

              OK then if you want to keep the v1 interface exactly then you have to build a facade yourself. I don't think most libraries are capable of supporting that use case because of their semantic structure. Instead of employing a bunch of tricks to avoid fixing a couple of functions (the case where a facade makes sense), you should just fix the callers. That might not be easy for packages that are independently published (for example pypi or cargo packages) but it is possible in many package managers that have central teams of maintainers.

              • adastra22 2 days ago

                The whole thing is sus. If widget=2 can work as a drop-in replacement (non-breaking) for widget=1, then why the version bump in the first place?

                But in the real world there are times when major versions are bumped without breaking changes. E.g. a bunch of crates maintained as part of a single project with a uniform versioning scheme.

                • wakawaka28 2 days ago

                  I think now there must be some kind of adapter code, even if auto-generated, that is a facade from v1 to v2. You can essentially use that to release a new v1 minor revision that depends on v2 for its implementation. But yeah the whole thing sounds sus, not easily applicable to other languages (where does this adapter code come from and how does it coexist with v2?), and like a bad practice even when it can work. You should really just migrate everyone to the new version instead of patching together something to handle stubborn users of the old version.

        • edflsafoiewq 2 days ago

          It's relevant to any kind of dependency graph. Essentially it's converting

                A                  A       
               / \                / \      
             Cv2  B     into     |   B     
                   \              \   \    
                   Cv1             \  Cv1  
                                    \ /    
                                    Cv2    
          
          Suppose some exported symbol, X, didn't change from Cv1 to Cv2. In the former, A gets two copies of X depending on which path it took up the dependency tree. In the latter, there's only one copy since both paths terminate in the same place.
          • jerf 2 days ago

            I've used something very similar in Go. I can't point to it because it's an internal repository, but it allowed for relatively smooth co-existence between a v1 and a v2 of a particular package that had one aspect that was hard-to-upgrade, but also, generally rarely used compared to the rest. The rest of the types were fully identical because v1 simply re-exported the v2 types, so Go only complains if you try to cross the streams with the specifically-changed but rarely-used types, which is exactly the complaint I want.

            In the case of Go I'm not sure if it's important that v1 specifically re-export the v2; I think it would work equally well either way. But it's at least very similar.

            • Xylakant 2 days ago

              Theoretically, it can be. However, it’s only relevant in specific circumstances that Rust happens to provide, for example one of the requirements is that the package manager / build system allows you to have two different versions of the same dependencies in a single binary - which is comparatively rare.

              • clhodapp 2 days ago

                This feels like something similar would almost be relevant in Java Platform Modules, since modules also have the ability to re-export classes that they've imported. However, there are some small limitations that would make it annoying to adopt this pattern: Having more than one version of the same classname available in different modules requires boilerplate and it's difficult to define over some of the classes of a module that your module imports.

              • hinkley 2 days ago

                > The cause of the difficulty was the large number of crates using types from these libraries in their public API.

                This has bitten me on the ass so many times. Exposing third party types means you have to upgrade the library simultaneously across the entire application, even if the language allows you to load different versions of the same library in different parts of the app. And if you’re trying to skirt Conway, having multiple teams sharing a single deployable, you’re gonna have a bad time. Because good luck getting all teams to agree to drop everything and upgrade next sprint.

                I’d much rather my coworkers use libraries instead of NIH, but you have to be very careful how you use them. It gives me a little sympathy for microservices, but that’s often just swapping in a devil you don’t know.

                • do_not_redeem 2 days ago

                  Microservices only solve this because you have to serialize objects on the wire, no? You could accomplish the same within an app or lib by accepting a String and calling `serde_json::from_str` on it, avoiding the problems of microservices. Or more realistically, define your own struct and convert it to/from the dependency type at the interface boundary.
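                  A small sketch of the second option, assuming a hypothetical `third_party` module in place of a real dependency. The public API only ever mentions our own `Coord` type, and conversions happen at the boundary; all names are invented for illustration:

```rust
// Stand-in for a third-party crate whose types should not leak into
// our public API.
mod third_party {
    pub struct Point {
        pub x: f64,
        pub y: f64,
    }

    pub fn scale(p: Point, k: f64) -> Point {
        Point { x: p.x * k, y: p.y * k }
    }
}

/// Our own type: the only one that appears in public signatures.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Coord {
    pub x: f64,
    pub y: f64,
}

impl From<Coord> for third_party::Point {
    fn from(c: Coord) -> Self {
        third_party::Point { x: c.x, y: c.y }
    }
}

impl From<third_party::Point> for Coord {
    fn from(p: third_party::Point) -> Self {
        Coord { x: p.x, y: p.y }
    }
}

/// Public entry point: converts in, calls the dependency, converts out.
pub fn scaled(c: Coord, k: f64) -> Coord {
    third_party::scale(c.into(), k).into()
}

fn main() {
    let out = scaled(Coord { x: 1.0, y: 2.0 }, 2.0);
    println!("{:?}", out);
}
```

                  Upgrading the dependency then only touches the two `From` impls, at the cost of a copy at each call.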

                  • hinkley 2 days ago

                    Process boundaries tend to increase friction to exposing internal implementation details like this. They don’t fix the problem but they do discourage it.

                  • mgaunard 2 days ago

                    Using third-party libraries is usually a bad idea, and must be done so very carefully. If you do, onboard it onto your own build system, and pin it to a version you're going to provide support for and ensure compatibility with.

                    You should never have multiple versions of the same code in objects that end up linked together; that's a recipe for ODR (One Definition Rule) violations.

                    • progval 2 days ago

                      Concrete example: the Apache Arrow format is designed to allow passing arrays across languages and libraries without conversion; so many public functions/methods (and FFIs) will have Arrow arrays in their signature.

                      However, the Rust Arrow library bumps its major version every few months because it changes methods and auxiliary types, without changing the memory layout of arrays. But this means you can't have two dependencies (e.g. an ORC parser and a CSV serializer) that depend on different versions of Arrow if you want to pass arrays from one dependency's functions to the other's.

                      And vendoring like you mention won't help, because the Rust compiler wouldn't recognize both vendored Arrow crates as compatible.

                      • mgaunard 2 days ago

                        Use a single version of the arrow library, and build everything against that one. I don't see where the problem is.

                        I don't use Rust or third-party build or packaging systems -- I usually recommend that people don't do so either, but I understand that cargo is part of the appeal of Rust.

                        I'd say just build whatever you need to do, this way you won't be limited by arbitrary restrictions others have decided.

                        • progval 2 days ago

                          I can't always do that because dependency A depends on dependency B, which depends on arrow 52, and dependency C depends on arrow 53.

                          And sure, I can soft-fork dependency B to add support for arrow 53, and dependency A to depend on my soft-fork of dependency B, but it quickly becomes messy. Especially when in parallel I send unrelated patches to A and/or B.

                          • mgaunard a day ago

                            your problem is caring about what other people have decided. Surely metadata is irrelevant and can be changed/overridden.

                            You onboard a version and make sure everything you need can build against it, patching software as needed.

                          • hinkley a day ago

                            Security patches say no, doesn’t work that way.

                          • hinkley a day ago

                            Sounds like a team of misanthropes. How can you be that blind to DevEx?

                            • mgaunard 7 hours ago

                              For the best developer experience, you need to write custom tools.

                              • hinkley an hour ago

                                Eh. Software is full of cliches and idioms. I shouldn’t be writing my own logging library and I sure as fuck shouldn’t be writing my own security library.

                                Also don’t write your own billing system unless you’re a finance company.

                          • eastbound 2 days ago

                            CVEs are detected on third party libraries. This is why you must upgrade.

                          • alexchamberlain 2 days ago

                            If you have a single deployable(1), I think you need to have a person/team responsible for the whole. They need to be trusted to work on any of the code to make the whole build, without overstepping into micromanaging the whole code base. It's a tough balance.

                            (1) It has to be said I don't have much experience with mono-builds, but having brought together many services (micro or not) from different teams and backgrounds, I wish for a little more coordination and governance.

                            • sethammons 2 days ago

                              > And if you’re trying to skirt Conway, having multiple teams sharing a single deployable, you’re gonna have a bad time. Because good luck getting all teams to agree to drop everything and upgrade next sprint.

                              this is why I'm not on team monorepo. Let other teams vendor their dependencies and upgrade on their own time. Just don't go trying to release N versions. Measure who is on an old version and work with them to update. But again, this slow team allows the other N-1 teams to adopt the new version and provide value to the org.

                              for those of you who are pro monorepo and have worked in one with dozens of teams in different orgs who have competing priorities and can't upgrade a dependency in lock-step, how have you solved this?

                              • mgaunard 2 days ago

                                1. ensure you have continuous integration against both stable and latest versions of everything, with the stable set of all transitive dependencies defined on a per-project basis, and what gets used for releases.

                                2. people are expected to fix the latest build whenever they have spare cycles. The incentive is that it is part of the pipeline for code reviews and can only be bypassed by the project owner. This means you'll need to support both stable and latest concurrently (bumping stable is also an option to simplify the requirement).

                                3. if you added conditional support for old stable/new latest and latest was eventually made stable, you can refactor that code.

                                • osigurdson 2 days ago

                                  Does monorepo confer that all projects use the same dependencies? Binary dependencies seem orthogonal to how source code is managed.

                                  • hinkley an hour ago

                                    I’ve done monorepo, separate compilation units before. It has its own set of issues but it cuts down on circular dependencies and makes more sense than having sixty repositories checked out. Those are hell.

                                    • adastra22 2 days ago

                                      Usually, yes. Monorepo means all dependencies are checked into the same repo. Even external dependencies (when source available) are checked in.

                                      • osigurdson 2 days ago

                                        Usually we are using package managers and just referencing packages. However, even in the hypothetical where I am downloading binaries and checking them in, I still don't see the relationship to the monorepo. I can have a monorepo with 100 folders representing 100 programs, each of these can have completely separate dependencies, languages, etc.

                                        Dependency versions matter when they are all part of the same program or library. Then you have the diamond dependency issue and related problems. However, none of this is related to maintaining code in a monorepo. One program can be maintained in 10 different repos (impractical but possible) and still have the diamond dependency issue.

                                    • frizlab 2 days ago

                                      Wouldn’t subtrees fix this instead of a mono-repo? Or even (gasp) submodules?

                                      • osigurdson 2 days ago

                                        You mentioned sub modules - you are now on call until someone else mentions them again.

                                        • frizlab 2 days ago

                                          I’m fine with that :)

                                        • dallasg3 2 days ago

                                          I was looking into Paket for this on .NET.

                                          https://github.com/fsprojects/Paket

                                    • cortesoft 2 days ago

                                      Am I misunderstanding this, or will it break dependencies that actually rely on the “rarely used api”? Seems to me like this trick is just to get the compiler to ignore semver?

                                      • aldonius 2 days ago

                                        I see it not so much as getting the compiler to ignore semver but rather giving it more granular information about the nature of the changes.

                                        As I understand it:

                                        - Anything which is breaking-changed in any way for 0.3.0 (e.g. in the article a type change from i32 to u32) is the exact same code in 0.2.1 as it was in 0.2.0.

                                        - Anything which stays the same for 0.3.0 is re-exported in 0.2.1 (from 0.3.0). The code is replaced with a `pub use ...;`

                                        - Anything which moves around in the submodule hierarchy (but is otherwise unchanged) in 0.3.0 is re-exported in 0.2.1, from where it used to be.

                                        If crate Baz depends on something in Foo 0.2.x which changes in Foo 0.3.0, then when (if!) Baz updates to Foo 0.3.0 it will need to deal with that. But Baz can upgrade to Foo 0.2.1 without concern, and any types unchanged in Foo 0.3.0 will be interoperable.
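                                        A sketch of the three cases above, with modules standing in for the two releases of Foo. The items `FLAG`, `Buffer` and `Handle`, their values, and the function names are hypothetical and not taken from the article:

```rust
use std::any::TypeId;

// Stand-in for Foo 0.3.0.
mod foo_0_3_0 {
    pub const FLAG: u32 = 1; // breaking change: was i32 in 0.2.x
    pub struct Buffer(pub Vec<u8>); // unchanged since 0.2.0
    pub mod io {
        pub struct Handle(pub u32); // unchanged, but moved into a submodule
    }
}

// Stand-in for Foo 0.2.1, which depends on Foo 0.3.0.
mod foo_0_2_1 {
    pub const FLAG: i32 = 1; // changed item: kept exactly as in 0.2.0
    pub use super::foo_0_3_0::Buffer; // unchanged item: re-exported
    pub use super::foo_0_3_0::io::Handle; // moved item: re-exported at its old path
}

// Baz, still written against Foo 0.2.x.
fn baz_make_buffer() -> foo_0_2_1::Buffer {
    foo_0_2_1::Buffer(vec![1, 2, 3])
}

// Some other crate already written against Foo 0.3.0.
fn new_code_len(b: &foo_0_3_0::Buffer) -> usize {
    b.0.len()
}

// Reports whether two paths name the same type.
fn shared<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

fn main() {
    // The unchanged type flows between old and new code without conversion.
    println!("len = {}", new_code_len(&baz_make_buffer()));
    println!(
        "Handle shared: {}",
        shared::<foo_0_2_1::Handle, foo_0_3_0::io::Handle>()
    );
    // The changed item stays distinct until Baz moves to 0.3.0.
    let old: i32 = foo_0_2_1::FLAG;
    let new: u32 = foo_0_3_0::FLAG;
    println!("old = {}, new = {}", old, new);
}
```

                                        Only `FLAG` differs between the two paths, so that is the only thing Baz has to deal with when it eventually moves to 0.3.0.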

                                        • felurx 2 days ago

                                          You only re-export the unchanged parts. EVFILT_AIO is the same in 0.2.0 and 0.2.1 (which is different - in practice and to the compiler - to EVFILT_AIO in 0.3). (Only) c_void is the same in 0.2.0, 0.2.1 and 0.3 (again, in practice and to the compiler.)

                                          • cpeterso 2 days ago

                                            And the user can postpone making their code work with u32 EVFILT_AIO until they update from v0.2.* to v0.3.

                                        • jwilk 2 days ago

                                          Discussed in 2020:

                                          https://news.ycombinator.com/item?id=24020254 (43 comments)

                                          • wpollock 2 days ago

                                            Isn't this issue caused by having multiple unrelated APIs in a single package? If they were in different crates, a breaking change on one would not affect users of the others.

                                            • ghssds 2 days ago

                                              >To the extent that it constitutes copyrightable work, the idea of depending on a future version of the same library is licensed under the CC0 1.0 Universal license (LICENSE-CC0) and may be used without attribution.

                                              An idea isn't copyrightable work; only the text explaining it is. Please don't propagate the idea that ideas are copyrightable.

                                              • conartist6 2 days ago

                                                Great point! Most people also aren't aware of the principle of convergence, which is why something like a specific semver pattern could not be the subject of copyright.

                                                This is the same principle that means that mathematical equations or basic API usage examples are not and cannot be the subject of copyright.

                                                • ironhaven 2 days ago

                                                  David Tolnay should have filed a defensive software patent to be more precise then!

                                                  • conartist6 17 hours ago

                                                    It isn't clear to me that there is any such thing as a "defensive patent."

                                                    I've looked into trying to get such a thing for my work but my understanding is that it would offer me exactly 0 protection for being sued over what I have made.

                                                  • nektro 2 days ago

                                                    this problem also doesn't happen when you build with HEAD and manage it with lockfiles

                                                    • adastra22 2 days ago

                                                      This problem can arise in your dependencies regardless of your use of lockfiles.