Smolderingly Fast B-Trees (scattered-thoughts.net)
Submitted by surprisetalk 5 hours ago
  • BeeOnRope 2 hours ago

    To answer a question implied in the article, per-lookup timing with rdtscp hurts the hash more than the btree for the same reason the hash is hurt by the data-dependent chaining: rdtscp is an execution barrier which prevents successive lookups from overlapping. rdtsc (no p) isn't, and would probably produce quite different timings.
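
    (A portable sketch of that measurement effect, using std::time::Instant as a crude stand-in for the timestamp counter; the map type and sizes here are made up for illustration. Reading the timer between every lookup serializes the measurement loop much like rdtscp, while one timer around the whole batch lets independent lookups overlap.)

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

fn main() {
    // Illustrative map and key set; not the article's benchmark.
    let map: HashMap<u64, u64> = (0..100_000u64).map(|i| (i, i * 2)).collect();
    let keys: Vec<u64> = (0..100_000u64).collect();

    // Per-lookup timing: a timer read after every lookup serializes the
    // measurement loop, much as an rdtscp barrier would.
    let mut serialized = Duration::ZERO;
    let mut sum1 = 0u64;
    for k in &keys {
        let t = Instant::now();
        sum1 += map[k];
        serialized += t.elapsed();
    }

    // Batch timing: one timer around the loop leaves the CPU free to
    // overlap successive independent lookups.
    let t = Instant::now();
    let mut sum2 = 0u64;
    for k in &keys {
        sum2 += map[k];
    }
    let batched = t.elapsed();

    assert_eq!(sum1, sum2);
    println!("serialized: {:?}, batched: {:?}", serialized, batched);
}
```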

    That the btree doesn't benefit from overlapping adjacent lookup/inserts is interesting.

    I suppose it is because btree access (here) involves data-dependent branches, and so with random access you'll get about L mispredicts per lookup in an L-deep tree, so adjacent lookups are separated by at least one mispredict: adjacent lookups can overlap execution, but the overlapping is useless since everything beyond the next mispredict is useless as it is on the bad path.

    That's probably at least true for the small map regime. For the larger maps, the next iteration is actually very useful, even if on a mispredicted path, because the data accesses are at the right locations, so they serve to bring in all the nodes for the next iteration. This matters a lot outside of L2. At 5 instructions per comparison and 32-element nodes, however, there are just so many instructions in the window for 1 lookup that it's hard to make it to the next iteration.

    So b-trees benefit a lot from a tight linear search (e.g. 2 instructions per check, macro-fused to 1 op), or a branch-free linear search, or, far better than those for big nodes, a vectorized branch-free search.
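
    The branch-free linear search could be sketched like so (node layout and the 32-key size are assumptions for illustration, not the article's code): count how many keys are smaller than the needle, so each comparison feeds an add rather than a data-dependent branch, and the loop is a good candidate for auto-vectorization.

```rust
// Branch-free search within a single B-tree node: return the rank of
// `needle`, i.e. how many keys are smaller. No early exit, no
// data-dependent branch; the comparison result is just added up.
fn branchfree_rank(keys: &[u64; 32], needle: u64) -> usize {
    let mut rank = 0usize;
    for &k in keys.iter() {
        // (k < needle) lowers to a flag-setting compare feeding an add,
        // and the fixed-trip-count loop vectorizes well.
        rank += (k < needle) as usize;
    }
    rank
}

fn main() {
    let mut keys = [0u64; 32];
    for (i, k) in keys.iter_mut().enumerate() {
        *k = (i as u64) * 10; // sorted keys: 0, 10, 20, ...
    }
    assert_eq!(branchfree_rank(&keys, 25), 3); // 0, 10, 20 are smaller
    assert_eq!(branchfree_rank(&keys, 0), 0);
    assert_eq!(branchfree_rank(&keys, 1_000), 32);
    println!("ok");
}
```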

    • dwattttt an hour ago

      > the overlapping is useless since everything beyond the next mispredict is useless as it is on the bad path

      Is this a consequence of Spectre et al mitigations?

      • BeeOnRope an hour ago

        No, just a consequence of how mispredicts work: all execution after a mispredict is thrown away: though some traces remain in the cache, which can be very important for performance (and also, of course, Spectre).

        • dwattttt an hour ago

          That's the part I was curious about; whether there would've been a helpful cache impact, if not for modern Spectre prevention.

    • aidenn0 an hour ago

      It's been a while since I last tried things, but I found crit-bit trees[1] to be much faster than b-trees. Hash array-mapped tries are also good if you don't need the things that trees give you (e.g. in-order traversal, get all values in a certain range).

      1: https://cr.yp.to/critbit.html
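
      (The core step behind crit-bit trees, sketched from the description on that page: each internal node tests the single bit position where its two subtrees' keys first differ, which is found with a xor plus a leading-zeros count. This is just the bit-finding step, not a full tree implementation.)

```rust
// Index of the highest bit where two keys differ (the "critical bit"
// a crit-bit internal node would test), or None for identical keys.
fn critical_bit(a: u32, b: u32) -> Option<u32> {
    let diff = a ^ b;
    if diff == 0 {
        None // identical keys: no critical bit exists
    } else {
        Some(31 - diff.leading_zeros()) // position of highest set bit
    }
}

fn main() {
    assert_eq!(critical_bit(0b1010, 0b0010), Some(3));
    assert_eq!(critical_bit(7, 7), None);
    assert_eq!(critical_bit(0, 1), Some(0));
    println!("ok");
}
```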

      • josephg 43 minutes ago

        I'd be curious to see how performance would change from storing b-tree entries in a semi-sorted array, and applying various other optimizations from here:

        https://en.algorithmica.org/hpc/data-structures/b-tree/

        The aggregate performance improvements Sergey Slotin gets from applying various "tricks" is insane.
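
        One trick from that family, sketched as an illustration rather than Slotin's exact code: a branchless lower-bound search, where the comparison result conditionally advances the base index (a cmov-friendly pattern) instead of taking an unpredictable branch.

```rust
// Branchless lower bound: index of the first element >= needle.
// Each step always computes the probe and conditionally advances
// `base` by a multiply with the comparison result, so there is no
// taken/not-taken branch over unpredictable data.
fn branchless_lower_bound(a: &[u64], needle: u64) -> usize {
    let mut base = 0usize;
    let mut len = a.len();
    while len > 1 {
        let half = len / 2;
        base += ((a[base + half - 1] < needle) as usize) * half;
        len -= half;
    }
    base + ((len == 1 && a[base] < needle) as usize)
}

fn main() {
    let a = [1u64, 3, 5, 7, 9];
    assert_eq!(branchless_lower_bound(&a, 6), 3); // first index with a[i] >= 6
    assert_eq!(branchless_lower_bound(&a, 0), 0);
    assert_eq!(branchless_lower_bound(&a, 10), 5);
    println!("ok");
}
```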

        • nialv7 5 minutes ago

          I feel like I missed the point of this article. I thought the author was trying to prove that b-trees aren't that bad compared to hashmaps, but taking 2~3x longer looks pretty bad.

          If I need predictable ordering (but not actually sorting the keys) I will use something like indexmap, not a b-tree.
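
          (To illustrate the difference: std's BTreeMap always iterates in sorted key order, whereas indexmap - an external crate, not used below - iterates in insertion order. A std-only sketch:)

```rust
use std::collections::BTreeMap;

fn main() {
    let mut m = BTreeMap::new();
    m.insert("banana", 2);
    m.insert("apple", 1);
    m.insert("cherry", 3);
    // Iteration order is sorted by key, regardless of insertion order.
    let keys: Vec<&str> = m.keys().copied().collect();
    assert_eq!(keys, ["apple", "banana", "cherry"]);
    println!("ok");
}
```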

          • tekknolagi 3 hours ago

            I thought a lot of b(+)tree advantage was in bigger-than-RAM something or other for large databases and these benchmarks seem relatively small in comparison

            • foota 3 hours ago

              B-Trees are good for in memory data too because they have fairly good cache behavior.

              • robertclaus 2 hours ago

                Ya, I would be curious to see how this performs on out-of-cache data on an SSD and actual hard drive. On the other hand, the findings are definitely still relevant since RAM has gotten fairly cheap and most applications probably fit in it just fine.

                Regarding databases - Btrees also have a natural sort order, which hash tables don't. This means a btree as your main data structure helps with sort, range, or list operations in a way hash tables can't. That being said, even traditional databases obviously still use hash tables extensively (ex. hash joins).
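
                (A small std-only sketch of the range operations that sort order enables; a hash map has no equivalent of `range` short of scanning every entry:)

```rust
use std::collections::BTreeMap;

fn main() {
    let m: BTreeMap<u32, &str> =
        [(1, "a"), (5, "b"), (9, "c"), (12, "d")].into();
    // All entries with keys in [5, 10): a walk of adjacent entries
    // in key order, no full scan needed.
    let hits: Vec<u32> = m.range(5..10).map(|(k, _)| *k).collect();
    assert_eq!(hits, [5, 9]);
    println!("ok");
}
```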

                • scotty79 2 hours ago

                  In Rust, thanks to that ordering, you can have a BTreeSet of BTreeSets.
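
                  (Sketch: BTreeSet<T> implements Ord whenever T does, so sets nest as elements; std's HashSet can't be used this way because it doesn't implement Hash:)

```rust
use std::collections::BTreeSet;

fn main() {
    // BTreeSet<u32> is Ord, so it can itself be an element of a BTreeSet.
    let inner1: BTreeSet<u32> = [1, 2].into();
    let inner2: BTreeSet<u32> = [3].into();
    let outer: BTreeSet<BTreeSet<u32>> = [inner1.clone(), inner2].into();
    assert_eq!(outer.len(), 2);
    assert!(outer.contains(&inner1));
    println!("ok");
}
```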

              • ur-whale an hour ago

                Unless I'm missing something, the title of the article doesn't really correlate with its conclusion.

                • lsb 2 hours ago

                  Clojure, for example, uses Hash Array Mapped Tries as its associative data structure, and those work well.
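
                  (The core HAMT trick, sketched as an illustration rather than Clojure's actual implementation: a node keeps a 32-bit occupancy bitmap plus a dense child array, and the child index for a 5-bit hash chunk is the popcount of the bitmap bits below that chunk's position:)

```rust
// HAMT node indexing: `bitmap` marks which of the 32 possible slots
// are occupied; children are stored densely. The dense index of the
// child for hash chunk `chunk` is the count of set bits below it.
fn hamt_slot(bitmap: u32, chunk: u32) -> Option<usize> {
    let bit = 1u32 << chunk;
    if bitmap & bit == 0 {
        None // no child for this hash chunk
    } else {
        Some((bitmap & (bit - 1)).count_ones() as usize)
    }
}

fn main() {
    // Bits 1, 4, 9 set: their children sit densely at indices 0, 1, 2.
    let bitmap = (1u32 << 1) | (1 << 4) | (1 << 9);
    assert_eq!(hamt_slot(bitmap, 1), Some(0));
    assert_eq!(hamt_slot(bitmap, 4), Some(1));
    assert_eq!(hamt_slot(bitmap, 9), Some(2));
    assert_eq!(hamt_slot(bitmap, 2), None);
    println!("ok");
}
```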