• dataflow 7 hours ago

    > There are problems for which deterministic exact sublinear time algorithms are known.

    I can imagine silly examples (like "find the minimum element in this list under the assumption that no more than O(sqrt(n)) elements exceed the minimum"...), but what's an interesting example of this?

    • _jab 7 hours ago

      Binary search is the obvious example.

      What it and your example have in common is that a significant constraint exists on the input. I can't imagine how a deterministic algorithm with unconstrained input can process that input in sublinear time, but I would love to learn otherwise.
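
      For reference, a minimal sketch (Python; the sorted-input precondition is exactly the constraint being pointed at):

```python
# Binary search: deterministic, exact, O(log n) -- but only because the
# input is constrained to be sorted.
def binary_search(xs, target):
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found
```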

      • karparov 2 hours ago

        Another obvious example: what's the mean of an unsorted list of integers? If you take a random sample of sqrt(n) values and average them, your guess is with high probability pretty good. Or a log(n) sample. (That's how election polling works too, which uses even an O(1) sample, though not a random one.)

        Edit: Ah, GP asked for deterministic exact.
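
        A minimal sketch of that sampling estimate (Python; probabilistic, not the deterministic-exact thing GP asked for, per the edit):

```python
import random
import statistics

# Estimate the mean from a sqrt(n)-sized random sample.
# Sublinear in n, but only correct with high probability, not exactly.
def sampled_mean(xs, rng=random):
    k = max(1, int(len(xs) ** 0.5))
    return statistics.fmean(rng.sample(xs, k))
```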

        • amelius 25 minutes ago

          What if one of the integers is significantly larger than the rest?

        • dataflow 7 hours ago

          I can't imagine that's what they meant? The text very specifically says: "Indeed, it is hard to imagine doing much better than that, since for any nontrivial problem, it would seem that an algorithm must consider all of the input in order to make a decision." For them to be thinking of binary search, they would have to be effectively saying "it is hard to think of binary search", which would be a rather absurd position from a CS professor, especially given binary search is quite literally the first algorithm every programmer learns.

          So I took it to mean there's something interesting here where the inputs could literally be anything, not heavily constrained. But I can't imagine what that might be.

          • kadoban 7 hours ago

            > especially given binary search is quite literally the first algorithm every programmer learns.

            I get what you're saying, and it doesn't change your point, but: no _way_ is binary search the first algorithm people learn. For binary search to even be intelligible you already have to know, at a minimum, linear search and the concept of sorting (you probably know a sorting algorithm first). You also learn dozens of other really simple algorithms first in practice.

            • ssivark 3 hours ago

              > no _way_ is binary search the first algorithm people learn. For binary search to even be intelligible you already have to know, at a minumum linear search and the concept of sorting

              Almost every 6-10 year old kid who had to use physical dictionaries intuitively learned (probably even discovered by themselves) something like binary search. It's a different matter whether they could formalize that into an algorithm and write code to handle all the edge cases. But the basic idea is very intuitive. Kids can also pick up the intuition to incorporate improvements even beyond balanced binary search, e.g. there might be a lot of words starting with "S", so split into two groups at a little less than the middle, etc.

              • karparov 2 hours ago

                Moving the goalposts?

                If you are asking which is the "first algorithm" a human learns in their life, then it's likely more related to movement (crawl? walk? move food towards mouth?) or selection (which item can I eat? who are my parents?) rather than a physical dictionary. And that's before considering that it's been a while since kids regularly encountered a physical dictionary.

                If you are asking about formal algorithms, then we're talking about the beginning of a programmer's or computer scientist's education, and then it's usually some form of O(n^2) sort that they will encounter first, if we don't count things like "how to add two multi-digit integers", which is typically an algorithm every kid learns in primary school.

                Binary search tends to be one of the first recursive algorithms that are taught, which is another level entirely in terms of intellectual development.

                • ssivark 24 minutes ago

                  I guess my response was to how I read your comment fitting in with the higher level discussion. My main point is that many of these algorithms are intuitive, and kids learn these much earlier than when they learn formal programming (which might typically be in their teens).

                  Looking over your comment again, I also don't dispute that linear search and sorting are simpler -- even toddlers learn these.

              • bee_rider 5 hours ago

                Agreed, WRT the bigger picture; lots of little algorithms could come before binary search.

                But, giving them binary search before sorting kinda works. It is motivating. If you do sorting first, it just seems like a weird high-effort excursion into a niche bookkeeping thing about lists. Once they see how fast binary search is (just give them pre-sorted lists to run it on), sorting becomes interesting, right?

                • linguae 5 hours ago

                  This is what I do in my introductory data structures and algorithms course at a Bay Area community college: I teach binary search as part of my introduction to recursion, and then the following lectures are a series of lessons on sorting algorithms, beginning with selection sort and then insertion sort. After teaching these O(n^2) algorithms, I spend a lecture on merge sort and then have a lecture on quicksort, both for covering O(n lg n) sorts and for introducing my students to divide-and-conquer recursive algorithms.
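
                  A minimal merge sort sketch along the lines of that lecture sequence (Python; illustrative, not taken from the course):

```python
# Merge sort: the divide-and-conquer O(n log n) sort -- split, sort
# each half recursively, then merge the two sorted halves.
def merge_sort(xs):
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))

def merge(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:  # <= keeps the sort stable
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]
```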

                  • bee_rider 3 hours ago

                    It is a shame that quicksort has to be covered. I mean, it does have to be covered. But the fact that it has an O(n^2) cost for particular inputs, despite being generally considered O(n log n), seems to me to introduce some fuzziness into an otherwise solid concept.

                    But it does need to be covered.

                    Unfortunately.

                    (Mergesort is best).

                    • ncruces 39 minutes ago

                      Quicksort is an important, extremely flexible, and very hard to beat unstable comparison sort.

                      It's based on a very simple but powerful idea/strategy (divide & conquer). Its flexibility means it can be adapted to partially sort, find top-N, find the median or any other rank, all optimally.

                      And it's so much faster in practice than everything else (why?) that, even after mitigating its worst case, it often comes out ahead.

                      Also, it is relevant/necessary for teaching the concept of average, best, and worst cases in complexity analysis. What better way to do that than "the best sorting algorithm is terrible for some inputs"?

                      You can also use it to teach/learn adaptive algorithms (you're almost expected to): switch to something else in the base case, or in the worst case; can we do better for low cardinality; etc.

                      So, of course it needs to be covered. There's more to learn from 200 lines of Quicksort than from Mergesort: https://github.com/ncruces/sort/blob/main/quick/quick.go
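
                      For instance, the rank-finding flexibility mentioned above is quickselect: reuse the partition step but recurse into only one side. A minimal sketch (Python; first-element pivot for brevity, so it shares quicksort's adversarial worst case):

```python
# Quickselect: find the k-th smallest element (0-based k) by
# partitioning around a pivot and recursing into one side only.
# Expected O(n); worst case O(n^2) with this fixed pivot choice.
def quickselect(xs, k):
    pivot = xs[0]
    less = [x for x in xs[1:] if x < pivot]
    geq  = [x for x in xs[1:] if x >= pivot]
    if k < len(less):
        return quickselect(less, k)
    if k == len(less):
        return pivot
    return quickselect(geq, k - len(less) - 1)
```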

                      • chongli 2 hours ago

                        > But it has an O(n^2) cost for a particular input

                        Even worse is the fact that, for naive implementations (such as students might come up with), the worst-case behaviour occurs in very common cases, such as sorting already-sorted or reverse-sorted lists.
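
                        A quick way to see it: count element-vs-pivot comparisons in a naive first-element-pivot quicksort (Python sketch):

```python
# Naive quicksort with the first element as pivot. The counter tallies
# one element-vs-pivot comparison per partitioned element, to show the
# quadratic blow-up on already-sorted input.
def quicksort(xs, counter):
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    counter[0] += len(rest)
    less = [x for x in rest if x < pivot]
    geq  = [x for x in rest if x >= pivot]
    return quicksort(less, counter) + [pivot] + quicksort(geq, counter)

c = [0]
quicksort(list(range(200)), c)  # already sorted: 199 + 198 + ... + 1
# c[0] == 19900, versus roughly 1.39 * n * log2(n) ~ 2100 on average
```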

                    • kadoban 5 hours ago

                      Yeah, it does work as something you learn with/right-before/right-after a sorting algorithm, depending on the teaching style.

                  • FreakLegion 6 hours ago

                    They say a little ways down:

                    > there are classical optimization problems whose values can be approximated in sublinear time

                    This can actually be quite useful if the approximation is guaranteed, or even if it isn't, as long as it works well in practice.

                    https://en.wikipedia.org/wiki/Hardness_of_approximation

                    • dzaima 6 hours ago

                      I read that as saying that binary search isn't among those "nontrivial problem"s, along with most other things with known exact deterministic sublinear time algorithms.

                      And your first quote is followed by "However, for most natural problems", which further indicates that the known exact algorithms are for trivial problems.

                    • spoaceman7777 5 hours ago

                      I mean, yeah, binary search is sublinear, but the data first has to be sorted (or built into a search tree), which has a much more familiar (non-sublinear) runtime.

                      I have to assume the reason for the article wasn't to talk about the runtime of algorithms that operate on data that's already in a partially solved state.

                    • ssivark 4 hours ago

                      Think from an information theory perspective. It is rarely true that you cannot say anything more about the data than what is assumed by classical algorithms. We almost always have some more information depending on the specific domain under consideration. Eg: Sorting a list of ages might be very different from sorting a list of account balances.

                      Any time I have information that reduces the entropy of the dataset, I want to be able to leverage that into runtime improvements of algorithms for pertinent questions. And it would be great to develop a structured framework for that instead of handling special cases in an ad-hoc manner.
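
                      The ages example is concrete: if you know values fall in a small fixed range, counting sort gives O(n + k) instead of O(n log n). A sketch (Python; the 0-130 bound is an assumed prior for ages, not anything from the thread):

```python
# Counting sort: exploits the prior that all values lie in [0, k].
# Runs in O(n + k), beating comparison sorts when k is small.
def counting_sort(xs, k=130):
    counts = [0] * (k + 1)
    for x in xs:
        counts[x] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out
```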

                      • ssivark 20 minutes ago

                        As one example of such a more general framework -- (variants of) belief propagation might be a good answer if dataset constraints could be cleanly formulated as distributions to be reasoned with.

                      • minutillo 5 hours ago
                        • dataflow 5 hours ago

                          Isn't that O(mn) worst-case run time?

                          • quuxplusone 4 hours ago

                            No, it's O(n) worst case. (The Wikipedia sidebar says "O(mn)," but that's apparently for a maimed version of the algorithm without a key part they're calling "the Galil rule." That's a special usage of the phrase "worst case"! In the absolute worst case, your implementation could have a bug and never terminate at all!)

                            Anyway, the point is that it's O(n/m) in the usual case. Which remains technically linear, not sub-linear; but at least it's O(n) with a constant factor smaller than 1.

                            • dataflow 4 hours ago

                              > No, it's O(n) worst case. (The Wikipedia sidebar says "O(mn)," but that's apparently for a maimed version of the algorithm without a key part they're calling "the Galil rule.")

                              It's not "maimed", it's literally the original algorithm. And what the article was specifically analyzing. And exactly what the parent was citing. "No" here makes no sense, unless your goal was just to write "no" to someone on the internet.

                              > That's a special usage of the phrase "worst case"! In the absolute worst case, your implementation could have a bug and never terminate at all!

                              Wikipedia is describing that algorithm, not a different broken one. If your code is buggy then you're not implementing that algorithm, you're implementing a different one that happens to be buggy. It's completely absurd to suggest "the absolute worst case" of an algorithm could include that of a different algorithm. Whether the latter is correct or buggy.

                              > Anyway, the point is that it's O(n/m) in the usual case.

                              Sure, and the halting problem is O(1) in the best case.

                              > Which remains technically linear, not sub-linear

                              So it's neither an example of what the page was talking about (sublinear) nor an answer to my question (interesting sublinear).

                              > but at least it's O(n) with a constant factor smaller than 1.

                              If "usually only reads a fraction of the input" was what I was looking for, I would've realized String.indexOf(char) or Array.find(element) is an answer, and not needed to ask a question here.

                        • qyph 5 hours ago

                          https://en.wikipedia.org/wiki/AKS_primality_test though it's number theory, and concerned with numbers of size n, rather than lists of length n.

                          Also relevant: https://www.cs.yale.edu/homes/aspnes/pinewiki/Derandomizatio...

                          • dataflow 5 hours ago

                            > https://en.wikipedia.org/wiki/AKS_primality_test though it's number theory, and concerned with numbers of size n, rather than lists of length n.

                            They were talking about not reading a lot of the input, so that's not it.

                            • alok-g 3 hours ago

                              For that case, a better 'n' to use could be the number of digits in the number.

                              • Ar-Curunir 4 hours ago

                                AKS is not sublinear. It runs in poly(n) time, where n is the number of bits in the input (i.e. input size).

                              • gleenn 7 hours ago

                                Anything probabilistic? There are so many interesting fields where you can assume the distribution of a dataset, then take a sample of data and assert things about it with a high degree of confidence. All of modern AI is built on so much of this. All the deep neural nets are making grand assumptions about the shape and meaning of data (they literally assume convexity of the space) and they have clearly very interesting results despite the imprecision of the model. Anything dealing with finance is also dealing with a lack of data. So if you had a list of prices of a stock over time, you could probably start making assumptions exactly like that: the probability that it doubles over a short time is so low that you can subsample the data and have it still be super useful, especially when you have intractably large data.

                                • dataflow 7 hours ago

                                  >> deterministic exact

                                  > Anything probabilistic?

                                  Are you sure you're answering the same question I'm asking?

                                • BugsJustFindMe 4 hours ago

                                  Consider that you often need to decide when to stop looking at data before making a decision using what you've seen so far.

                                  https://en.wikipedia.org/wiki/Optimal_stopping

                                  • dataflow 4 hours ago

                                    Cool as that is, I don't think that's a "deterministic exact sublinear time algorithm".

                                    • BugsJustFindMe 4 hours ago

                                      You're probably right. Apologies. I think I misread the question initially.

                                  • z2210558 6 hours ago

                                    Assuming shuffled list: estimate of mean, estimate of cardinality etc etc

                                    • munchler 5 hours ago

                                      I think any sort of estimation is ruled out by the word “exact”.

                                    • tbrownaw 7 hours ago

                                      Public opinion polling is sub-linear in the size of the population.

                                      • dataflow 7 hours ago

                                        > Public opinion polling is sub-linear in the size of the population.

                                        How is public opinion polling deterministic and exact?

                                      • latency-guy2 6 hours ago

                                        I wouldn't call that example silly IMO.

                                        I'd consider all the varieties of B-tree to be real examples, which covers any DBMS. You can extend this out in any direction you want, like logging, for concrete examples.

                                        GIS/Mapping/computer vision has tons of algorithms and data structures that all needed to do better than linear time as well.

                                        Stream processing in general is another, but that ends up being probabilistic more often than not, so only a weak punt in that direction.

                                        If you expand the use case out to sublinear space as well, I'd argue for compression of all kinds.

                                      • shae 7 hours ago

                                        This sounds like a useful way to mix in statistics and get useful approximations. I'm reading one of the survey links and it's approximately eye opening.

                                        • dooglius 7 hours ago

                                          Often for problems taking integer input x, formal CS will define the input to be something like 1^x (the character '1' repeated x times, as opposed to binary) so that the time complexity is in terms of x. This class of problems seems amenable to sublinear time, since one only needs log(x) steps to determine x.
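
                                          In a random-access probe model (not a single-tape TM), that log(x) step count can be realized by exponential search over the unary string, assuming out-of-range probes read as '0'. An illustrative sketch:

```python
# Exponential (galloping) search: determine x given probe(i), which
# returns True iff position i of the input 1^x holds a '1'.
# Uses O(log x) probes -- sublinear in x, assuming random access.
def find_length(probe):
    if not probe(0):
        return 0
    hi = 1
    while probe(hi):        # double until we overshoot the run of 1s
        hi *= 2
    lo = hi // 2            # binary search: probe(lo) True, probe(hi) False
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if probe(mid):
            lo = mid
        else:
            hi = mid
    return lo + 1           # lo is the last '1' position, so length lo + 1
```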

                                          • LPisGood 2 hours ago

                                            Actually if you try to formally define sublinear time algorithms in this manner they all collapse to constant time algorithms.

                                            To see why, realize that to determine x, the TM needs to look at x many bits. If this algorithm only needed, say, log(x) many bits to produce output then all input values with more than log(x) bits are indistinguishable by this TM.

                                          • doormatt 5 hours ago

                                            So like HyperLogLog?

                                            • qyph 5 hours ago

                                              HyperLogLog analyses generally assume access to the full data stream, and so are O(n) at a minimum. Perhaps by running HyperLogLog on a sublinear sample of the dataset you'd get an algorithm in this class.

                                              • doormatt 5 hours ago

                                                That makes sense, thanks for explaining!

                                              • dataflow 4 hours ago

                                                HyperLogLog uses sublinear space, not sublinear time.