• ChadNauseam 21 hours ago

    I know cryptocurrencies have a lot of problems, but this is one place where it’s good they exist: they’ve funneled a ton of money into researching new cryptography. The article claims that this research was developed for StarkWare, which developed its own form of zero knowledge proofs they called Starks, developed a programming language in which cryptographic proofs can be generated showing that the output of the program was truly produced from the inputs[^1], and continued to do more research like this. While I’m not a cryptographer, I think this sort of thing is super cool and I’m glad there’s a lot of attention and effort going into it.

    ^[1] the cool thing about these proofs is that they can be verified in much less time than is required to run the program. IIRC, starks can be verified in log-polynomial time, and snarks can be verified in constant time. This allows cryptocurrency protocols to have one computer in the network generate a proof of the state of the chain given a new block, and all other are able to quickly check it. Without this, you have a ton of waste because every computer in the network must duplicate the work of computing the new chain state. (although some new waste is introduced by the fact that producing a proof is very slow, although this is another area that cryptocurrency-funded research has improved dramatically)

    Another cool bit of cryptography research that I think was motivated by crypto was “verkle trees”, an improvement on merkle trees that uses vector commitments rather than your typical hash. A vector commitment is like a hash, but it hashes an array of items and you can check that the hash was produced from an array with a certain item at a certain index without knowing the other items in the array. If you use this to back a merkle tree, you can increase the branching factor a lot and end up with much smaller proofs that a particular item is in the tree. This is because the proof size for a merkle tree is b * log_b n (where b is the branching factor and n is the number of items in the tree). The proof size for a verkle tree is just log_b n, so you can increase b as much as you want without making the proofs bigger.

    Full disclosure: I work for a blockchain company, but not one mentioned here. I have no equity or stake in crypto outside of the fact that if the price goes to 0 my company probably can no longer afford to employ me

    • treyd 18 hours ago

      The other cool thing about verkle trees is that they were mostly figured out by a high schooler for a science fair (?) project.

      They're not actually that complicated in principle, just nobody thought of it before and that's pretty cool.

      • ChadNauseam 17 hours ago

        Wow, that is cool. I'm actually annoyed that vitalik didn't credit him in his article about them. I wonder how many people think that the v stands for vitalk

        • a1369209993 16 hours ago

          > vitalik didn't credit him in his article

          You mean this article?

          https://vitalik.eth.limo/general/2021/06/18/verkle.html

          > Verkle trees are still a new idea; they were first introduced by John Kuszmaul in this paper from 2018[link to [0]],

          0: https://math.mit.edu/research/highschool/primes/materials/20...

          • Zamiel_Snawley 14 hours ago

            I want to highlight the fact that the quote from Vitalik’s article is the first sentence of the second paragraph. Credit is prominently given, not buried in any way.

            • ChadNauseam 10 hours ago

              Oh, I wonder if he edited it, I googled it to find what the GP was talking about and saw this which claimed he didn't: https://news.ycombinator.com/item?id=27557503

              • olddustytrail 2 hours ago

                From your link:

                vbuterin on June 19, 2021 | prev | next [–]

                Thanks! I actually wasn't aware of the intellectual history. I added a link to your paper at the start of my post.

        • Ar-Curunir 2 hours ago

          This research was not developed by Starkware. STARKs (and indeed many other SNARKs) use techniques like those in this work, but the authors are not related to Starkware in any way

          • joshuamorton 21 hours ago

            > The article claims that this research was developed for StarkWare,

            Does it? Starkware is mentioned only as the company run by some.guy who commented on how pivotal this work was.

            It doesn't say he was related, and I can't find any affiliation between the researchers (all uk based at Cambridge and Warwick) and the US based StarkWare.

            • ChadNauseam 17 hours ago

              Oh, maybe it was unrelated, I think I misread the article

          • ljlolel 3 hours ago

            I once took a graduate level course at Princeton and the whole course for months was just spent understanding step by step the PCP proof. It was beautiful and subtle and had layers of complexity built on itself in layers. It was very hard —I can’t say I got it fully. (Arora himself taught it)

            Zero knowledge proofs are way simpler to understand but also brilliant. I applaud Quanta's effort to try to explain both these concepts.

            I love to learn the limits of our knowledge and what is provable and how to exploit that for fun and profit.

            • plank 4 hours ago

              There seems to be a problem with the article: the example shown is with colouring the map. But showing two adjacent regions to have different colours proves nothing: they will always be different (if colouring is possible).

              Perhaps if one shows two regions that do not share a border, and state that they are or are not the same colour...

              • kevinwang 2 hours ago

                I think the example is fine. Two adjacent regions will indeed always be different if the coloring is correct. But it may not be if the coloring is incorrect.

              • ianandrich 20 hours ago

                I know this is useful for crypto, but I think think I'm actually more interested in what new modes of remote code running on untrusted platforms this enables.

                • ChadNauseam 17 hours ago

                  That is exactly the reason it's useful for crypto, nodes need to verify the output of code running on other nodes without trusting them.

                  • drhodes 18 hours ago

                    It does seem like the article touches on concerns relevant to homomorphic encryption. Maybe someone knows if there is a connection.

                    • Ar-Curunir 2 hours ago

                      This scheme in particular is not useful for cryptocurrencies; we already knew how to make efficient zkSNARKs with perfect zero-knowledge before this result. This paper is a beautiful work of theory.

                    • ziofill 13 hours ago

                      A professor I know once gave me the most beautiful example of a zero knowledge proof: she said “Imagine I want to prove to you that I can count the leaves on a tree, but I don’t want to reveal how I do it. I start by telling you there are e.g. 756,912 of them. Then I close my eyes and I let you remove as many as you want. We can keep going until you are convinced, and you still won’t know how I did it.”

                      • firejake308 13 hours ago

                        Okay, but I wouldn't be convinced until I've counted up to 756,912 myself and seen that there are no more leaves left after that. Isn't it supposed to be less computationally intensive to verify the proof than to complete the original calculation?

                        I tried learning about zero knowledge proofs during the crypto craze, but I never understood the basic idea of how it was supposed to work, much less the math involved. I'd appreciate an ELI5 from anyone who has one.

                        • jonjonsonjr 12 hours ago

                          The proof is that they can tell you the number of leaves (again) after you have secretly removed some. Since only you know the number you removed, if the difference between their counts matches your number, it is likely that they can indeed count the leaves on the tree. It is possible they have correctly guessed, which is why you repeat the challenge until you are convinced.

                          • zmgsabst 10 hours ago

                            Or they have a way to measure your removal, eg, they don’t count leaves but branches without leaves (in the hypothetical tree example).

                        • auggierose 8 hours ago

                          Sorry, that doesn't make any sense.

                          Edit: From another comment, it starts do make sense. So you let them remove a number of leaves secretly, and then you tell them the new number again, and they can check if your answer makes sense because of the difference between the two numbers.

                          • ChadNauseam 10 hours ago

                            This is a nice example, but couldn't he add a constant to his count as long as he adds the same constant each time? Regardless, it's nice

                          • scotty79 18 hours ago

                            > If someone finds a valid solution, they can easily convince a skeptical “verifier” that it really is valid. The verifier, in turn, will always be able to spot if there’s a mistake. Problems with this property belong to a class that researchers call NP.

                            How do you quickly verify that traveling salesman path is indeed the shortest one?

                            • JohnKemeny 18 hours ago

                              You have discovered that that problem indeed is not in NP, for the reason that it is not a decision problem. The decision problem is in NP, and there the problem is: given a graph and some value k, does there exist a TSP with cost at most k. You can see how that problem then becomes verifiable.

                              • Zamiel_Snawley 13 hours ago

                                I think this is a great distinction that deserves elaboration for those less familiar with the topic. Can you explain a bit further?

                                • ChadNauseam 10 hours ago

                                  A decision problem is a problem with a yes or no answer. An NP problem is one where a "yes" answer can be verified quickly if they also give you some proof or "witness" of it. If I say "is there a route that goes through all these cities that's less than 500 miles long", and you answer yes, you can prove that such a route exists by showing me one of these and I can simply check how long the route is. So the problem is in NP because a "yes" answer can be proven or checked quickly.

                                  However, if you say "no, there is no such route", there's not obviously any way to quickly show that. Despite that, the problem is still in NP because to be in NP there only needs to be a quickly-checkable proof of a "yes" answer. If you want a quickly checkable proof of a "no" answer, you need a separate class of problem called co-np.

                                  A problem can also be in np and co-np at the same time, if both "yes" answers and "no" answers can have a proof that can be checked quickly.

                                  • progval 9 hours ago

                                    In this context we can divide problems into two cases: optimization problems ("find a solution that satisfies P, in a way that has the highest score") and decision problems ("is there a solution that satisfies P, with score >= X" or "find a solution that satisfies P, with score >= X").

                                    Complexity classes like NP are defined only for decision problems, not for optimization problems. NP can be defined either as

                                    * the set of decision problems where, given a solution, you can check it in polynomial time with a deterministic Turing machine, or * the set of decision problems solvable in polynomial type by a non-determistic Turing machine

                                    these two definitions being equivalent.

                                    When someone mentions an optimization problem being in P, NP, or NP-hard, they actually mean the associated decision problem being in P/NP/NP-hard.

                                    While technically incorrect, this is fine when working informally, because you can "translate" between the two in polynomial time.

                                    If I give you an oracle (ie. a magic box) that solves an optimization problem (ie. gives you a solution to P with the highest score) immediately, you can trivially write a polynomial time algorithm that solves the decision problem: call the oracle, check the score of its answer, then compare that with the X value you were given. And vice versa: if I give you an oracle that solves a decision problem immediately (ie. given a value X, gives you a solution satisfying P with score >= X), you can write a polynomial time algorithm that uses the oracle a few times with the right values of X (exponential search then bisection), to find the solution with the highest score.

                                    • akoboldfrying 4 hours ago

                                      Nitpick: Decision problems only have yes/no answers ("is there a solution that satisfies P, with score >= X"); problems that ask for actual solutions with a more complicated structure ("find a solution that satisfies P, with score >= X") are function problems, not decision problems.

                                      Interestingly, even though the solution to a decision problem only gives you 1 bit of information, for all NP problems I'm aware of, solving a polynomial number of problem instances is still enough to recover a full solution. For example, suppose we're looking for a maximum clique in a graph. First, binary search as you describe to find the size of the maximum clique. Call that size X. Then to find an actual clique of size X, repeat the following for each vertex v in the graph, in any order:

                                      1. Tentatively delete v and solve the problem "Is there now a clique of size >= X?".

                                      2. If the answer is yes, delete v permanently: we can ignore it from this point on since there is some X-clique that avoids it, and we already know from our initial binary search that that clique is best-possible.

                                      3. If the answer is no, v must belong to every X-clique in the graph. We can't do without it, so reinstate it in the graph.

                                      Afterwards, exactly X vertices will remain -- the vertices of some maximum clique in the original graph.

                                  • JohnKemeny 18 hours ago

                                    Verifying that a path/tour is shortest is in co-NP.