Comments Page - Computer scientists combine two 'beautiful' proof methods

« Back Computer scientists combine two 'beautiful' proof methodsquantamagazine.orgSubmitted by tambourine_man 9 months ago

ChadNauseam 9 months ago
I know cryptocurrencies have a lot of problems, but this is one place where it’s good they exist: they’ve funneled a ton of money into researching new cryptography. The article claims that this research was developed for StarkWare, which developed its own form of zero knowledge proofs they called Starks, developed a programming language in which cryptographic proofs can be generated showing that the output of the program was truly produced from the inputs[^1], and continued to do more research like this. While I’m not a cryptographer, I think this sort of thing is super cool and I’m glad there’s a lot of attention and effort going into it.
^[1] the cool thing about these proofs is that they can be verified in much less time than is required to run the program. IIRC, starks can be verified in log-polynomial time, and snarks can be verified in constant time. This allows cryptocurrency protocols to have one computer in the network generate a proof of the state of the chain given a new block, and all other are able to quickly check it. Without this, you have a ton of waste because every computer in the network must duplicate the work of computing the new chain state. (although some new waste is introduced by the fact that producing a proof is very slow, although this is another area that cryptocurrency-funded research has improved dramatically)
Another cool bit of cryptography research that I think was motivated by crypto was “verkle trees”, an improvement on merkle trees that uses vector commitments rather than your typical hash. A vector commitment is like a hash, but it hashes an array of items and you can check that the hash was produced from an array with a certain item at a certain index without knowing the other items in the array. If you use this to back a merkle tree, you can increase the branching factor a lot and end up with much smaller proofs that a particular item is in the tree. This is because the proof size for a merkle tree is b * log_b n (where b is the branching factor and n is the number of items in the tree). The proof size for a verkle tree is just log_b n, so you can increase b as much as you want without making the proofs bigger.
Full disclosure: I work for a blockchain company, but not one mentioned here. I have no equity or stake in crypto outside of the fact that if the price goes to 0 my company probably can no longer afford to employ me
- treyd 9 months ago
  The other cool thing about verkle trees is that they were mostly figured out by a high schooler for a science fair (?) project.
  They're not actually that complicated in principle, just nobody thought of it before and that's pretty cool.
  ChadNauseam 9 months ago
  Wow, that is cool. I'm actually annoyed that vitalik didn't credit him in his article about them. I wonder how many people think that the v stands for vitalk
  a1369209993 9 months ago
  > vitalik didn't credit him in his article
  You mean this article?
  https://vitalik.eth.limo/general/2021/06/18/verkle.html
  > Verkle trees are still a new idea; they were first introduced by John Kuszmaul in this paper from 2018[link to [0]],
  0: https://math.mit.edu/research/highschool/primes/materials/20...
  Zamiel_Snawley 9 months ago
  I want to highlight the fact that the quote from Vitalik’s article is the first sentence of the second paragraph. Credit is prominently given, not buried in any way.
  ChadNauseam 9 months ago
  Oh, I wonder if he edited it, I googled it to find what the GP was talking about and saw this which claimed he didn't: https://news.ycombinator.com/item?id=27557503
  olddustytrail 9 months ago
  From your link:
  vbuterin on June 19, 2021 | prev | next [–]
  Thanks! I actually wasn't aware of the intellectual history. I added a link to your paper at the start of my post.
- joshuamorton 9 months ago
  > The article claims that this research was developed for StarkWare,
  Does it? Starkware is mentioned only as the company run by some.guy who commented on how pivotal this work was.
  It doesn't say he was related, and I can't find any affiliation between the researchers (all uk based at Cambridge and Warwick) and the US based StarkWare.
  ChadNauseam 9 months ago
  Oh, maybe it was unrelated, I think I misread the article
- Ar-Curunir 9 months ago
  This research was not developed by Starkware. STARKs (and indeed many other SNARKs) use techniques like those in this work, but the authors are not related to Starkware in any way
- monero-xmr 9 months ago
  [dead]
ljlolel 9 months ago
I once took a graduate level course at Princeton and the whole course for months was just spent understanding step by step the PCP proof. It was beautiful and subtle and had layers of complexity built on itself in layers. It was very hard —I can’t say I got it fully. (Arora himself taught it)
Zero knowledge proofs are way simpler to understand but also brilliant. I applaud Quanta's effort to try to explain both these concepts.
I love to learn the limits of our knowledge and what is provable and how to exploit that for fun and profit.
ianandrich 9 months ago
I know this is useful for crypto, but I think think I'm actually more interested in what new modes of remote code running on untrusted platforms this enables.
- ChadNauseam 9 months ago
  That is exactly the reason it's useful for crypto, nodes need to verify the output of code running on other nodes without trusting them.
- Ar-Curunir 9 months ago
  This scheme in particular is not useful for cryptocurrencies; we already knew how to make efficient zkSNARKs with perfect zero-knowledge before this result. This paper is a beautiful work of theory.
- drhodes 9 months ago
  It does seem like the article touches on concerns relevant to homomorphic encryption. Maybe someone knows if there is a connection.
ziofill 9 months ago
A professor I know once gave me the most beautiful example of a zero knowledge proof: she said “Imagine I want to prove to you that I can count the leaves on a tree, but I don’t want to reveal how I do it. I start by telling you there are e.g. 756,912 of them. Then I close my eyes and I let you remove as many as you want. We can keep going until you are convinced, and you still won’t know how I did it.”
- firejake308 9 months ago
  Okay, but I wouldn't be convinced until I've counted up to 756,912 myself and seen that there are no more leaves left after that. Isn't it supposed to be less computationally intensive to verify the proof than to complete the original calculation?
  I tried learning about zero knowledge proofs during the crypto craze, but I never understood the basic idea of how it was supposed to work, much less the math involved. I'd appreciate an ELI5 from anyone who has one.
  jonjonsonjr 9 months ago
  The proof is that they can tell you the number of leaves (again) after you have secretly removed some. Since only you know the number you removed, if the difference between their counts matches your number, it is likely that they can indeed count the leaves on the tree. It is possible they have correctly guessed, which is why you repeat the challenge until you are convinced.
  zmgsabst 9 months ago
  Or they have a way to measure your removal, eg, they don’t count leaves but branches without leaves (in the hypothetical tree example).
- auggierose 9 months ago
  Sorry, that doesn't make any sense.
  Edit: From another comment, it starts do make sense. So you let them remove a number of leaves secretly, and then you tell them the new number again, and they can check if your answer makes sense because of the difference between the two numbers.
- ChadNauseam 9 months ago
  This is a nice example, but couldn't he add a constant to his count as long as he adds the same constant each time? Regardless, it's nice
plank 9 months ago
There seems to be a problem with the article: the example shown is with colouring the map. But showing two adjacent regions to have different colours proves nothing: they will always be different (if colouring is possible).
Perhaps if one shows two regions that do not share a border, and state that they are or are not the same colour...
- kevinwang 9 months ago
  I think the example is fine. Two adjacent regions will indeed always be different if the coloring is correct. But it may not be if the coloring is incorrect.
scotty79 9 months ago
> If someone finds a valid solution, they can easily convince a skeptical “verifier” that it really is valid. The verifier, in turn, will always be able to spot if there’s a mistake. Problems with this property belong to a class that researchers call NP.
How do you quickly verify that traveling salesman path is indeed the shortest one?
- JohnKemeny 9 months ago
  You have discovered that that problem indeed is not in NP, for the reason that it is not a decision problem. The decision problem is in NP, and there the problem is: given a graph and some value k, does there exist a TSP with cost at most k. You can see how that problem then becomes verifiable.
  JohnKemeny 9 months ago
  Verifying that a path/tour is shortest is in co-NP.
  Zamiel_Snawley 9 months ago
  I think this is a great distinction that deserves elaboration for those less familiar with the topic. Can you explain a bit further?
  ChadNauseam 9 months ago
  A decision problem is a problem with a yes or no answer. An NP problem is one where a "yes" answer can be verified quickly if they also give you some proof or "witness" of it. If I say "is there a route that goes through all these cities that's less than 500 miles long", and you answer yes, you can prove that such a route exists by showing me one of these and I can simply check how long the route is. So the problem is in NP because a "yes" answer can be proven or checked quickly.
  However, if you say "no, there is no such route", there's not obviously any way to quickly show that. Despite that, the problem is still in NP because to be in NP there only needs to be a quickly-checkable proof of a "yes" answer. If you want a quickly checkable proof of a "no" answer, you need a separate class of problem called co-np.
  A problem can also be in np and co-np at the same time, if both "yes" answers and "no" answers can have a proof that can be checked quickly.
  progval 9 months ago
  In this context we can divide problems into two cases: optimization problems ("find a solution that satisfies P, in a way that has the highest score") and decision problems ("is there a solution that satisfies P, with score >= X" or "find a solution that satisfies P, with score >= X").
  Complexity classes like NP are defined only for decision problems, not for optimization problems. NP can be defined either as
  * the set of decision problems where, given a solution, you can check it in polynomial time with a deterministic Turing machine, or * the set of decision problems solvable in polynomial type by a non-determistic Turing machine
  these two definitions being equivalent.
  When someone mentions an optimization problem being in P, NP, or NP-hard, they actually mean the associated decision problem being in P/NP/NP-hard.
  While technically incorrect, this is fine when working informally, because you can "translate" between the two in polynomial time.
  If I give you an oracle (ie. a magic box) that solves an optimization problem (ie. gives you a solution to P with the highest score) immediately, you can trivially write a polynomial time algorithm that solves the decision problem: call the oracle, check the score of its answer, then compare that with the X value you were given. And vice versa: if I give you an oracle that solves a decision problem immediately (ie. given a value X, gives you a solution satisfying P with score >= X), you can write a polynomial time algorithm that uses the oracle a few times with the right values of X (exponential search then bisection), to find the solution with the highest score.
  akoboldfrying 9 months ago
  Nitpick: Decision problems only have yes/no answers ("is there a solution that satisfies P, with score >= X"); problems that ask for actual solutions with a more complicated structure ("find a solution that satisfies P, with score >= X") are function problems, not decision problems.
  Interestingly, even though the solution to a decision problem only gives you 1 bit of information, for all NP problems I'm aware of, solving a polynomial number of problem instances is still enough to recover a full solution. For example, suppose we're looking for a maximum clique in a graph. First, binary search as you describe to find the size of the maximum clique. Call that size X. Then to find an actual clique of size X, repeat the following for each vertex v in the graph, in any order:
  1. Tentatively delete v and solve the problem "Is there now a clique of size >= X?".
  2. If the answer is yes, delete v permanently: we can ignore it from this point on since there is some X-clique that avoids it, and we already know from our initial binary search that that clique is best-possible.
  3. If the answer is no, v must belong to every X-clique in the graph. We can't do without it, so reinstate it in the graph.
  Afterwards, exactly X vertices will remain -- the vertices of some maximum clique in the original graph.