One of the perennial questions about proof automation has been the utility of proofs that cannot be understood by humans.
Generally, most computer scientists using proof automation don't care about the proof itself -- they care that one exists. It can contain as many lemmas and steps as needed. They're unlikely to ever read it.
It seems to me that LLMs would be decent at generating proofs this way: so long as they submit their tactics to the proof checker and the proof checks out, they can generate whatever is needed.
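Concretely, the loop I have in mind is something like this. A minimal sketch in Python, where `llm_propose` and the `checker` interface are hypothetical stand-ins, not any real prover's API:

    def search_for_proof(goal, llm_propose, checker, max_steps=1000):
        # Generate-and-check: the model proposes tactics, the checker keeps it honest.
        state = checker.initial_state(goal)       # hypothetical checker interface
        tactics = []
        for _ in range(max_steps):
            if checker.is_complete(state):
                return tactics                     # any proof that checks will do
            candidate = llm_propose(state, tactics)
            next_state = checker.apply(state, candidate)
            if next_state is None:                 # checker rejects bad tactics
                continue                           # ask the model for another
            tactics.append(candidate)
            state = next_state
        return None                                # budget exhausted, no proof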
However, mathematicians (and I am not a member of that distinguished group) seem to appreciate qualities in proofs such as elegance and simplicity. Many mathematicians I've heard respond to the initial question believe that a proof generated by some future AI system will not be useful to humans if they cannot understand and appreciate it. The existence of a proof is not enough.
Now that we're getting close to having algorithms that can generate proofs, the question is becoming a bit more urgent, I think. What use is a proof that isn't elegant? Are proofs written for a particular audience, or are they written for the result?
Mathematician here (trained as pure, working as applied). Non-elegant proofs are useful if the result is important. E.g., people would still be excited by an ugly proof of the Riemann hypothesis.^1 A lot of other theorems depend on whether it's true or not. However, if the result is less central, you won't get a lot of interest.
Part of it is, I think, that "elegance" is flowery language that hides what mathematicians really want: not so much new proofs as new proof techniques and frameworks. An "elegant" proof can, with some modification, prove a lot more than its literal statement. That way, even if you don't care much about the specific result, you may still be interested because it can be altered to solve a problem you _were_ interested in.
1: It doesn't have to be as big of a deal as this.
Then again, even an 'elegant' proof can be surprisingly inflexible. I've recently been working through Apéry's proof that ζ(3) is irrational. It's so simple that even a clueless dabbler like me can understand all the details. Yet no one has been able to make his construction work directly for anything else (that hasn't already been proven irrational). C'est la vie, I suppose.
There was a post yesterday about a Quanta article: https://news.ycombinator.com/item?id=42644896.
The article explains that two mathematicians were able to place Apéry's proof that ζ(3) is irrational into a much wider (and hence more powerful) framework. I doubt that framework is as easy to understand as the original proof. But in the end something with wider applicability did come out of the proof.
"It doesn't have to be as big of a deal as this."
Agree. The truth of the four-colour theorem is good to know, although there is not yet any human-readable proof.
I feel like the four-color theorem's automated proof is much more 'human-readable' than the proofs done with automated theorem provers. Because with the four-color theorem, there is a human-readable proof that says "if this finite set of cases are all colorable, then all planar graphs are colorable". And then there is some rather concrete code that generates all the finite cases and finds a coloring for them. Every step in there makes sense and is fully understandable. The fact that the exhaustive checking wasn't done by hand doesn't mean it's hard to understand how the proof works, or what is 'actually going on'.
For a general theorem prover, reading the code doesn't explain anything insightful about why the theorem is true. For the 4 color theorem, the code that proved it actually gives insight in how the proof works.
One thing that many mathematicians today don’t think about is how deeply intertwined the field has historically been with theology. This goes back to the Pythagoreans at least.
That survives in the culture of mathematics where we continue to see a high regard for truth, beauty, and goodness. Which, incidentally, are directly related to logic, aesthetics, and ethics.
The value of truth in a proof is most obvious.
The value of aesthetics is harder to explain, but there's no denying that it is in fact observably valued by mathematicians.
As for ethics, remember that human morality is a proper subset thereof. Ethics concerns itself with what is good. It may feel like a stretch, but it's perfectly reasonable to say that for two equally true proofs of the same thing, the one that is more beautiful is also more good. Also, obviously, given two equally beautiful proofs, if only one is true then it is also more good.
I believe knowing a proof exists will bring us closer to elegant human proofs.
I wanted to justify this with the “Roger Bannister Effect”: the thought that we're held back psychologically by the idea of the impossible; it takes one person to do it, and then everyone can do it, freed from the mind trap. But further reading shows we were incrementally approaching what Roger Bannister did first (the 4-minute mile), and the pause before that record was likely not psychological but physical, owing to World War Two. [0] And this jibes with TFA, where Mr. Wolfram writes about a quarter of a century not yielding a human interpretation of his computer's output.
All I’m left with is my anecdotes. I had a math professor in college who assigned homework every class. Since it was his first time teaching, he came up with the questions live. I’d come to class red in the face after struggling with questions all night. Then the professor would sheepishly reveal some of his statements were false. That unknown sapped a lot of motivation. Dead ends felt more conclusive. Falsehood was an easy scapegoat.
[0] https://www.scienceofrunning.com/2017/05/the-roger-bannister...
I think there is something to this idea. There have been cases where person A was working on proving a result but struggled, then person B announced a proof of the result, and then person A was inspired to finish their proof. (Sadly, I don't remember the specifics.)
Hypothesis: an LLM capable of generating a correct proof in a formal language, not through random chance but through whatever passes for “reasoning,” should also be capable of describing the proof in a way meaningful to humans. Because LLMs have a limited context window and are trained on human behavior, they will generate solutions similar to what humans would generate.
We have already accepted some proofs we cannot fully understand, such as the proof of the four color theorem that used computational methods to explore a large solution space and demonstrate that no possible special-case combinations violate the theorem. But that was just one part of the proof.
I wonder what we know about proof space generally, and if we had an ASI that reasoned in a substantially different way than humans, what types of proofs it would be likely to generate. Do most proofs contain structural components that humans find pleasing? Do most devolve into convoluted case analyses? Is there a simplest form that a set of correct proofs could be reduced to?
It depends.
A proven conjecture is IMO better than an unproven one.
But usually, a new proof sheds new light or builds bridges between concepts that were previously unrelated.
And in that sense, a proof not understandable by humans is disappointing, because it doesn't really fulfill the need to understand why it's true.
I would imagine a proof has several "uses": 1) the proof's result is useful for some other result or proof, and 2) the proof uses a novel technique or novel maths, or links previously unlinked fields, and it's not the result itself that is useful but the technique developed, which can then be applied in other areas to produce other proofs or knowledge.
I suspect it is the latter that will suffer with automated proofs, if nobody comes to understand the techniques, or if the techniques are not really novel but just tedious.
This isn't a new conundrum. It was a very contentious question at the end of the 19th century, when French mathematicians clashed with German mathematicians. Poincaré is known for describing proofs as texts intended to convince other mathematicians that something is the case, whereas Hilbert believed that automation was the way to go (i.e., have a "proof machine": plug in the question, get the answer, and be done with it).
Temporarily, the Germans won.
Personally, I don't think that proofs that cannot be understood have no value. We rely on such proofs all the time in our day-to-day interpretation of the world around us, our ability to navigate it and anticipate it. I.e. there's some sort of an automated proof tool in our brains that takes the visual input, feeling of muscle tonus, feeling of the force exerted on our body etc. and then gives an answer as to whether we are able to take the next step, pick up a rock and so on.
But mathematicians also want proofs to be useful for explaining the nature of the thing in question. Because another thing we want to do with tasks like picking up rocks is make them more efficient, build inanimate systems that can pick up rocks, etc.
----
NB. I'm not entirely sure how much LLMs can contribute in this field. The first successes of AI were precisely in the field of automated proofs, and that's where symbolic AI seems to work great. But I'm not at all an expert on LLMs. Maybe there's some way I can't think of in which they would be better at this task, but on the face of it they just aren't.
From what I have heard when talking to the people behind formal analysis of protocol security, the main problem currently with using LLMs to 'interact with the theorem prover for you' is that there are nowhere near enough proofs out there for the LLMs to learn how to generalize from them.
I think mathematicians want something more basic, though elegance and simplicity are appreciated. They want to know why something is true, in a way that they can understand. People will write new proofs of existing results if they think they get at the "why" better, or even collect multiple proofs of the same result if each gets at the "why" in a different way.
Although Wolfram doesn't mention it by name, this is closely related to what he is trying to do: https://en.wikipedia.org/wiki/Reverse_mathematics
Perhaps using Gröbner bases for formal proofs [1], [2] could be similar to what appears here: during the proof the length of the terms (or polynomials) can grow, and then at the end there is a simplification and you obtain a short Gröbner basis (short axioms); a small sketch follows below.
A simple question: since • is NAND, the theorem ((a•b)•c)•(a•((a•c)•a))=c can be proved trivially by computing the value of both sides for the 8 assignments of a,b,c. Also, there are 2^(2^2)=16 logic functions of two variables, so it is trivial to check which of them satisfy the theorem (NAND and its dual NOR, as it turns out). Perhaps the difficulty is proving the theorem using rules? There must be something that I don't see (I just took a glimpse for a few seconds).
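For instance, the brute-force check over the 8 assignments is a few lines of Python:

    from itertools import product

    nand = lambda x, y: 1 - (x & y)

    # Wolfram's axiom, with the dot read as NAND:
    for a, b, c in product((0, 1), repeat=3):
        lhs = nand(nand(nand(a, b), c), nand(a, nand(nand(a, c), a)))
        assert lhs == c
    print("holds for all 8 assignments")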
Automatic formal proofs can be useful when you are able to test them.
[1] https://doi.org/10.1016/j.jpaa.2008.11.043
[2] https://ceur-ws.org/Vol-3455/short4.pdf
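And the promised Gröbner sketch: the same check phrased algebraically with sympy (assuming its groebner and reduce functions accept the modulus=2 option as documented). Encode NAND as the polynomial 1 + xy over GF(2) and reduce modulo the field equations x^2 = x:

    from sympy import symbols, groebner, expand

    a, b, c = symbols('a b c')
    nand = lambda x, y: 1 + x*y    # NAND as a polynomial over GF(2)

    lhs = nand(nand(nand(a, b), c), nand(a, nand(nand(a, c), a)))

    # The field equations x**2 - x already form a Groebner basis; the axiom
    # holds on {0,1} iff lhs - c reduces to 0 modulo that ideal.
    G = groebner([a**2 - a, b**2 - b, c**2 - c], a, b, c, modulus=2)
    quotients, remainder = G.reduce(expand(lhs - c))
    print(remainder)    # expect 0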
The thing you're missing is that at no point is it assumed that there are exactly two elements in a boolean algebra. In fact you can have a boolean algebra with four elements (see https://en.wikipedia.org/wiki/Boolean_algebra_(structure)).
It seems the author is using the word "logic", so "logic boolean algebra" suggests the classical case. Perhaps what is not trivial is that one can use that rule to deduce the other axioms. So it is not the theorem itself that is important, but that one can prove any tautology from that single simple axiom.
What is this central dot? I thought a central dot in boolean logic means logical AND, but then the axiom is clearly false... I don't get what this is about.
As others have already said, think of it as NAND, although in traditional logic this is typically called the "Sheffer stroke".
In "The proof as we know" section he states that the dot is a NAND operation
Quote: "the · dot here can be thought of as representing the Nand operation"
The dot is not simply NAND or NOR.
Search for "What Actually Is the “·”?" for the answer, it's quite complex and fascinating.
The source uses ○, not •, for the NAND operation.
> What is this central dot?
Yeah, I wish he had started by defining that. The article is hard to understand without it.
Search for "Is There a Better Notation?" in the article, it seems "." is NAND
Technically, his axiom is the definition for what the operator is. Any set together with an operator "•" that satisfies this law is a boolean algebra. Binary logic where •=NAND is one such example because it satisfies the axiom.
> as we’ll discuss, it’s a basic metamathematical fact that out of all possible theorems almost none have short proofs
Where in the article is this discussed?
For any possible definition of "short", there are only finitely many (and typically few) theorems that have a short proof, while there are infinitely many theorems (not all of them interesting).
In more detail: proofs are nothing more than strings, and checking the validity of a proof can be done mechanically (and efficiently), so we can just enumerate all valid proofs up to length N and read off their conclusions.
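As a sketch, with a stub standing in for the real checker (both the alphabet and is_valid_proof are placeholders, not any actual proof system):

    from itertools import product

    ALPHABET = "pq()->,;"    # toy proof alphabet; any fixed finite set works

    def is_valid_proof(s):
        # Stub: a real system would mechanically check the string here.
        return False

    def theorems_with_proofs_up_to(max_len):
        found = set()
        for n in range(1, max_len + 1):
            for chars in product(ALPHABET, repeat=n):
                s = "".join(chars)
                if is_valid_proof(s):
                    found.add(s)    # a real checker would extract the conclusion
        return found

    # Only sum(len(ALPHABET)**n for n = 1..N) candidate strings exist, so at
    # most that many theorems can have a proof of length <= N:
    print(sum(len(ALPHABET) ** n for n in range(1, 11)))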
I wonder if he’s familiar with Peirce’s alpha existential graphs. They are a complete propositional logic with a single axiom and, depending how you count them, 3-6 inference rules. They use only negation and conjunction.
They also permit shockingly short proofs compared to the customary notation. Which, incidentally, was also devised by Peirce: Peano freely acknowledged that all he did was change some of the symbols to avoid confusion (Peirce used capital sigma and pi for existential and universal quantification).
Can you share a good reference to Peirce's work on existential graphs? Also, can you share references on how Peano relates to Peirce's work?
I loved Peirce's essays, but have not tried to read his work on logic or semiotics.
Is there a relation between the "single axiom for boolean algebra" that Wolfram claims to have discovered and the fact that you can express all boolean operations with just NAND?
I was sort of puzzled by the meaning of "axiom for boolean algebra" as well, and I looked into this more.
The way I learned boolean algebra was by associating certain operations (AND, NOT, OR, etc.) with truth tables. In this framework, proving a theorem of boolean algebra would just involve enumerating all possible truth assignments to the variables and checking that the equation holds.
There is another framework for boolean algebra that does not involve truth tables. This is the axiomatic approach [1]. It puts forth a set of axioms (e.g., "a OR b = b OR a"). The symbol "OR" is not imbued with any special meaning except that it satisfies the specified axioms. These axioms, taken as a whole, implicitly define each operator. It then becomes possible to prove what the truth tables of each operator must be.
One can ask how many axioms are needed to pin down the truth table for NAND. As you know, this is enough to characterize boolean algebra, since we can define all other operators in terms of NAND. It turns out only one axiom is needed. It is unclear to me whether this was first discovered by Wolfram or by the team of William McCune, Branden Fitelson, and Larry Wos. [2] (A brute-force check of the two-element case is sketched after the references.)
[1] https://en.wikipedia.org/wiki/Boolean_algebra_(structure)
[2] https://en.wikipedia.org/wiki/Minimal_axioms_for_Boolean_alg...
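Here is the two-element check. Note that it turns up NAND and its dual NOR; the two are isomorphic (swap 0 and 1), so the axiom pins the operator down up to that isomorphism:

    from itertools import product

    # Try all 2**(2**2) = 16 binary operations on {0,1} against Wolfram's
    # axiom ((a.b).c).(a.((a.c).a)) == c and print the tables that satisfy it.
    for table in product((0, 1), repeat=4):
        op = lambda x, y, t=table: t[2 * x + y]
        if all(op(op(op(a, b), c), op(a, op(op(a, c), a))) == c
               for a, b, c in product((0, 1), repeat=3)):
            print(table)    # (1, 1, 1, 0) is NAND, (1, 0, 0, 0) is NOR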
Thanks for the wonderful explanation brother!
Nice comment explaining the difference between the two viewpoints.
The former is set-theoretic, i.e., a set of objects (e.g., terms) and operations (e.g., and/or/not) defined on those objects. The latter is an algebraic specification, where a number of properties (expressed by logical formulas, which can be axioms or theorems) are expected to be satisfied by the operations.
Also see https://en.wikipedia.org/wiki/Minimal_axioms_for_Boolean_alg...
PS: I would take any claim by Stephen Wolfram as having been "the first" to discover anything with a boatload of salt.
The latter is a necessary, but not sufficient condition for the former.
> Is there a relation between the "single axiom for boolean algebra" that Wolfram claims to have discovered and the fact that you can express all boolean operations with just NAND?
Seems to be partially answered in the article.
Search for "Is There a Better Notation?"
In proofs in math, there is the old saying: "Elegance is directly proportional to what you can see in it and inversely proportional to the effort it takes to see it."
Formal proof is so important. Since currently maths is built on set theory, I wonder how the set theory axioms are written in some of the formal solvers.
> currently maths is built on set theory
Most mathematicians will eventually agree with that when pressed. However, almost none of them know the axioms of ZFC by heart (because they don't need them). You can swap out ZFC for something else and nobody will care very much, as long as "the same" theorems ("the same" is doing a lot of work here), which they know and use in their work, remain true.
This is what many theorem provers do: many aren't based on set theory, for example, but on type theory instead. (You can still do set theory in a framework based on type theory, and vice versa, but the foundations are "different".)
No need to wonder for long, just have a look.
Metamath: https://us.metamath.org/mpeuni/mmset.html#axioms
Isabelle/ZF: https://isabelle.in.tum.de/dist/library/FOL/ZF/ZF_Base.html
Check out mathlib for Lean; the pace of proofs being added here is breathtaking: https://github.com/leanprover-community/mathlib4
You can model types as sets and sets as types in many ways, so a number of the basic set theory axioms are pretty simple to express as lemmas from type axioms. IIRC you get constructive set theory easily, but you typically have to add extra axioms to get full ZFC.
As someone completely outside the academic world: is Stephen Wolfram actually working on meaningful things? I remember reading an article where he announced his Physics Project, which got a lot of backlash from the scientific community (at least I think that was the case).
Nonetheless, every time I browse through Wolfram Alpha I am thoroughly impressed by what it can do with only natural language as input.
That's weird. Why does Stephen Wolfram get some sort of special treatment here that nobody else seems to get, including people who are subject to much more common and intense criticism? Just the other day there were people bitching about Trudeau, for example. Or take Elon Musk, who is an asshat (IMHO, YMMV): is discussing that off-topic when it comes to how he manages Twitter?
The question "is Stephen Wolfram being taken seriously by the mathematics community?" seems relevant as a question to gauge whether one should spend time reading a very long article.
edit: an even more relevant analogy: is Mochizuki's big ego irrelevant in a discussion about whether his proof of the abc conjecture, which nobody understands and which he refuses to explain properly, is correct?
Because it's highly repetitive and uninteresting, especially in the context of an article that is not about Wolfram. The mod exhortations are a consequence of the repetitiveness rather than some special carveout. Other things that get that repetitive get similar appeals.
90% of content on HN is repetitive. "Unit tests suck", "managers forcing return to the office are assholes", "Fauci lied about Covid", ...
I've never seen a single one of these things get called out.
This is an article about a proof Stephen Wolfram claims to have discovered. Somebody further upthread already mentioned that this may not actually be true, or that the credit at least should be shared.
90% of everything is repetitive. The generic analogies method is not really that useful in discussing the specific thing you asked about.
> I've never seen a single one of these things get called out.
Happens all the time plus many repetitive things constantly get moderated away by users and moderators. The goal is a less boring messageboard, but it's aspirational rather than perfectly achievable.
Edit: Ok, fine, let's do analogies. Imagine Elon Musk started writing long, interesting, researched articles about some topic, say, Chinese history. If every such article got flooded with discussions about who does and does not like Elon Musk and by how much, those would start getting modbleatings too. This is what happened except it wasn't Elon Musk but Stephen Wolfram and it wasn't Chinese History and it started about a decade ago (as you can see from the comment dates).
Maybe the reason people keep calling out Stephen Wolfram (and have been doing so for a decade) is that he has a history of making grandiose claims that do not always withstand closer scrutiny?
I'm not claiming that Stephen Wolfram is an idiot or that nothing he writes is of value, but his ego clearly does influence his judgement so it makes no sense to me to claim that it is off-topic (in the way that, say, his love life or his opinion on abortion would be).
Also this:
> Happens all the time plus many repetitive things constantly get moderated away by users and moderators. The goal is a less boring messageboard, but it's aspirational rather than perfectly achievable.
No, sorry. I've been on HN during the pandemic and I was this close to quitting it forever. These things do not get called out in any consistent fashion for many topics. I'm pretty sure a message board for people interested in technology can handle a couple of "Wolfram sucks" comments a year better than literally hundreds of angry rants about a health advisor who isn't even relevant to anyone outside the US.
> Maybe the reason people keep calling out
It's not about the reason, it's about the repetitiveness. The boringness of the repetitiveness is one of the organizing principles of HN. Also, people don't actually 'keep calling out' this stuff: these moderator interventions, specifically in the Wolfram case, have been surprisingly effective. Many Wolfram-y articles have neither moderation interventions nor bad tangents.
> These things do not get called out in any consistent fashion for many topics.
You don't see everything, neither do other forum participants, including the moderators. The fact that, collectively, some boring and repetitive stuff is missed doesn't imply we can all use more boring, repetitive stuff.
> You don't see everything, neither do other forum participants, including the moderators.
The stuff that I mentioned was really hard to miss.
The question OP asked was, essentially, "is Stephen Wolfram actively contributing to the mathematical community?". If you really think that this is off-topic then... I don't know what to say.
Wolfram is essentially the Fauci of physics.
No. Even though Mathematica is arguably meaningful: it's very popular with some scientists, and in many respects it's a respectable piece of software.
A few years ago Sean Carroll hosted him on his podcast. It was a bit surprising to me, because Sean would never give a crackpot the time of day, and Wolfram is borderline crackpot territory IMHO, but not quite. He hasn't published anything meaningful in a scientific journal in a long time, as far as I know.
I am neither a mathematician nor a scientist, so I'm unqualified to judge Wolfram's current theories* of physics and computation. But my impression is that he remains quite rigorous in his work, even if the path he's walking is far from the main one. And of course he's quite bombastic, which always seems to raise hackles. In fact, the article this discussion hangs on is a great example of both.
* Ref: https://www.wolframphysics.org/technical-introduction/ The podcast with Sean Carroll that parent mentioned is also a surprisingly accessible lay introduction, definitely worth a listen.
Define "meaningful".
A lot of mathematics is about exploration. You look at some system and try to figure out how it works. If you have a good idea (conjecture), you try proving or disproving it. The goal is to gain some understanding.
Once in a while, it turns out that exploration hits the gold ore in the mine: you get something that applies to solving some problem we've had for years. As an example, algebra was considered meaningless for years; then cryptography came along.
There are other good examples: Reed-Solomon coding is built on finite fields.
The problem is we don't really know when we'll strike gold in many cases. So people go exploring. It also involves humans, and they can disagree on a lot of things. Stephen seems to run into disagreements more often than the average researcher, at least historically.
They've got a YouTube channel with pretty regular updates. IIRC, during the lockdowns he'd do live coding and stuff.
He is trying to do an alternative theory of science.
It's like rewriting everything in Klingon (and clarifying things here and there, putting in new analogies, etc., not just translating), inventing new Klingon words as he goes.
Sure, this is "interesting" in an academic sense. Good luck finding a journal that accepts a paper written in Klingon.
Here's a meta question about this article: let's try to estimate how many people on earth, say within the next 5 years, will ever read the entire article in all of its gory details.
These days, an LLM will, perhaps.
And make it palatable to puny humans.
Or maybe it will fail to recall all of it and make something up, because it has 300B parameters, not infinitely many.
We currently pull out our phones at the pub/table to check whether something someone makes up is legitimate. Now we've invented the technology to be that thing that creates half-truths from what it has absorbed.
I probably could, but I'm 99% sure it's thinly veiled self-promotion, as is usual from him.
Just train the LLM search engine to tell people it contains the answers to all their questions.