• sgt101 an hour ago

    I love that "system 1" and "system 2" are tossed out as fundamental concepts of mind when they're basically labels for some semi-random ideas based on bullshit studies in a pop sci fluff book. Is there any actual science that ties these ideas to mathematical reasoning?

    Then we have:

    >> By analogy, we can now state the main hypothesis proposed in this paper: a crucial component of the usefulness of a new proven theorem t (in the context of previous theorems T(S)) is how efficiently T(S) ∪ {t} compresses the set of all provable mathematical statements M. That is, T(S) ∪ {t} is a good compression of M if many provable statements, beyond those in S, can be derived from T(S) ∪ {t}, say using at most k derivation steps.

    which says "Occam's Razor, that's good, isn't it?" in more words.
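
    Spelled out, the criterion amounts to something like the sketch below (a toy reading of mine, where statements are ints and a "derivation step" is just adding two known statements, standing in for real inference rules):

        # Toy model of the paper's compression criterion, as I read it.
        # Everything here is hypothetical: statements are ints and one
        # "derivation step" is adding two known statements, a stand-in
        # for real inference rules.

        def closure(known, steps):
            """All statements reachable from `known` in <= `steps` derivations."""
            reached = set(known)
            for _ in range(steps):
                reached |= {a + b for a in reached for b in reached}
            return reached

        def usefulness(t, theorems, provable_sample, k):
            """Count sampled provable statements within k steps of
            theorems | {t} that were out of reach of theorems alone."""
            before = closure(theorems, k) & provable_sample
            after = closure(theorems | {t}, k) & provable_sample
            return len(after - before)

        # e.g. usefulness(5, {1, 2}, set(range(50)), 2) -> how much adding 5 "unlocks"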

    • drdeca 34 minutes ago

      I don’t think that’s really what Occam’s razor says.

      If one’s goal is to use ML stuff to try to produce new useful theorems, it seems useful towards that goal to come up with a numeric heuristic for how useful a new theorem is. And, stating such a heuristic explicitly seems reasonable.

      The heuristic they state, which they expect to capture a crucial component of what makes a theorem useful to prove in a given context, isn't particularly surprising, but that doesn't make it not worth saying.

      To elaborate on "isn't particularly surprising": if you asked many other people to come up with heuristics for the usefulness of a new potential theorem, I imagine many of the answers you would get (counting with multiplicity when multiple people give identical answers) would be fairly similar.

      Even if a hypothesis is an obvious-to-consider hypothesis, if one wants to talk about it, it is worth stating what it is first.

      • CooCooCaCha 3 minutes ago

        > I love that "system 1" and "system 2" are tossed out as fundamental concepts of mind when they're basically labels for some semi-random ideas based on bullshit studies in a pop sci fluff book. Is there any actual science that ties these ideas to mathematical reasoning?

        I think this misses the mark on what people actually care about with regard to system 1 vs. system 2. It's really just: can we build models that can vary how long they think about a problem? Current AI models are very limited in this respect, and I think most would agree that being able to think about a problem for a period of time is a useful feature that we humans use all the time.
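
        Concretely, the sort of thing I mean is just a loop like this (a toy sketch, not how any current model actually works; `step` and `confidence` are hypothetical callables):

            # Toy sketch of variable-length "thinking": refine a state until
            # confident or out of budget. Purely illustrative; `step` and
            # `confidence` are hypothetical callables, not any real model's API.

            def think(problem, step, confidence, max_steps=100, threshold=0.95):
                state = problem
                for _ in range(max_steps):
                    if confidence(state) >= threshold:
                        break               # easy problems stop early
                    state = step(state)     # one more unit of deliberation
                return state                # hard problems burn more steps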

      • youoy 14 minutes ago

        I feel that here the authors think of mathematics as just the language and not what it talks about. We have known at least since Gödel that mathematics is more than the formal system. I even remember going to a talk by Gromov where he got "angry" at a question from the audience along the lines of "what if we suppose this extra thing and change this other thing in that theorem", and he said that it's stupid to think up new theorems for the sake of it, that the language of mathematics is like metaphors speaking about something else.

        In my experience learning math, their claim that "a desirable body of theorems better summarizes the set of all provable statements, for example by having a small description length" is not true. Short != better; better is what gets me to the correct intuitive idea about the mathematical statement faster. For example, I have several times understood the formal definitions and proofs of a theory, yet it was not until I formed the correct intuitions (maybe months later) that I truly understood it. And it is not until I have the correct intuitions that I can successfully apply the theory to create meaningful new theorems.

        Anyway, I understand that one has to start somewhere, and the article's point of view is more tractable and explicit.

        • marojejian 3 hours ago

          I found this to be an interesting read, though it's just a high-level plan, vs. breaking new ground. I like how the ideas are synthesized.

          A fun detail is to be found in a footnote: "A mathematician is a person who can find analogies between theorems; a better mathematician is one who can see analogies between proofs, and the best mathematician can notice analogies between theories. One can imagine that the ultimate mathematician is one who can see analogies between analogies." (attributed to Stefan Banach in Randrianantoanina and Randrianantoanina [2007]).

          Reading this, it seems to me that the "ultimate" mathematician is a poet.

          P.S. I initially assumed "Randrianantoanina and Randrianantoanina" was the title of some nerdy speculative fiction. It turns out this is a regular paper reference. The authors are Malagasy, no doubt. They have such cool names there.

          • ben_w 3 hours ago

            > P.S. I initially assumed "Randrianantoanina and Randrianantoanina" was the title of some nerdy speculative fiction. It turns out this is a regular paper reference. The authors are Malagasy, no doubt. They have such cool names there.

            Given what else Banach is famous for, I assumed a Banach–Tarski joke.

            • card_zero 2 hours ago

              > the "ultimate" mathematician is a poet.

              Approximately, but not actually, a member of the set of poets.

            • marojejian 3 hours ago

              Abstract. The current state-of-the-art in artificial intelligence is impressive, especially in terms of mastery of language, but not so much in terms of mathematical reasoning. What could be missing? Can we learn something useful about that gap from how the brains of mathematicians go about their craft? This essay builds on the idea that current deep learning mostly succeeds at system 1 abilities – which correspond to our intuition and habitual behaviors – but still lacks something important regarding system 2 abilities – which include reasoning and robust uncertainty estimation. It takes an information-theoretical posture to ask questions about what constitutes an interesting mathematical statement, which could guide future work in crafting an AI mathematician. The focus is not on proving a given theorem but on discovering new and interesting conjectures. The central hypothesis is that a desirable body of theorems better summarizes the set of all provable statements, for example by having a small description length while at the same time being close (in terms of number of derivation steps) to many provable statements.

              • thomasahle 2 hours ago

                A key aspect of AlphaProof was how to find good training problems: they managed to scrape 1 million "International Mathematical Olympiad (IMO) style" problems.

                Here Bengio seems to just suggest using a theorem's estimated surprisal `−log P(t | T(S))` (in expectation, the entropy `E_{t ∼ P(t | T(S))}[−log P(t | T(S))]`) to find interesting/surprising theorems to try to prove.
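
                I.e., something like the following (a toy sketch; `log_prob` is a stand-in for whatever model scores a statement given the current theorems, not a real API):

                    # Toy sketch of the surprisal heuristic: rank candidate
                    # theorems by -log P(t | T(S)). `log_prob` is hypothetical,
                    # standing in for any model that scores a statement given
                    # the current theorem set.

                    def surprisal(t, theorems, log_prob):
                        return -log_prob(t, theorems)

                    def most_surprising(candidates, theorems, log_prob, n=10):
                        """The n candidates the model finds most surprising."""
                        return sorted(candidates,
                                      key=lambda t: surprisal(t, theorems, log_prob),
                                      reverse=True)[:n]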

                This reminds me of the early work in estimating LLM uncertainty/hallucinations using perplexity. It didn't work very well.

                Actually generating good "exercises" for the system to "practice" on, and theorems to try to prove, still seems like the biggest roadblock for an AI mathematician. It connects to the philosophical question of why we do math in the first place.

                • ford 3 hours ago

                  Given the pace of AI lately, it might be worth mentioning that this is from March of this year.