I find it a little annoying that the paper[0] shows various graphs of megabytes saved, but, as far as I can tell, no actual sizes of the binaries these policies are applied to.
So when they say the inlining policies end up saving 20 MiB on the training data, and then only a few megabytes on a different binary not in the training data, I lack the context to really judge what that means. Is the other binary much smaller? The same size? What if it's bigger and therefore hides a much smaller relative saving?
Only at the very end of the paper do they mention one binary size: they save about 3 MB on the Chrome on Android binary, which is 213.32 MB after applying the policy. A solid 1%, which probably makes an enormous difference at Google scale, especially for their main Android browser, so I hope it's obvious that I'm not trying to diminish the achievement of these people. But I find the other benchmarks hard to interpret.
They mention an overall improvement of 1%:

> After seven iterations of our algorithm we find a size reduction of approximately 1% compared to the evolutionary strategy baseline.

See the paper for more detailed results.
Yes, which is also the conclusion that I cited from the paper. My issue is with the other benchmarks described in the paper.
Someone once said the most fruitful research in AI is making models scale to larger compute/data.
I think the same could become true for compilers, and I think equality saturation is the key. AI + equality saturation could scale the optimization of a single program to an entire data center.
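For anyone who hasn't run into it: equality saturation applies all rewrite rules at once into an e-graph, keeping every equivalent form of the program, and only at the end extracts the best version under some cost model, which sidesteps the phase-ordering problem. Here's a minimal sketch using the Rust `egg` crate (the toy rules and the AstSize cost function are illustrative choices of mine, not anything from the paper):

```rust
// Cargo.toml: egg = "0.9"  (assumed version)
use egg::{rewrite as rw, *};

fn main() {
    // A few toy rewrite rules over egg's generic SymbolLang.
    let rules: &[Rewrite<SymbolLang, ()>] = &[
        rw!("commute-mul"; "(* ?a ?b)" => "(* ?b ?a)"),
        rw!("mul-two";     "(* ?a 2)"  => "(<< ?a 1)"),
        // Sound only for non-zero ?b; fine for a sketch.
        rw!("cancel-div";  "(/ (* ?a ?b) ?b)" => "?a"),
    ];

    // Saturation keeps *all* equivalent forms of the start expression
    // in one e-graph instead of committing to a rewrite order.
    let start: RecExpr<SymbolLang> = "(/ (* a 2) 2)".parse().unwrap();
    let runner = Runner::default().with_expr(&start).run(rules);

    // Extract the cheapest equivalent program; AstSize is a stand-in for a
    // real cost model (binary size, latency, a learned model, ...).
    let extractor = Extractor::new(&runner.egraph, AstSize);
    let (cost, best) = extractor.find_best(runner.roots[0]);
    println!("best (cost {cost}): {best}"); // prints something like: best (cost 1): a
}
```

The part that makes this interesting for "AI + compilers" is that exploration (building the e-graph) is cleanly separated from selection (the cost model used for extraction), so a learned cost model can be swapped in without touching the rewrite rules.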
I feel there is only so much you can squeeze out of a compiler on its own.
IMHO we need to accept profile-guided compilation or dynamic JIT compilation to keep making progress on performance.
EDIT: of course there is still lots of low-hanging fruit in exploiting specific instruction sets or memory architectures.
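To make that instruction-set point concrete, here is a small sketch (my own example, nothing to do with the paper) of the kind of target-specific dispatch that still tends to be done by hand: runtime CPU feature detection plus an AVX2-enabled code path in Rust, following the pattern from the std::arch documentation.

```rust
// Dispatch between a portable path and an AVX2-specific path at runtime.
fn sum(xs: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: we just checked that AVX2 is available on this CPU.
            return unsafe { sum_avx2(xs) };
        }
    }
    xs.iter().sum() // portable fallback
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(xs: &[f32]) -> f32 {
    // The attribute lets the compiler use AVX2 when generating this function,
    // even if the rest of the binary targets baseline x86-64.
    xs.iter().sum()
}

fn main() {
    let v: Vec<f32> = (0..1024).map(|i| i as f32).collect();
    println!("{}", sum(&v));
}
```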
Happy to collaborate on something along these lines.