• vighneshiyer an hour ago

    This work from Google (original Nature paper: https://www.nature.com/articles/s41586-021-03544-w) has been credibly criticized by several researchers in the EDA CAD discipline. These papers are of interest:

    - A rebuttal by a researcher within Google, written while the "AlphaChip" work was going on ("Stronger Baselines for Evaluating Deep Reinforcement Learning in Chip Placement"): http://47.190.89.225/pub/education/MLcontra.pdf

    - The 2023 ISPD paper from a group at UCSD ("Assessment of Reinforcement Learning for Macro Placement"): https://vlsicad.ucsd.edu/Publications/Conferences/396/c396.p...

    - A paper from Igor Markov which critically evaluates the "AlphaChip" algorithm ("The False Dawn: Reevaluating Google's Reinforcement Learning for Chip Macro Placement"): https://arxiv.org/pdf/2306.09633

    In short, the Google authors did not fairly evaluate their RL macro placement algorithm against other SOTA algorithms: rather, they claim to perform better than a human at macro placement, which falls far short of what mixed-size placement algorithms are capable of today. The RL technique also requires significantly more compute than other algorithms, and it ultimately learns a surrogate function for placement iteration rather than any novel representation of the placement problem itself.

    In full disclosure, I am quite skeptical of their work and wrote a detailed post on my website: https://vighneshiyer.com/misc/ml-for-placement/

    • Workaccount2 44 minutes ago

      To be fair, some of these criticisms are a few years old. Normally that would be fair game, but the progress in AI has been breakneck. Criticisms of other AI tech from 2021 or 2022 are pretty dated today.

      • jeffbee 25 minutes ago

        It certainly looks like the criticism at the end of the rebuttal that DeepMind has abandoned their EDA efforts is a bit stale in this context.

      • s-macke an hour ago

        When I first read about AlphaChip yesterday, my first question was how it compares to other optimization algorithms such as genetic algorithms or simulated annealing. Thank you for confirming that my questions are valid.

        • gdiamos an hour ago

          Criticism is an important part of the scientific process.

          Whichever approach ends up winning is improved by careful evaluation and replication of results.

          • jeffbee an hour ago

            It seems like this is multiple parties pursuing distinct arguments. Is Google saying that this technique is applicable in the way that the rebuttals are saying it is not? When I read the paper and the update I did not feel as though Google claimed that it is general, that you can just rip it off and run it and get a win. They trained it to make TPUs, then they used it to make TPUs. The fact that it doesn't optimize whatever "ibm14" is seems beside the point.

          • hinkley 2 hours ago

            TSMC made a point of calling out that their latest generation of software for automating chip design has features that allow you to select logic designs for TDP over raw speed. I think that’s our answer to keep Dennard scaling alive in spirit if not in body. Speed of light is still going to matter, so physical proximity of communicating components will always matter, but I wonder how many wins this will represent versus avoiding thermal throttling.

            • ilaksh 2 hours ago

              How far are we from memory-based computing going from research into competitive products? I get the impression that we are already well past the point where it makes sense to invest very aggressively in scaling up experiments with things like memristors, because they are talking about how many new nuclear reactors they are going to need just for the AI datacenters.

              • HPsquared 2 hours ago

                And think of the embedded applications.

              • yeahwhatever10 3 hours ago

                Why do they keep saying "superhuman"? Algorithms are used for these tasks, humans aren't laying out trillions of transistors by hand.

                • fph 2 hours ago

                  My state-of-art bubblesort implementation is also superhuman at sorting numbers.
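                  (For the record, the whole "state-of-the-art implementation" fits in a dozen lines; a plain textbook bubble sort with early exit, purely illustrative:)

```python
def bubble_sort(xs):
    """Superhuman number sorting, state of the art circa the 1950s."""
    xs = list(xs)  # don't mutate the caller's list
    for n in range(len(xs), 1, -1):
        swapped = False
        for i in range(n - 1):
            if xs[i] > xs[i + 1]:  # adjacent pair out of order: swap
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
                swapped = True
        if not swapped:
            break  # no swaps in a full pass: already sorted, stop early
    return xs

print(bubble_sort([3, 1, 4, 1, 5]))  # → [1, 1, 3, 4, 5]
```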

                  • xanderlewis 2 hours ago

                    Nice. Do you offer API access for a monthly fee?

                    • int0x29 2 hours ago

                      I'll need seven 5-gigawatt datacenters in the middle of major urban areas, or we might lose the Bubble Sort race with the Chinese.

                      • dgacmu 34 minutes ago

                        Surely you'll be able to reduce this by getting TSMC to build new fabs to construct your new Bubble Sort Processors (BSPs).

                        • gattr an hour ago

                          Surely a 1.21-GW datacenter would suffice!

                          • therein an hour ago

                            Have we decided when we're deprecating it? I'm already cultivating another team in a remote location to work on a competing product that we'll fold into Google Cloud a month before deprecating this one.

                        • HPsquared 2 hours ago

                          Nice. Still true though! We are in the bubble sort era of AI.

                        • epistasis 3 hours ago

                          Google is good at many things, but perhaps their strongest skill is media positioning.

                          • jonas21 2 hours ago

                            I feel like they're particularly bad at this, especially compared to other large companies.

                            • pinewurst an hour ago

                              Familiarity breeds contempt. They've been pushing the Google==Superhuman thing since the Internet Boom with declining efficacy.

                            • lordswork an hour ago

                              The media hates Google.

                            • jeffbee 3 hours ago

                              This is floorplanning the blocks, not every feature. We are talking dozens to hundreds of blocks, not billions or trillions of gates and wires.

                              I assume that the human benchmark is a human using existing EDA tools, not a guy with a pocket protector and a roll of tape.

                            • jayd16 2 hours ago

                              "superhuman or comparable"

                              What nonsense! XD

                            • lordswork an hour ago

                              Some interesting context on this work: 2 researchers were bullied to the point of leaving Google for Anthropic by a senior researcher (who has now been terminated himself): https://www.wired.com/story/google-brain-ai-researcher-fired...

                              They must feel vindicated by their work turning out to be so fruitful now.

                              • cobrabyte an hour ago

                                I'd love a tool like this for PCB design/layout

                                • onjectic an hour ago

                                  First thing my mind went to as well. I'm sure this is already being worked on; I think it would be even more impactful than this.

                                • mirchiseth 2 hours ago

                                  I must be old because first thing I thought reading AlphaChip was why is deepmind talking about chips in DEC Alpha :-) https://en.wikipedia.org/wiki/DEC_Alpha.

                                  • sedatk 2 hours ago

                                    I first used Windows NT on a PC with a DEC Alpha AXP CPU.

                                    • mdtancsa 2 hours ago

                                      haha, same!

                                    • dreamcompiler 2 hours ago

                                      Looks like this is only about placement. I wonder if it can be applied to routing?

                                      • amelius 2 hours ago

                                        Exactly what I was thinking.

                                        Also: when is this coming to KiCad? :)

                                        PS: It would also be nice to apply a similar algorithm to graph drawing (e.g. trying to optimize for human readability instead of electrical performance).

                                      • pfisherman 2 hours ago

                                        Questions for those in the know about chip design. How are they measuring the quality of a chip design? Does the metric that Google is reporting make sense? Or is it just something to make themselves look good?

                                        Without knowing much, my guess is that "quality" of a chip design is multifaceted and heavily dependent on the use case. That is, the ideal chip for a data center would look very different from one for a mobile phone camera or an automobile.

                                        So, again: what does "better" mean in the context of this particular problem / task?

                                        • q3k 2 hours ago

                                          This is just floorplanning, which is a problem with fairly well defined quality metrics (max speed and chip area used).
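                                          For a concrete sense of one such metric: a standard proxy for placement quality is half-perimeter wirelength (HPWL), the sum over all nets of the half-perimeter of the bounding box around that net's pins. A minimal sketch, with made-up coordinates and netlist:

```python
# Half-perimeter wirelength (HPWL): for each net, take the bounding
# box of the cells it connects; sum the half-perimeters over all nets.
# Lower HPWL loosely correlates with shorter wires and better timing.
# The netlist and coordinates below are illustrative, not real data.

def hpwl(nets, positions):
    total = 0.0
    for net in nets:  # net = list of cell names it connects
        xs = [positions[c][0] for c in net]
        ys = [positions[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

positions = {"a": (0, 0), "b": (3, 4), "c": (1, 2)}
nets = [["a", "b"], ["a", "b", "c"]]
print(hpwl(nets, positions))  # → 14.0
```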

                                        • red75prime 2 hours ago

                                          I hope I'll still be alive when they announce AlephZero.

                                          • loandbehold 2 hours ago

                                            Every generation of chips is used to design next generation. That seems to be the root of exponential growth in Moore's law.

                                            • abc-1 an hour ago

                                              Why aren’t they using this technique to design better transformer architectures or completely novel machine learning architectures in general? Are plain or mostly plain transformers really peak? I find that hard to believe.

                                              • amelius 2 hours ago

                                                Can this be abstracted and generalized into a more generally applicable optimization method?

                                                • FrustratedMonky 31 minutes ago

                                                  So, AI designing its own chips. Now that is moving towards exponential growth. Like at the end of the movie "Colossus".

                                                  Forget LLMs. What DeepMind is doing seems more like how an AI will rule in the world: building real-world models and applying game logic like winning.

                                                  LLMs will just be the text/voice interface to what DeepMind is building.

                                                  • idunnoman1222 2 hours ago

                                                    So one other designer plus Google is using AlphaChip for their layouts? Not sure about that title; call me when AMD and Nvidia are using it.

                                                    • kayson an hour ago

                                                      I'm pretty sure Cadence and Synopsys have both released reinforcement-learning-based placement and floorplanning tools. How do they compare...?

                                                      • colesantiago 2 hours ago

                                                        A marvellous achievement from DeepMind, as usual. I'm quite surprised that Google acquired them at a significant discount, for $400M, when I would have expected something in the range of $20BN; but then again, DeepMind wasn't making any money back then.

                                                        • dharma1 an hour ago

                                                          it was very early. Probably one of their all-time best acquisitions, alongside YouTube.

                                                          Re: using RL and other types of AI assistance for chip design, Nvidia and others are doing this too.

                                                        • mikewarot 2 hours ago

                                                          I understand the achievement, but can't square it with my belief that uniform systolic arrays will prove to be the best general purpose compute engine for neural networks. Those are almost trivial to route, by nature.

                                                          • ilaksh 2 hours ago

                                                            Isn't this already the case for large portions of GPUs? Like, many of the blocks would be systolic arrays?

                                                            I think the next step is arrays of memory-based compute.

                                                            • mikewarot an hour ago

                                                              Imagine a bit level systolic array. Just a sea of LUTs, with latches to allow the magic of graph coloring to remove all timing concerns by clocking everything in 2 phases.

                                                              GPUs still treat memory as separate from compute, they just have wider bottlenecks than CPUs.
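                                                              A toy sketch of the two-phase idea (everything here is illustrative, not from any real tool): cells in a 1-D chain are 2-colored by index parity; each phase updates only one color, and an updating cell reads only latches of the other color, which are stable during that phase.

```python
# Toy bit-level "systolic" chain: each cell holds one latched bit.
# Two-phase clocking over a 2-coloring (even/odd indices) means a
# cell never reads a latch that is being written in the same phase,
# which is the sense in which coloring removes timing concerns.

def step_phase(latches, color, lut):
    """Update cells whose index parity == color from their left neighbor."""
    new = list(latches)
    for i in range(len(latches)):
        if i % 2 == color:
            left = latches[i - 1] if i > 0 else 0  # chain input tied to 0
            new[i] = lut(left)  # 1-input LUT for simplicity
    return new

def run_clocks(latches, cycles, lut=lambda b: b):
    """One full clock = odd-phase update, then even-phase update."""
    for _ in range(cycles):
        latches = step_phase(latches, 1, lut)
        latches = step_phase(latches, 0, lut)
    return latches

# With an identity LUT the chain acts as a shift register: a 1
# injected at cell 0 marches rightward, two cells per full clock.
print(run_clocks([1, 0, 0, 0, 0], 2))  # → [0, 0, 0, 1, 1]
```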