GPU Puzzles (github.com)
Submitted by cgadski 5 days ago
  • srush 4 hours ago

    I made these a couple of years ago as a teaching exercise for https://minitorch.github.io/. At the time, the resources for doing anything on GPUs were pretty sparse and the NVIDIA docs were quite challenging.

    These days there are great resources for going deep on this topic. The CUDA-mode org is particularly great, both their video series and PMPP reading groups.

    • lins1909 8 minutes ago

      Thanks Sasha - this looks like a great resource. Just to be clear, would you recommend going through the newer resources instead of this one?

      Not sure if your comment is to discourage someone from going through this.

      • nextos 2 hours ago

        Slightly off-topic, but any chance you could update or re-upload the code for your https://github.com/harvardnlp/DeepLatentNLP tutorial? I found the NLP latent variable models discussed there really interesting, and the notebooks were excellent. However, these seem to be gone, and the only thing left is the slides?

        Alternatively, any other places that discuss the same topics, including some code? I could only find equivalent discussions with code in Pyro docs and Kevin Murphy's book, volume 2. But these are more sparse as they also cover many other topics.

        • srush 15 minutes ago

          I'll take a look. Yeah, Pyro is the best thing to do here. But it would be nice to revisit some of these implementations.

        • bytepoet 3 hours ago

          Thanks a lot, Sasha, for creating these. I found your LLM training puzzles to be excellent as well.

        • aleinin an hour ago

          I recently ported this to Metal for Apple Silicon computers. If you're interested in learning GPU programming on an M series Mac, I think this is a very accessible option. Thanks to Sasha for making this!

          https://github.com/abeleinin/Metal-Puzzles

          • throwaway314155 an hour ago

            Either puzzle 4 has a bug in it or I'm losing my mind. (Possible solution below, so don't read on if you want to go in fresh.)

                # FILL ME IN (roughly 2 lines)
                if local_i < size and local_j < size:
                    out[local_i][local_j] = a[local_i][local_j] + 10
            
            
            Results in a failed assertion:

                 AssertionError: Wrong number of indices
            
            
            But the test cell beneath it will still pass?
            • imjonse a minute ago

              maybe try out[local_i, local_j] ?
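
A minimal sketch of why the tuple form might matter here. It assumes the puzzle's arrays are wrapped in an object that checks how many indices it receives. `Table2D` and `try_chained` are hypothetical names made up for this illustration and are not the puzzle library's actual code. The point is that `out[i][j]` first evaluates `out[i]` with a single index, whereas `out[i, j]` passes both indices in one call.

```python
class Table2D:
    """Hypothetical stand-in for a 2-D puzzle array (not the real library class)."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)
        self.data = [[0] * cols for _ in range(rows)]

    def _check(self, index):
        # A bare integer such as out[i] arrives as a single index, not a tuple.
        if not isinstance(index, tuple):
            index = (index,)
        # Assumed behaviour: every access must supply one index per dimension.
        assert len(index) == len(self.shape), "Wrong number of indices"
        return index

    def __getitem__(self, index):
        i, j = self._check(index)
        return self.data[i][j]

    def __setitem__(self, index, value):
        i, j = self._check(index)
        self.data[i][j] = value


def try_chained(table):
    """Return the error message raised by chained indexing, or None if it works."""
    try:
        table[0][1] = 10  # table[0] is evaluated first, with only one index
    except AssertionError as err:
        return str(err)
    return None


out = Table2D(2, 2)
out[0, 1] = 10                    # a single __setitem__ call with the tuple (0, 1)
chained_error = try_chained(out)  # the chained form trips the index-count check
```

Plain NumPy arrays accept both spellings, which is why the chained form can look harmless outside a wrapper like this.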

            • 867-5309 35 minutes ago

              seems like an opportune moment to gift a plug for bitcoin puzzles, namely BTC32 / 1000 BTC Challenge[1]

              pools are in dire need of cuda developers

              [1]https://bitcointalk.org/index.php?topic=1306983.0

              • fifilura 2 hours ago

                I think this course is also relevant for some deeper context.

                https://gfxcourses.stanford.edu/cs149/fall23/lecture/datapar...

                • geekodour 10 minutes ago

                  all videos should be available on YT by end of month

                • ismailmaj 3 hours ago

                  It would be nice if the puzzles natively supported C++ CUDA.