• wesleyyue 9 months ago

    Interesting observations:

    * Llama 3.2 multimodal actually still ranks below Molmo from AI2, released this morning.

    * AI2D: 92.3 (Llama 3.2 90B) vs 96.3 (Molmo 72B)

    * Llama 3.2 1B and 3B are pruned from 3.1 8B, so no leapfrogging, unlike 3 -> 3.1.

    * Notably no code benchmarks. Deliberate exclusion of code data in distillation to maximize mobile on-device use cases?

    Was hoping there would be some interesting models I could add to https://double.bot, but there don't seem to be any improvements to frontier coding performance.

    • daemonologist 9 months ago

      On the second point, you're comparing MMMU-Pro (multimodal) to MMLU-Pro (text only). I don't think they published scores on MMLU-Pro for 3.2.

      (Edit: parent comment was corrected, thanks!)

      • wesleyyue 9 months ago

        Yep you're right, thanks for catching (sorry for the ninja edit!)

      • idiliv 9 months ago

        Where do you see the MMLU-Pro evaluation for Llama 3.2 90B? On the link I only see Llama 3.2 90B evaluated against multimodal benchmarks.

        • wesleyyue 9 months ago

          Ah you're right I totally misread that!

      • ChrisArchitect 9 months ago
        • jarbus 9 months ago

          I’m more excited about Llama Stack; I can’t wait for local models to be able to use tools in a standard way.
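
          A minimal sketch of the kind of thing I mean, assuming an OpenAI-compatible local endpoint (Ollama already exposes one at /v1); the model tag and the weather tool are placeholders, and Llama Stack's own client API may end up looking different:

              # Sketch only: standard tool calling against a local
              # OpenAI-compatible server. Model tag and tool are placeholders.
              from openai import OpenAI

              client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

              tools = [{
                  "type": "function",
                  "function": {
                      "name": "get_weather",  # hypothetical tool
                      "description": "Look up the current weather for a city",
                      "parameters": {
                          "type": "object",
                          "properties": {"city": {"type": "string"}},
                          "required": ["city"],
                      },
                  },
              }]

              resp = client.chat.completions.create(
                  model="llama3.2",  # placeholder model tag
                  messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
                  tools=tools,
              )
              print(resp.choices[0].message.tool_calls)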

          • fulladder 9 months ago

            When will it come to ollama? That's my preferred quantization platform.

            • artninja1988 9 months ago

              Are users in the EU not allowed to use it, as they recently threatened?

              • btdmaster 9 months ago

                They are not allowed indeed: https://github.com/meta-llama/llama-models/blob/main/models/...

                > With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models.

                Interesting though, since (some) EU law applies outside the EU anyway; I'm not sure how much lawyering went into the text.

                • hiAndrewQuinn 9 months ago

                  This would not apply to the smaller 1B and 3B models, though, if I'm reading this right, since they are text only, not multimodal. Is that correct?

                  • btdmaster 9 months ago

                    Yes, only the multimodal ones.

              • oriettaxx 9 months ago

                Do you have an idea how long it will take to have it available in ollama?

                • oriettaxx 9 months ago
                  • rahimnathwani 9 months ago

                    No multimodal yet :(

                    • Patrick_Devine 9 months ago

                      Soon! We're working on it, and it's almost there.

                      • rahimnathwani 9 months ago

                        It seems like the weights for Llama3.2-11B-Vision-Instruct are about 20GB. Will ollama run that on an M1 Mac with 32GB RAM? Will the ollama model library have quantized models?

                        • Patrick_Devine 9 months ago

                          I think it should run fine. Yes, there will be quantized versions.
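
                          As a rough sanity check on the memory question (weights only; ignoring the KV cache, the vision encoder's activations, and everything else resident in RAM), quick back-of-envelope math, taking ~11B parameters as the count:

                              # Approximate weight sizes for an ~11B-parameter model.
                              # Real GGUF quants carry some per-block overhead, so files
                              # end up slightly larger than the pure bit-width suggests.
                              params = 11e9
                              sizes = {"fp16": 2.0, "8-bit": 1.0, "4-bit": 0.5}  # bytes per parameter
                              for name, bytes_per_param in sizes.items():
                                  gb = params * bytes_per_param / 1024**3
                                  print(f"{name}: ~{gb:.1f} GB")
                              # fp16 ~20.5 GB, 8-bit ~10.2 GB, 4-bit ~5.1 GB,
                              # so a quantized build leaves plenty of headroom in 32GB.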

                • pheeney 9 months ago

                  What is the best provider for API use of llama frontier models considering pricing / reputation?
