• imjonse 2 days ago

    Apart from benchmark results, what sets Allenai's models apart - OLMo/OLMoE/Molmo - is that they are fully open, not just open-weights/free to use. The datasets used, a crucial ingredient, are also disclosed and open. UPDATE: they say the datasets will be made available, but they aren't yet.

    • comp_raccoon 2 days ago

      it’s coming! just takes a bit more time to properly release it.

    • espadrine 2 days ago

      The paper: https://molmo.allenai.org/paper.pdf

      > Our key innovation is a simple but effective data collection strategy that avoids these problems: we ask annotators to describe images in speech

      I see this as another example that datasets trump architecture nowadays.

      The architecture is not where the innovation is: it is just CLIP embeddings mapped into the LLM's token space by an MLP, with some pooling to reduce the token count.
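
      A minimal sketch of that kind of connector in PyTorch; the dimensions, pooling factor, and class name are made up for illustration, not Molmo's actual code:

          import torch
          import torch.nn as nn

          class VisionConnector(nn.Module):
              # Illustrative: CLIP patch features -> pooled -> MLP -> LLM embedding space
              def __init__(self, clip_dim=1024, llm_dim=4096, pool=2):
                  super().__init__()
                  self.pool = nn.AvgPool1d(pool, stride=pool)  # cuts token count by `pool`
                  self.mlp = nn.Sequential(
                      nn.Linear(clip_dim, llm_dim),
                      nn.GELU(),
                      nn.Linear(llm_dim, llm_dim),
                  )

              def forward(self, feats):  # feats: (batch, n_patches, clip_dim)
                  x = self.pool(feats.transpose(1, 2)).transpose(1, 2)
                  return self.mlp(x)     # (batch, n_patches // pool, llm_dim)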

      • imjonse 2 days ago

        Architectural innovations can definitely help with training/inference speed - that was the case with convolutional networks too - but for model performance (as in 'intelligence', not speed) dataset size and quality have always mattered more. Even with classical ML models, the standard advice was to stop tweaking the model and instead clean the data or gather more of it.

      • causal a day ago

        That graphic comparing benchmark averages is really nice; I wish things were presented so clearly more often.

        That being said, the selection definitely tilts things in Molmo's favor by including so many benchmarks that play to its strengths, in particular the counting ones. The average hides that it has a pretty modest MMMU score compared to the state of the art.
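
        A toy illustration of that averaging effect, with entirely made-up numbers (not Molmo's actual scores): one weak benchmark barely dents a mean dominated by strong ones.

            # Hypothetical benchmark scores, purely illustrative
            scores = {
                "counting_1": 90, "counting_2": 88, "counting_3": 87,
                "doc_vqa": 85, "text_vqa": 84, "diagrams": 83,
                "mmmu": 54,  # the weak spot
            }
            print(sum(scores.values()) / len(scores))  # ~81.6; the 54 barely moves it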

        • danielcampos93 2 days ago

          Not mentioned in their blog post but stated on the model cards on Hugging Face: "Molmo 72B is based on Qwen2-72B and uses OpenAI CLIP as vision backbone. Molmo-72B achieves the highest academic benchmark score and ranks second on human evaluation, just slightly behind GPT-4o." The others are based on Qwen2 7B. What happened to the OLMo chain?

          • jszymborski 2 days ago

            I think the "Molmo-7B-O" and "MolmoE-1B" models are using OLMo, judging by the fact that their LLM backbones are the only ones listed as having open data.

            EDIT: From the post: "For the LLM, we have trained models on a variety of choices at different scales and degrees of openness, including: the fully open-weight and data OLMo-7B-1024 (using the October 2024 pre-released weights, which will be public at a later date), the efficient fully open-weight and data OLMoE-1B-7B-0924, open-weight Qwen2 7B, open-weight Qwen2 72B, open-weight Mistral 7B, open-weight Gemma2 9B, and Phi 3 Medium. Today we are releasing 4 samples from this family."

            • comp_raccoon a day ago

              This is correct! We wanted to show that you can use the PixMo dataset and our training code to improve any open model, not just ours!

          • naiv 2 days ago

            Image was flagged as inappropriate by the Google Vision API?

            • comp_raccoon 2 days ago

              Google's image APIs are not great, yeah. It's only for the demo, though; the checkpoints on Hugging Face are uncensored.
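
              For anyone trying the raw checkpoints, loading one looks roughly like this, adapted from the Hugging Face model card (processor.process and generate_from_batch come from the repo's custom remote code, not the standard transformers API, so treat the exact names as provisional):

                  from PIL import Image
                  from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

                  repo = "allenai/Molmo-7B-D-0924"
                  processor = AutoProcessor.from_pretrained(
                      repo, trust_remote_code=True, torch_dtype="auto", device_map="auto")
                  model = AutoModelForCausalLM.from_pretrained(
                      repo, trust_remote_code=True, torch_dtype="auto", device_map="auto")

                  # Preprocess one image plus a prompt, then add a batch dimension
                  inputs = processor.process(images=[Image.open("photo.jpg")],
                                             text="Describe this image.")
                  inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

                  output = model.generate_from_batch(
                      inputs,
                      GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
                      tokenizer=processor.tokenizer,
                  )
                  new_tokens = output[0, inputs["input_ids"].size(1):]
                  print(processor.tokenizer.decode(new_tokens, skip_special_tokens=True))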
