Damn, I literally just published an article benchmarking flash-1.5 and showing it is very impressive for its cost.
https://myswamp.substack.com/p/improving-accessibility-using...
Maybe I'll redo it and add in 1.5-8b, it's so cheap it doesn't hurt to add it lol.
Can you also include gpt-4o-mini?
I made a note with the updated chart:
https://substack.com/profile/107132439-michael-barajas/note/...
Why do some people turn to Gemini? I've tried it, and I remember it lacking or being heavily censored. Is it because it's cheap? Or is it better at some tasks that others aren't?
It's such a shame that the Zed editor cannot use Gemini Flash for code completion; it's stuck on Supermaven or Copilot.
Most editors can easily support LLMs via fill-in-the-middle (FIM) mode.
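For anyone unfamiliar with fill-in-the-middle, here is a minimal sketch of how an editor might build such a prompt: it splits the buffer at the cursor and asks the model to generate what goes in between. The sentinel strings below follow the StarCoder-style convention and are an assumption; other models use different markers, and `build_fim_prompt` is a hypothetical helper, not any editor's actual API.

```python
# Assumed sentinel strings (StarCoder-style). Other FIM-capable models
# use different markers, so treat these as placeholders.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(text: str, cursor: int) -> str:
    """Hypothetical helper: arrange the buffer in prefix-suffix-middle order.

    The model is expected to continue after the middle sentinel with the
    text that belongs at the cursor position.
    """
    prefix, suffix = text[:cursor], text[cursor:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


# Example: the cursor sits right after "return " inside a function body.
buffer = "def add(a, b):\n    return \n"
cursor = buffer.index("return ") + len("return ")
prompt = build_fim_prompt(buffer, cursor)
```

Whatever the model generates after the final sentinel gets inserted at the cursor, which is why a general chat model without FIM training is harder to plug in this way.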
Does anyone know if the rate limits on Flash and Flash8B are separate?
It's at the bottom:
> To make this model as useful as we can, we are doubling the 1.5 Flash-8B rate limits, meaning developers can send up to 4,000 requests per minute (RPM).
You can even compare the rate limits here: https://ai.google.dev/pricing
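To put the quoted 4,000 RPM in perspective, that works out to one request every 15 ms on average. A rough client-side throttle could look like the sketch below; `MinIntervalLimiter` is a hypothetical helper of my own, not part of any Google SDK, and it only spaces requests evenly rather than modeling how the server actually enforces the limit.

```python
import time


class MinIntervalLimiter:
    """Hypothetical helper: space calls at least 60 / RPM seconds apart."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed; return the delay in seconds."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


# 4,000 RPM is the figure from the quote above.
limiter = MinIntervalLimiter(4000)
print(f"{limiter.interval * 1000:.0f} ms between requests")
```

Call `limiter.wait()` before each API request. Whether Flash and Flash-8B count against separate quotas is not something this sketch assumes either way.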