The choice of log-scale graphs here probably wasn't necessary and seems to have hurt more than it helped: despite the bars looking relatively similar, memcached performed 3x faster than redis on some benchmarks while appearing only slightly above average.
Otherwise, very thorough and well done benchmark from the looks of it. Redis my beloved not holding up so well against some others these days it looks like.
I wouldn't have even noticed if I hadn't seen this comment. Definitely necessary to change to linear scale.
Also, while I appreciate the thoroughness, I think it would be very useful to reduce the number of graphs significantly. Maybe 10x fewer. Just present the key ones that tell the story, and put the rest in another folder.
Agreed, it'd be nice to see the graphs with a linear scale.
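To put a number on why the log scale hides the gap: a sketch with made-up throughput figures (the 3M/1M ops/sec values are hypothetical, just to illustrate a 3x difference on a typical 3-decade log axis):

```python
import math

# Hypothetical throughput numbers (ops/sec) illustrating the issue:
memcached_ops = 3_000_000
redis_ops = 1_000_000

# On a log10 axis spanning, say, 10^4..10^7 (3 decades), a 3x gap
# occupies only log10(3) of those 3 decades of axis length.
axis_decades = 3
gap_decades = math.log10(memcached_ops) - math.log10(redis_ops)
fraction_of_axis = gap_decades / axis_decades

print(f"A 3x difference spans only {fraction_of_axis:.0%} of a 3-decade log axis")
# → A 3x difference spans only 16% of a 3-decade log axis
```

So bars that differ by a factor of three end up visually separated by about a sixth of the axis, which is why the gap is easy to miss.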
Does anyone have an idea why there is sometimes such a gap between Valkey and Redis? I would have expected only a marginal difference at this point.
I assume because valkey is multithreaded and redis isn’t
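For context: Redis 6+ can offload socket reads/writes to extra threads, but command execution itself stays on a single thread, while Valkey 8 pushed more of the I/O path onto those threads. A sketch of the relevant config (directive names are from redis.conf/valkey.conf; the thread count is an assumption you'd tune to your core count):

```
# redis.conf / valkey.conf — offload network reads/writes to 8 I/O threads.
# Note: command execution still runs on one thread in Redis.
io-threads 8
io-threads-do-reads yes
```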
I'm sad to see memcached is used with the legacy text protocol instead of the recommended (and supported by the benchmarking software) binary protocol.
That isn't representative of any modern deployment, and not declaring it anywhere outside the code itself is IMO misleading.
Please fix the documentation or better, run that one and update the graphs.
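If the benchmarks use memtier_benchmark, switching protocols is just a flag; a sketch of an invocation (host, port, and the thread/client counts are placeholders, not the author's actual settings):

```shell
# Run the same workload against memcached using the binary protocol
# instead of the legacy text one:
memtier_benchmark --server=127.0.0.1 --port=11211 \
  --protocol=memcache_binary \
  --threads=8 --clients=16 --pipeline=1
```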
It'd be good to also include AWS's own hosted variant, ElastiCache. They do a bunch of tuning as well, so the results are likely different from running the same software yourself on the same AWS instance type.
I'm happy to see Valkey consistently outperforming Redis. It should be food for thought for anyone considering rug pulls.
Does anyone know why Garnet outperforms the others so much in the pipelining >1 tests while being written in C#?
I think the programming language is not relevant, especially since startup time plays no role. A different design can have much more impact, and IIRC Garnet is not fully compatible with Redis.
The main difference appears to be that Garnet is more parallel, according to this student's report benchmarking various key-value stores (see the "CPU usage" sections in the PDF): https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A196...
I would also like to see linear scaling graphs.
> c8g.8xlarge (32 core non-NUMA ARM64)
Requests are scheduled on half of these. Despite that, a plateau is hit after 8 threads? Is this a 16-core, 32-thread type of setup?
Also, consider redoing this in linear scale.
Edit: Oddly enough, no? 1 thread per core as per https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/cpu-opti...
> a plateau is hit after 8 threads?
Most of the graphs plateau around 6 threads, for pretty much all the caches under test. I wonder if there is some interesting architectural issue with cache-sharing on this particular platform?
I'd guess memory controller behavior, especially if it's not set up for parallel IOPS but for single-threaded throughput (channel interleaving).
The "g" = Graviton, which doesn't support SMT, so 1 vCPU is 1 full core.