I’ve found this article lacking. Like some other articles in this space, it introduces isolation levels through the lens of the phenomena described in the SQL standard, but I find that there’s a different, more intuitive approach.
I think it’s more tractable to define this problem space starting from the concept of (strict) serializability, which is really a generalization of the concept of thread safety. Every software engineer has an intuitive understanding of it. Lack of serializability can lead to execution-dependent behavior, which usually results in hard-to-diagnose bugs. Thus, all systems should strive towards serializability, and the database can be a tool in achieving it.
Various non-serializable levels of database transaction isolation are relaxations of the serializability guarantee, where the database no longer enforces the guarantee and it’s up to the database user to ensure it through other means.
The isolation phenomena are a useful tool for visualizing various corner cases of non-serializability, but they are not inherently tied to it. It's possible to achieve serializability while observing all of the SQL phenomena. For example, a Kubernetes cluster with carefully-written controllers can be serializable.
Author here. This is good feedback.
The combination of transactions, isolation levels, and MVCC is such a huge undertaking to cover all at once, specially when comparing how it's done across multiple DBs which I attempted here. Always a balance between technical depth, accessibility to people with less experience, and not letting it turn into an hour-long read.
I love the work planetscale does on keeping this type of content accurate yet accessible. Keep it up!
https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster
More notation, more citations, more better.
Notation is useful. Citations are nice for further reading. But I don't agree more of this makes for a better article!
Looks like the author is geoblocking in protest of the UK Online Safety Act (and fair enough).
Most RDBMSs offer serializable isolation if you need it. Often you don't need it. The downside of using serializable isolation unnecessarily is reduced concurrency and throughput due to increased coordination between transactions.
Then recommend a better explanation?
This actually used to be one of my favorite interview questions for backend engineers. Everyone has used transactions but depending on your seniority you'd understand it to different degrees.
And no I'd never expect people to know the isolation levels by heart, but if you know there are different ones and they behave differntly that's pretty good and tells me you are curious about how things work under the hood.
What's up with the trend of articles that seem like they're written by someone after a couple of sessions in a freshmen comp sci class? Is this the result of AI posts and humans trying to compete at the same game? Nothing new here and a discussion of DB transactions without even mentioning ACID compliance and the trade-offs? You're better off picking up a 40 year old textbook than posts like this.
>What's up with the trend of articles that seem like they're written by someone after a couple of sessions in a freshmen comp sci class?
This is funny and cracked me up. Because the author is actually the one who teaches CS in University.
>Nothing new here and a discussion of DB transactions without even mentioning ACID compliance and the trade-offs? You're better off picking up a 40 year old textbook than posts like this.
That would have been a very long blog post. Edit: I just realise Ben has already replied above.
A lot of database tools these days prioritize instant sharing of updates over transactions and ACID properties. Example: Airtable. As soon as you update a field the update shows up on your coworkers screen who also has the same table open. The downside of this is that Airtable doesn't do transactions. And the downside of not doing transactions is potentially dangerous data inconsistencies. More about that here: https://visualdb.com/blog/concurrencycontrol/
It's an absolute pleasure reading planetscale blogs. I'm curious about what tool is used to make these visualizations?
Author here. Thank you! These visuals are built with js + gsap (https://gsap.com)
Thank you for sharing it, kind sir. Your explanation on b+trees (https://planetscale.com/blog/btrees-and-database-indexes) is probably the best one I've ever seen on the internet.
I think this is a great post to have but I'm going to make a critical usability suggestion:
* the videos should have "pause" and a "step at a time" control *
Even at the "half speed", without a deep knowledge of the context, the videos move way too fast for me to read the syntax that's invoking and line it up with the data on the left side. I (and im definitely not the only one) need to be able to sit on one step and stare at the whole thing without the latent anxiety of the state changing before I've had a chance to grok the whole thing.
this has nothing to do with familiarity with the concepts (read my profile). I literally need time to read all the words and connect them together mentally (ooh, just noticed this is pseudo-SQL syntax also, e.g. "select id=4", that probably added some load for me) without worrying they're going to change before watching things move.
please add a step-at-a-time button!
I appreciate this feedback, and then you read through it with enough rigor to notice.