I'm curious about how rqlite's performance compares to other distributed databases developed in Go, such as CockroachDB, Vitess, and TiDB.
It’s going to have much lower write throughput, since SQLite is single-writer and on top of that you need Raft consensus. TiDB and CockroachDB handle concurrent writes easily. CockroachDB runs Raft per “range” of 128 MB of the key space; I’m not as familiar with TiDB. Vitess is an orchestration layer over MySQL, and MySQL handles concurrent writes easily.
rqlite creator here.
That's correct, there is a write-performance hit for the reasons you say. All Raft systems will take the same hit, and SQLite is intrinsically single-writer -- nothing about rqlite changes that[2]. That said, there are approaches to increasing write-performance substantially. See [1] for much more information.
Write-performance is not the only thing to consider though (assuming one has sufficient performance in that dimension). Ease of deployment and operation are also important, and that's an area in which rqlite excels[3] (at least I think so, but I'm biased).
[1] https://rqlite.io/docs/guides/performance/
[2] https://rqlite.io/docs/faq/#rqlite-is-distributed-does-that-...
[3] https://rqlite.io/docs/faq/#why-would-i-use-this-versus-some...
Oh, I also presented some performance numbers in a talk at CMU a couple of years back. A little out of date, but it gives an order-of-magnitude sense. https://youtu.be/JLlIAWjvHxM?t=2690
The biggest performance improvement since is due to the introduction of Queued Writes. See https://rqlite.io/docs/api/queued-writes/
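For readers unfamiliar with the feature: per the linked docs, a queued write is an ordinary `/db/execute` call with the `queue` parameter added, letting the node batch many requests into fewer Raft entries. A minimal sketch of the request shape (the host/port and table name are assumptions for illustration; see the docs for the authoritative API):

```python
import json

# Queued Writes trade immediate durability for throughput: the receiving
# node batches queued requests into Raft log entries, instead of one
# entry per request. Endpoint shape per the Queued Writes docs; the
# localhost address and "foo" table are stand-ins.
url = "http://localhost:4001/db/execute?queue"

# Same body as a normal /db/execute call: a JSON array of statements,
# optionally parameterized.
payload = json.dumps([
    ["INSERT INTO foo(name) VALUES(?)", "fiona"],
    ["INSERT INTO foo(name) VALUES(?)", "declan"],
])

print(url)
print(payload)
# Sending it is then just an HTTP POST, e.g.:
#   curl -XPOST '<url>' -H 'Content-Type: application/json' -d '<payload>'
```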
> Ease of deployment and operation are also important, and that's an area in which rqlite excels
Amen. I've been building something appliance-like where I want to support clustering but I don't want to manage a database cluster inside the project.
Rqlite is so easy to run either stand-alone or clustered. It's a godsend.
And when people want Postgres or whatever, I let them bring their own database. It's not hard to abstract a database storage layer if you plan ahead.
But if you want it to 'just work' rqlite is doing that with flying colors.
Relevant to the original inquiry — I really admire that you bring up the etcd and consul comparison right up front in the readme. For my own comprehension at least, it makes obvious the type of workloads for which you're optimizing and I appreciate that context as a past user of both of those stacks.
Maybe etcd is a more appropriate comparison?
It depends on what you’re using these tools for. If you want a lock manager and some metadata storage to help your distributed system maintain state, etcd is the better fit for that job; it’s a better ZooKeeper. With etcd you can hold a lock and have it released automatically if the connection is disrupted. rqlite is not a good option for this.
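For readers unfamiliar with the mechanism: the "released if the connection is disrupted" behavior comes from etcd's TTL leases, which a client must keep renewing. Here is a toy, in-memory sketch of why a lease-based lock self-releases when its holder disappears — this is purely illustrative and is not etcd's actual API (all names here are hypothetical):

```python
import time

# Toy model of a TTL-lease lock. In etcd, a lock is attached to a lease
# with a time-to-live; the client sends keep-alives to renew it. If the
# client dies or loses connectivity, renewals stop and the lock lapses.
class LeaseLock:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.expires_at = None

    def acquire(self):
        self.expires_at = time.monotonic() + self.ttl

    def keep_alive(self):
        # A healthy client calls this periodically to renew the lease.
        self.expires_at = time.monotonic() + self.ttl

    def is_held(self):
        # No explicit unlock needed: once renewals stop, the lease
        # expires and the lock is effectively released.
        return self.expires_at is not None and time.monotonic() < self.expires_at

lock = LeaseLock(ttl_seconds=0.05)
lock.acquire()
print(lock.is_held())   # True while the lease is fresh
time.sleep(0.1)         # simulate a client that died mid-hold
print(lock.is_held())   # False: the lease expired on its own
```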
Agreed, in the sense that while rqlite has a lot in common with etcd (and Consul too -- Consul and rqlite share the same Raft implementation[1]) rqlite's primary use case is not about making it easy to build other distributed systems on top of it.
Every time I've looked at rqlite, it just falls short feature-wise for what I would want to do with it. A single Raft group does not scale horizontally, so to me rqlite is a toy rather than a tool worth using (because someone might mistake the toy for production-grade software).
rqlite creator here.
That's clearly a mistaken attitude because both Consul and etcd also use a single "Raft group" and they are production-grade software.
Ruling out a piece of software simply because it doesn't "scale horizontally" (and in practice only writes don't scale horizontally) is a naive attitude.
The qualifier here is for /my/ use cases. Still, I couldn't recommend rqlite over better options at the scale it can serve.
One of the problems is that if you're working with developers, the log replication content is the queries themselves, instead of the SQLite WAL as in dqlite. I know this is a workaround to integrate mattn/sqlite3, but it's untenable in enterprise applications, where developers are going to think "oh, I can do SQLite stuff!". This is a footgun that someone will inevitably trigger if rqlite is in their infrastructure for anything substantial.
Another issue: if I want to architect a system around rqlite, it won't be "consistent" with rqlite alone. The client must drive the transaction and get feedback from the system, which you cannot do with an HTTP API the way you've implemented it. There was a post today where you can observe that with the jetcd library against etcd. Furthermore, you can't design a consistent system around rqlite alone because you can't use it as a locking service. If I want locks, I end up deploying etcd, Consul, or ZooKeeper anyway.
If I had to choose a distributed database with schema support right now for a small scale operation, it would probably be yugabyte or cockroachdb. They're simply better at doing what rqlite is trying to do.
At the end of the day, the type of people needing to do data replication also need to distribute their data. They need a more robust design and better safety guarantees than rqlite can offer today. This is literally the reason one of my own projects has been in the prototyping stage for nearly 10 years now. If building a reliable database was as easy as integrating sqlite with a raft library, I would have shipped nearly 10 years ago. Unfortunately, I'm still testing non-conventional implementations to guarantee safety before I go sharing something that people are going to put their valuable data into.
To say I'm simply "ruling out a piece of software because it doesn't scale horizontally" is incorrect. The software lacks the design and features required by the audience you probably want to reach.
Hopefully you find my thoughts helpful in understanding where I'm coming from with the context I've shared.
Wow, a lot there. Thanks for your comments.
>One of the problems is if you're working with developers, the log replication contents is the queries, instead of the sqlite WAL like in dqlite.
I think you mean rqlite does "statement-based replication"? Yes, that is correct, it has its drawbacks, and is clearly called out in the docs[1].
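For anyone wondering what the classic statement-based-replication hazard looks like, here's a concrete illustration using plain sqlite3, with two in-memory databases standing in for two replicas (this is a generic demonstration of the hazard, not rqlite itself — the linked docs describe how rqlite handles cases like this):

```python
import sqlite3

# With statement-based replication, each node replays the SQL text
# verbatim. A non-deterministic statement can therefore produce
# different data on each node, silently diverging the "replicas".
node_a = sqlite3.connect(":memory:")
node_b = sqlite3.connect(":memory:")

create = "CREATE TABLE t (v INTEGER)"
insert = "INSERT INTO t VALUES (random())"  # non-deterministic!

for node in (node_a, node_b):
    node.execute(create)
    node.execute(insert)

a = node_a.execute("SELECT v FROM t").fetchone()[0]
b = node_b.execute("SELECT v FROM t").fetchone()[0]
print(a == b)  # almost certainly False: the replicas have diverged
```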
>Another issue is if I want to architect a system around rqlite, it wont be "consistent" with rqlite alone. The client must operate the transaction and get feedback from the system, which you can not do with an HTTP API the way you've implemented it.
I don't understand this statement. rqlite docs are quite clear about the types of transactions it supports. It doesn't support traditional transactions because of the nature of the HTTP API (though that could be addressed).
>Furthermore to this point, you can't even design a consistent system around rqlite alone because you can't use it as a locking service. If I want locks, I end up deploying etcd, consul, or zookeeper anyways.
rqlite is not about letting developers build consistent systems on top of it. That's not its use case. It's a highly-available, fault-tolerant store that aims for ease of use and ease of operation -- and aims to do what it does very well.
>If I had to choose a distributed database with schema support right now for a small scale operation, it would probably be yugabyte or cockroachdb. They're simply better at doing what rqlite is trying to do.
https://rqlite.io/docs/faq/#why-would-i-use-this-versus-some...
Of course, you should always pick the database that meets your needs.
>If building a reliable database was as easy as integrating sqlite with a raft library, I would have shipped nearly 10 years ago.
Who said it was easy? It's taken almost 10 years of programming to get to the level of maturity it's at today.
>They need a more robust design and better safety guarantees than rqlite can offer today.
That is an assertion without any evidence. What are the safety issues with rqlite within the context of its design goals and scope? I would very much like to know so I can address them. Quality is very important to me.
> That is an assertion without any evidence.
This seems like a lack-of-knowledge issue. The problems with rqlite are inherent in its design, as I've already articulated. You can start reading Jepsen analyses right now and understand it if you don't already: https://jepsen.io/analyses
Can you be more specific?
"Evidence Dump Fallacy." This fallacy occurs when a person claims that a certain proposition is true but, instead of providing clear and specific evidence to support the claim, directs the questioner to a large amount of information, asserting that the evidence is contained within.
You realize that your product offers no transaction support due to the HTTP API right?
Transactions -- or the lack thereof -- have nothing to do with the consistency guarantees offered by rqlite.
You may wish to read this:
https://github.com/wildarch/jepsen.rqlite/blob/main/doc/blog...
rqlite -- to the best of my knowledge and as a result of extensive testing -- offers strict linearizability due to its use of the Raft protocol. Each write request to rqlite is atomic because it's encapsulated in a single Raft log entry -- this is distinct from the other form of transactions offered by rqlite[1], but that second form of transaction functionality has zero effect on the guarantees offered by Raft and rqlite (they are completely different things, operating at different levels in the design). If you know otherwise I'd very much like to know precisely why and how.
I won't be following up further. I've shared all I have to share on this topic. On a personal level, I'm actually disappointed in how you take to critical feedback about your product and don't seem to be interested in understanding the problem domain you're developing for.
https://gist.github.com/protosam/35880f46ed3f3e80a4e2ec47e6b...
It's been many years since I last worked actively with databases, and I'd never heard of Raft or rqlite before. Is the system used by many, or by significant players?
I've not heard of rqlite, but Raft is a popular consensus algorithm used by quite a few notable systems, including CockroachDB, MongoDB, and RabbitMQ. https://en.m.wikipedia.org/wiki/Raft_(algorithm)
And now Kafka, without Zookeeper.
rqlite creator here.
One notable production user is Replicated: https://www.philipotoole.com/replicated-postgres-to-rqlite/
Well-written article. It introduces what the project is and does, gives background context, includes a high-level overview of the system, and demonstrates how it solves the problem.
Just to be clear, rqlite is not a library. It's a complete RDBMS. rqlite has everything you need to read and write data, and backup, maintain, and monitor the database itself. It's not just a library (unlike, say, dqlite).
A little off-topic, but I love that the first paragraph describes the project. Usually posts exclude that information, and the landing page isn't much help either.
Cool! We use this and have never had too many problems with disk usage (I/O or size). But we're only using it to deploy configs across multiple nodes. And we read the local copy directly!
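For anyone curious about the "read the local copy directly" pattern: opening the file read-only is the safer version of the trick, since the reader can never take a write lock on the database the server is maintaining. A small sketch with plain sqlite3 (the path and schema here are made up, and note the rqlite FAQ says direct reads of its SQLite file aren't officially supported):

```python
import os
import sqlite3
import tempfile

# Stand-in for the SQLite file a server process maintains on disk.
path = os.path.join(tempfile.mkdtemp(), "db.sqlite")
writer = sqlite3.connect(path)
writer.execute("CREATE TABLE config (k TEXT, v TEXT)")
writer.execute("INSERT INTO config VALUES ('retries', '3')")
writer.commit()

# The direct reader: mode=ro guarantees we only ever read, so we
# can't corrupt or lock the file out from under the owning process.
reader = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
value = reader.execute("SELECT v FROM config WHERE k='retries'").fetchone()[0]
print(value)  # -> 3
```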
Presumably you'll need to either use the correct WAL file or accept some very slight data staleness during the 9.0 sync process. (Also, the direct-read trick sounds great to me where permissible -- EDIT: question about how much of your warranty that voids removed, because the answer's already written up here: https://rqlite.io/docs/faq/#can-i-read-the-sqlite-file-direc...)
I would -imagine- that slight staleness won't matter for your use case (I'm pretty confident that's true for the sort of configs I'm considering rqlite for, at least), but it's probably worth triple-checking when configs are involved.
Thanks for the question, but I don't follow it -- I don't see any data staleness if you query rqlite during snapshotting. Granted the blog post doesn't go into every single detail, so this might be hard to follow.
Can you expand a bit more on your concern? What scenario do you have in mind?
They're querying the SQLite database directly, so I was thinking, possibly mistakenly, that when rqlite spawns a new WAL file to snapshot, the connection they have to that database outside of rqlite might not see the changes that exist only in the snapshotting operation's WAL.
I could easily be imagining a problem that won't exist.
Ah yes. If you are accessing the SQLite files directly I cannot be sure what you will see during the snapshotting. I haven't tested that, since accessing the SQLite files underneath rqlite is not officially supported.
How does this compare to litestream?
Litestream is a solution for realtime replication of a SQLite database to S3 storage. The application uses SQLite as usual. It allows for recovery with less opportunity for data loss than with periodic backups.
rqlite is a database server that uses SQLite as a backend and Raft consensus for clustering. It provides an API for clients to access over a network connection, rather than clients using SQLite directly. With a cluster you get data replication and high availability.
There's also LiteFS now from the creator of Litestream. It's also a clustered approach but the app still uses SQLite directly. LiteFS depends on a Consul cluster.
See also: https://litestream.io/alternatives/
And: https://rqlite.io/docs/faq/#how-is-it-different-than-litestr...
Well, it still doesn't support being embedded as a library.
Absolutely amazing news! Thank you Philip!
Looking forward to a Rust implementation.
Version 9.0 already? To me this unfortunately signals lack of focus and/or disregard for backward compatibility, which means unnecessary churn for me as user of this project. Hard pass.
rqlite creator here. That's a mischaracterization.
rqlite has been in development for 10 years[1], it's a long-running project and its design goals have never changed. The API hasn't changed in a breaking fashion since 2016 and rqlite has supported seamless upgrades for years now.
In other words rqlite users have been upgrading from version to version for over 8 years, without having to change a single line of their code.
I so much more appreciate this style of version numbering compared to being eternally 0.something.
Thanks, and sorry for the mischaracterization. Glad to hear it’s different than I (wrongly) assumed. I’m curious though: if the design goals never changed and there were very few breaking changes, why is the version that high? Not that you can’t use any numbering scheme you like, and not that semver is mandatory for every project. In hindsight I think my reaction came from assuming you used semver.
I usually bump the major version number anytime I introduce important new functionality, major performance improvements, or a major new design change. While the API hasn't changed in years, the underlying implementation and file layout can change a lot between major versions. I want to communicate that.
Also rqlite doesn't support seamless downgrades between major versions, only seamless upgrades. I want to communicate that too (I've put a lot of work into the backup-and-restore system[1] so users can protect themselves if they are concerned about the seamless upgrade failing on them).
So by bumping the major version it helps people understand that they are upgrading to a substantially different version of rqlite, even if their client code doesn't have to change at all.
Got it, this makes sense. Thanks!
As a side note, that was a gracious way for you to reply. Nicely done.
Perhaps do a bit more research before you spread FUD from a knee-jerk reaction to a version number.