Reads Causing Writes in Postgres (jesipow.com), submitted by thunderbong a year ago
  • chasil a year ago

    In Oracle, this happens because uncommitted transactions are found to be committed by a later reader, which cleans them out.

    https://www.databasejournal.com/oracle/delayed-block-cleanou...

    • refset a year ago

      Interesting! MVCC mechanics aside, it's also worth remembering that work_mem is only 4MB by default [0], so large intermediate results will likely spill to disk (e.g. external sorts for ORDER BY operations).

      [0] https://www.postgresql.org/docs/current/runtime-config-resou...
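To make the spill-to-disk point concrete, here is a minimal toy sketch of an external merge sort in Python. It is not PostgreSQL's actual sort implementation; the function name `external_sort` and the `max_in_memory` parameter are hypothetical stand-ins for the idea of a `work_mem`-style budget. It only shows why a pure read (a sort) ends up writing temporary files once the input no longer fits in memory.

```python
import heapq
import tempfile


def external_sort(values, max_in_memory):
    """Sort integers while buffering at most max_in_memory of them at once.

    Returns (sorted_list, runs_spilled). Toy model only: the budget is a
    count of items, not bytes, and real database sorts are far more elaborate.
    """
    runs = []    # temporary files, each holding one sorted run
    buffer = []  # the in-memory batch

    def spill():
        # Sort the in-memory batch and write it out as one sorted run.
        buffer.sort()
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(f"{v}\n" for v in buffer)
        run.seek(0)
        runs.append(run)
        buffer.clear()

    for v in values:
        if len(buffer) == max_in_memory:
            spill()  # budget exhausted: this "read" now writes to disk
        buffer.append(v)

    if not runs:
        # Everything fit within the budget: no disk writes at all.
        return sorted(buffer), 0

    if buffer:
        spill()

    # Merge the sorted runs back together, streaming them from disk.
    streams = [(int(line) for line in run) for run in runs]
    result = list(heapq.merge(*streams))
    for run in runs:
        run.close()
    return result, len(runs)
```

Calling `external_sort(data, max_in_memory=2)` on six values writes three temporary runs, while the same data with a budget of ten stays entirely in memory. The design choice mirrored here is only the general one: sort what fits, write it out, and merge the runs at the end.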

      • metanonsense a year ago

        Did not see your comment until after I posted mine, but exactly this. The amount of disk I/O from these sort operations can be massive and very surprising.

      • pm90 a year ago

        Trying to reason about postgres is somewhat of an enigma when you are forced to do it; generally the only reason as a programmer you have to is because something went wrong, and then the mindset is a mix of nervousness and panic; then incredulity at some of the seemingly unintuitive behaviors. I suspect this might be true of any large, complex system at the edges.

        • isbvhodnvemrwvn a year ago

          With postgres I think it's also the problem of weak observability mechanisms. By default all you get is cumulative statistics. Then with extensions you get pg_stat_statements and a few more things, but you really shouldn't need to use something like pgAnalyze to get basics, like history of autovacuums, cumulative wait events and other stuff like that.

        • rpcope1 a year ago

          Things get even weirder when you use extensions. I remember being profoundly confused using Timescale 1 and doing a lot of concurrent writes on a hypertable with a foreign key (while also inserting into the other table) when I would get transaction deadlocks even in scenarios where it wouldn't normally be possible. This is how I found out doing DML on a "hypertable" actually does DDL under the hood, with all of the associated problems that brings.

          • efxhoy a year ago

            That’s confusing. What DDL did it do? Create new partitions?

            • juxhindb a year ago

              Likely creating child tables for the various chunks that kick in periodically (e.g., depending on your hypertable chunking policy). Used to hit these all the time, quite annoying.

          • buglungtung a year ago

            Great article! I learned about blocks/pages a long time ago when I needed to debug a performance issue, but not as deeply as this article goes. Will share it with my teammates, and it'll be funny to see their faces :D

            • madars a year ago

              Similar things can also happen with file systems: ext4 mounted -o ro will let the driver do filesystem recovery even if userspace writes are prevented.

              • sneak a year ago

                That seems like it violates the principle of least surprise.

                • Sayrus a year ago

                  At the same time, you want to be able to read files in the normal use case. Being able to read them (after recovery) only if mounted read-write seems counterintuitive. This is one of those cases where right or wrong depends on the use.

                  • lazide a year ago

                    Also how you can end up with silly things like ro-but-i-really-mean-it-this-time flags

                    • poincaredisk a year ago

                      The forensics people I know don't worry about flags, and just use a write blocker for everything.

                      • lazide a year ago

                        Yeah and clone everything before even touching (the copy) too.

                    • numpad0 a year ago

                      Do changes need to go on disk for that to work?

                    • mort96 a year ago

                      Hmmm yes and no. If I set / to mount read-only in some embedded Linux system context, my intention is just that the contents of disk shouldn't change just because some program decided to write something somewhere; I would be quite surprised if some recoverable metadata bit flip or something caused the system to irrecoverably fail to boot just because the readonly flag also prevented fsck from fixing errors.

                      However if I have a faulty drive that I connect to my system to recover data from it and I don't want it to experience any more writes because I'm worried further writes may break it further, I would be quite surprised if 'mount -o ro' caused the driver to write to it.

                      • bobmcnamara a year ago

                        > I would be quite surprised if some recoverable metadata bit flip or something caused the system to irrecoverably fail to boot just because the readonly flag also prevented fsck from fixing errors.

                        This is exactly what happens maintaining bootloaders. As time goes on, the amount of configuration to get ext4 to reliably read a possibly dirty filesystem without modifying it has skyrocketed to the point where I started putting /boot on ext2 again.

                        • vbezhenar a year ago

                          Recovery and mounting should be separate operations. If a filesystem is not clean, it should not be allowed to mount at all.

                          • epcoa a year ago

                            “Recovering” an otherwise error free journaled or logged filesystem is considered a normal operation. Unclean just doesn’t mean an error. That’s how this works and I don’t see very many interested in changing this behavior.

                            • Joe_Cool a year ago

                              You can disable the journal. It should(! haven't checked !) not touch the recovery information then. You also need this when you have a decade of version difference and an error on mount: `mount -oro,noload`

                        • metanonsense a year ago

                          The authors of this article obviously know infinitely more about postgres than I do, but you can trigger writes using reads much more easily. If you’re selecting something that does not fit into working memory and try to sort it (or use a mechanism that needs sorting), the sort is performed on disk.

                          This almost rendered our SAN nonfunctional a few years back.

                          • indulona a year ago

                            Haha

                            • cube2222 a year ago

                              TLDR: it can be caused by hint bit updates, as well as page pruning - both can be kicked off by a select query, and will be counted as part of the query’s statistics.

                              However, the article as a whole is both a much wider and deeper dive. I recommend giving it a read in full!
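As a rough illustration of the hint-bit half of that summary, here is a heavily simplified toy model in Python. The class and function names (`HeapTuple`, `Page`, `select_visible`) and the set-based commit log are assumptions made for this sketch, not Postgres internals; real visibility checks also involve snapshots, deleting transactions, and several distinct hint bits. The sketch only shows how a read can cache a transaction's commit status on a tuple and thereby dirty the page.

```python
from dataclasses import dataclass, field


@dataclass
class HeapTuple:
    xmin: int                     # id of the transaction that inserted the row
    value: str
    xmin_committed: bool = False  # the cached "hint": inserter known committed


@dataclass
class Page:
    tuples: list = field(default_factory=list)
    dirty: bool = False           # True means the page must be written back


def select_visible(page, committed_xids):
    """Return visible values, setting hints (and dirtying the page) as we go."""
    visible = []
    for t in page.tuples:
        if not t.xmin_committed:
            # No hint yet: consult the commit log (here, just a set of ids).
            if t.xmin not in committed_xids:
                continue  # inserter not committed, so the row is not visible
            # Cache the answer on the tuple itself; this modifies the page.
            t.xmin_committed = True
            page.dirty = True
        visible.append(t.value)
    return visible
```

In this model the first `select_visible` call over freshly committed rows marks the page dirty, and a second call over the same page leaves it clean because the hints are already set. That matches the pattern described above, where the first reader after a write pays the cost.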

                              • vichle a year ago

                                Thanks, a TLDR should be mandatory for articles of this length :)

                                • stronglikedan a year ago

                                  As articles (especially about postgres) go, this isn't that long, but you can always get your own AI summary if it's too long for you.

                                  • SoftTalker a year ago

                                    Firefox reader mode (necessary to read this, as the font size and color choices are poor) estimated this at a 30+ minute read. It would be a courtesy to readers for authors to provide a summary. That way people can decide if they want to spend time reading further. This is why academic papers have an abstract up front.

                                    • makeitdouble a year ago

                                      > AI summary

                                      This is one of the AI side effects that I fear the most.

                                      We're not there, and perhaps will never be, but I imagine a point where information organization becomes fully neglected because an AI tool can do something about it.

                                      We have a taste of it with email, which became a wasteland since we're supposed to filter and search it either way, and mail notifications have only an on/off button and nothing in between.

                                      Not reading emails is, I think, close to the norm, and I guess "TLDR" will stop being an expression and just become a fact of life?