On Wed, Oct 21, 2020 at 3:36 PM Peter Geoghegan <p...@bowt.ie> wrote: > Bear in mind that we actually do practically the same thing all the > time with the current LP_DEAD setting stuff, where we need to call > compute_xid_horizon_for_tuples/heap_compute_xid_horizon_for_tuples > with a leaf buffer lock held in almost the same way. That's actually > potentially far worse if you look at it in isolation, because you > could potentially have hundreds of heap pages, whereas this is just 1 > - 3. (BTW, next version will also do that work in passing, so you're > practically guaranteed to do less with a buffer lock held compared to > the typical case of nbtree LP_DEAD setting, even without counting how > the LP_DEAD bits get set in the first place.)
That's fair. It's not that I'm trying to enforce some absolute coding rule, as if I even had the right to do such a thing. But, people have sometimes proposed patches which would have caused major regression in this area and I think it's really important that we avoid that. Normally, it doesn't matter: I/O requests complete quickly and everything is fine. But, on a system where things are malfunctioning, it makes a big difference whether you can regain control by hitting ^C. I expect you've probably at some point had the experience of being unable to recover control of a terminal window because some process was stuck in wait state D on the kernel level, and you probably thought, "well, this sucks." It's even worse if the kernel's entire process table fills up with such processes. This kind of thing is essentially the same issue at the database level, and it's smart to do what we can to mitigate it. But that being said, I'm not trying to derail this patch. It isn't, and shouldn't be, the job of this patch to solve that problem. It's just better if it doesn't regress things, or maybe even (as you say) makes them a little better. I think the idea you've got here is basically good, and a lot of it comes down to how well it works in practice. I completely agree that looking at amortized cost rather than worst-case cost is a reasonable principle here; you can't take that to a ridiculous extreme because people also care about consistently of performance, but it seems pretty clear from your description that your patch should not create any big problem in that area, because the worst-case number of extra buffer accesses for a single operation is tightly bounded. And, of course, containing index bloat is its own reward. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company