On 2024-06-25 08:42:02 -0400, Robert Haas wrote: > On Tue, Jun 25, 2024 at 8:03 AM Andres Freund <and...@anarazel.de> wrote: > > I think that's going in the wrong direction. We *want* to prune more > > aggressively if we can (*), the necessary state is represented by the > > vistest. That's a different thing than *having* to prune tuples beyond a > > certain xmin (the cutoff determined by vacuum.c/vacuumlazy.c). The problem > > we're having here is that the two states can get out of sync due to the > > vistest "moving backwards", because of hot_standby_feedback (and perhaps > > also > > an issue around aborts). > > I agree that we want to prune more aggressively if we can. I think > that fixing this by preventing vistest from going backward is > reasonable, and I like it better than what Melanie proposed, although > I like what Melanie proposed much better than not fixing it! I'm not > sure how to do that cleanly, but one of you may have an idea.
It's not hard - but it has downsides. It'll mean that - outside of vacuum - we'll much more often not react to horizons going backwards due to hot_standby_feedback. Which means that hot_standby_feedback, when used without slots, will prevent fewer conflicts. > I do think that having a bunch of different XID values that function > as horizons and a vistest object that holds some more XID horizons > floating around in vacuum makes the code hard to understand. The > relationships between the various values are not well-documented. For > instance, the vistest has to be after vacrel->cutoffs.OldestXmin for > correctness, but I don't think there's a single comment anywhere > saying that; It is somewhat documented: * Note: the approximate horizons (see definition of GlobalVisState) are * updated by the computations done here. That's currently required for * correctness and a small optimization. Without doing so it's possible that * heap vacuum's call to heap_page_prune_and_freeze() uses a more conservative * horizon than later when deciding which tuples can be removed - which the * code doesn't expect (breaking HOT). > And more generally, it seems like a fairly big problem to me that > LVRelState directly stores NewRelfrozenXid; contains a VacuumCutoffs > object that stores relfrozenxid, OldestXmin, and FreezeLimit; and also > points to a GlobalVisState object that contains definitely_needed and > maybe_needed. That is six different XID cutoffs for one vacuum > operation. That's a lot. I can't describe how they're all different > from each other or what the necessary relationships between them are > off-hand, and I bet nobody else could either, at least until recently, > else we might not have this bug. I feel like if it were possible to > have fewer of them and still have things work, we'd be better off. I'm > not sure that's doable. But six seems like a lot. Agreed. I don't think you can just unify things though, they actually are all different for good, or at least decent, reasons. I think improving the naming alone could help a good bit though. Greetings, Andres Freund