Hi,

On 2022-12-29 09:43:39 -0800, Peter Geoghegan wrote:
> On Thu, Dec 29, 2022 at 9:21 AM Andres Freund <and...@anarazel.de> wrote:
> > I do think we wanted to avoid reviving actually-dead tuples (present due to
> > the multixact and related bugs). And I'm worried about giving that checking
> > up, I've seen it hit too many times. Both in the real world and during
> > development.
>
> I could just move the same tests from heap_prepare_freeze_tuple() to
> heap_freeze_execute_prepared(), without changing any of the details.

That might work, yes.

> That would mean that the TransactionIdDidCommit() calls would only
> take place with tuples that actually get frozen, which is more or less
> how it worked before now.
>
> heap_prepare_freeze_tuple() will now often prepare freeze plans that
> just get discarded by lazy_scan_prune(). My concern is the impact on
> tables/pages that almost always discard prepared freeze plans, and so
> require many TransactionIdDidCommit() calls that really aren't
> necessary.

It seems somewhat wrong that we discard all the work that
heap_prepare_freeze_tuple() did. Yes, we force freezing to actually happen in
a bunch of important cases (e.g. creating a new multixact), but even so,
e.g. GetMultiXactIdMembers() is awfully expensive to do for nought. Nor is
just creating the freeze plan free.

I think the better approach might be to make heap_tuple_should_freeze() more
powerful and to only create the freeze plan when actually freezing.

I wonder how often it'd be worthwhile to also do opportunistic freezing during
lazy_vacuum_heap_page(), given that we already will WAL log (and often issue
an FPI).

> > Somewhat of a tangent: I've previously wondered if we should have a small
> > hash-table based clog cache. The current one-element cache doesn't suffice
> > in a lot of scenarios, but it wouldn't take a huge cache to end up
> > filtering most clog accesses.
>
> I imagine that the one-element cache works alright in some scenarios,
> but then suddenly doesn't work so well, even though not very much has
> changed. Behavior like that makes the problems difficult to analyze,
> and easy to miss.

I'm suspicious of that. I think there's a lot of situations where it flat out
doesn't work - even if you just have an inserting and a deleting transaction,
we'll often end up not hitting the 1-element cache due to looking up two
different xids in a roughly alternating pattern...

Greetings,

Andres Freund
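
The alternating lookup pattern described above can be modeled with a toy
cache. This is an illustration only, not PostgreSQL code: XidCache,
xid_did_commit(), slow_lookup() and CACHE_SLOTS are hypothetical names, the
"clog" is a placeholder function, and the cache is a simple direct-mapped
array rather than whatever a real design would use. A real cache would
presumably also be limited to statuses that can no longer change, which this
sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t Xid;            /* stand-in for TransactionId */

#define CACHE_SLOTS 16           /* hypothetical upper bound on cache size */

typedef struct XidCache
{
    int   nslots;                /* 1 models a single-entry cache */
    Xid   xids[CACHE_SLOTS];     /* 0 means "empty slot" */
    bool  committed[CACHE_SLOTS];
    long  misses;                /* lookups that fell through to slow_lookup() */
} XidCache;

static void
xid_cache_init(XidCache *cache, int nslots)
{
    /* nslots must be a power of two so that masking picks a slot */
    assert(nslots >= 1 && nslots <= CACHE_SLOTS);
    assert((nslots & (nslots - 1)) == 0);
    memset(cache, 0, sizeof(*cache));
    cache->nslots = nslots;
}

/* Placeholder for the expensive status lookup; pretends even xids committed. */
static bool
slow_lookup(Xid xid)
{
    return (xid % 2) == 0;
}

/* Return the cached status if present, otherwise fetch it and cache it. */
static bool
xid_did_commit(XidCache *cache, Xid xid)
{
    int slot = (int) (xid & (Xid) (cache->nslots - 1));

    if (cache->xids[slot] == xid)
        return cache->committed[slot];

    cache->misses++;
    cache->xids[slot] = xid;
    cache->committed[slot] = slow_lookup(xid);
    return cache->committed[slot];
}

/* Look up two xids alternately, as with one inserter and one deleter. */
static long
misses_for_alternating(int nslots, Xid a, Xid b, int rounds)
{
    XidCache cache;

    xid_cache_init(&cache, nslots);
    for (int i = 0; i < rounds; i++)
    {
        (void) xid_did_commit(&cache, a);
        (void) xid_did_commit(&cache, b);
    }
    return cache.misses;
}
```

With nslots = 1 every lookup evicts the other xid, so
misses_for_alternating(1, 100, 101, 1000) counts 2000 misses. With
nslots = 16 the two xids land in different slots and only the first lookup
of each one misses.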