On Mon, Nov 13, 2023 at 2:29 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > I think it also makes it clear that we should update the VM in > lazy_scan_prune(). All callers of lazy_scan_prune() will now consider > updating the VM after returning. And most of the state communicated back > to lazy_scan_heap() from lazy_scan_prune() is to inform it whether or > not to update the VM.
That makes sense. > I didn't do that in this patch set because I would > need to pass all_visible_according_to_vm to lazy_scan_prune() and that > change didn't seem worth the improvement in code clarity in > lazy_scan_heap(). Have you thought about finding a way to get rid of all_visible_according_to_vm? (Not necessarily in the scope of the ongoing work, just in general.) all_visible_according_to_vm is inherently prone to races -- it tells us what the VM used to say about the page, back when we looked. It is very likely that the page isn't visible in this sense, anyway, because VACUUM is after all choosing to scan the page in the first place when we end up in lazy_scan_prune. (Granted, this is much less true than it should be due to the influence of SKIP_PAGES_THRESHOLD, which implements a weird and inefficient form of prefetching/readahead.) Why should we necessarily care what all_visible_according_to_vm says or would say at this point? We're looking at the heap page itself, which is more authoritative than the VM (theoretically they're equally authoritative, but not really, not when push comes to shove). The best answer that I can think of is that all_visible_according_to_vm gives us a way of noticing and then logging any inconsistencies between VM and heap that we might end up "repairing" in passing (once we've rechecked the VM). But maybe that could happen elsewhere. Perhaps that cross-check could be pushed into visibilitymap.c, when we go to set a !PageIsAllVisible(page) page all-visible in the VM. If we're setting it and find that it's already set from earlier on, then remember and complaing/LOG it. No all_visible_according_to_vm required, plus this seems like it might be more thorough. Just a thought. -- Peter Geoghegan