On Mon, 2 Dec 2024 at 18:02, Tom Lane <t...@sss.pgh.pa.us> wrote: > > Andres Freund <and...@anarazel.de> writes: > > I think the problematic scenario involves tuples that *nobody* can see. > > During > > the bitmap index scan we don't know that though. Thus the tid gets inserted > > into the bitmap. Then, before we visit the heap, a concurrent vacuum removes > > the tuple from the indexes and then the heap and marks the page as > > all-visible, as the deleted row version has been removed. > > Yup. I am saying that that qualifies as too-aggressive setting of the > all-visible bit. I'm not sure what rule we should adopt instead of > the current one, but I'd much rather slow down page freezing than > institute new page locking rules.
I don't think it's new page locking rules, but rather a feature that forgot to apply page locking rules after bypassing MVCC's snapshot rules. IOS is only allowed exactly when they comply with index AM's locking rules "only return a TID that's on a page that can't concurrently be processed by vacuum" - why would this be different for the bitmap equivalent? By saying we're too aggressive with setting the all-visible bit you seem to suggest we should add rules to vacuum that to remove LP_DEAD items we don't just have to wait for tuples to be dead to all transactions, but also for all transactions that might've gotten those all_dead TIDs from indexes to have committed or rolled back, so that no references to those TIDs exist that might consider them "possibly visible". We could achieve that by adding a waitpoint to vacuum (between the index scan and the second heap scan for LP cleanup) which waits for all concurrent transactions accessing the table to commit (thus all bitmaps would be dropped), similar to REINDEX CONCURRENTLY's wait phase, but that would slow down vacuum's ability to clean up old data significantly, and change overall vacuum behaviour in a fundamental way. I'm quite opposed to such a change. Kind regards, Matthias van de Meent Neon (https://neon.tech)