On Thu, Apr 29, 2021 at 1:22 PM Peter Geoghegan <p...@bowt.ie> wrote: > > On Wed, Apr 28, 2021 at 7:34 PM tanghy.f...@fujitsu.com > <tanghy.f...@fujitsu.com> wrote: > > >TRAP: FailedAssertion("!all_visible_according_to_vm || > > >prunestate.all_visible", File: "vacuumlazy.c", Line: 1347, PID: 1274675) > > > > BTW, in asynchronous mode, the same problem can also happen but in a low > > frequency.(I tried many times, but the problem happened only 2 times) > > As for synchronous mode, I found it seems easier to reproduce the problem > > with setting "autovacuum_naptime = 1". > > But it still can't be 100% to reproduced it. (I tested it 5 times, 3 of > > them reproduced it.)
Thanks for reporting the issue! > > Is setting all_visible_according_to_vm false as below enough to avoid > the assertion failure? > > diff --git a/src/backend/access/heap/vacuumlazy.c > b/src/backend/access/heap/vacuumlazy.c > index c3fc12d76c..76c17e063e 100644 > --- a/src/backend/access/heap/vacuumlazy.c > +++ b/src/backend/access/heap/vacuumlazy.c > @@ -1146,6 +1146,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams > *params, bool aggressive) > { > ReleaseBuffer(vmbuffer); > vmbuffer = InvalidBuffer; > + all_visible_according_to_vm = false; > } > > /* Remove the collected garbage tuples from table and indexes */ Since we set all_visible_according_to_vm before acquiring the buffer lock it's likely to happen that the page gets modified and all-visible bit is cleared after setting true to all_visible_according_to_vm. This assertion can easily be reproduced by adding a delay before the buffer lock and invoking autovacuums frequently: diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index c3fc12d76c..76f067a7e4 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -1180,6 +1180,8 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive) buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL, vacrel->bstrategy); + pg_usleep(100000); + /* * We need buffer cleanup lock so that we can prune HOT chains and * defragment the page. So we should recheck also visibility map bit there but I think we can remove this assertion since we already do that in later code and we don’t treat this case as a should-not-happen case: /* * As of PostgreSQL 9.2, the visibility map bit should never be set if * the page-level bit is clear. However, it's possible that the bit * got cleared after we checked it and before we took the buffer * content lock, so we must recheck before jumping to the conclusion * that something bad has happened. */ else if (all_visible_according_to_vm && !PageIsAllVisible(page) && VM_ALL_VISIBLE(vacrel->rel, blkno, &vmbuffer)) { elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u", vacrel->relname, blkno); visibilitymap_clear(vacrel->rel, blkno, vmbuffer, VISIBILITYMAP_VALID_BITS); } I've attached a patch removing the assertion. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index c3fc12d76c..47ac6385d1 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -1344,7 +1344,6 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive) lazy_scan_prune(vacrel, buf, blkno, page, vistest, &prunestate); Assert(!prunestate.all_visible || !prunestate.has_lpdead_items); - Assert(!all_visible_according_to_vm || prunestate.all_visible); /* Remember the location of the last page with nonremovable tuples */ if (prunestate.hastup)