At Fri, 10 Apr 2020 12:32:31 +0900, Masahiko Sawada <masahiko.saw...@2ndquadrant.com> wrote in > On Fri, 10 Apr 2020 at 04:05, Peter Geoghegan <p...@bowt.ie> wrote: > > > > On Wed, Apr 8, 2020 at 10:56 PM Masahiko Sawada > > <masahiko.saw...@2ndquadrant.com> wrote: > > > Here is the reproducer: > > > > What version of Postgres did you notice the actual customer issue on? > > I ask because I wonder if the work on B-Tree indexes in Postgres 12 > > affects the precise behavior you get here with real world workloads. > > It probably makes _bt_killitems() more effective with some workloads, > > which naturally increases the likelihood of having multiple FPI issued > > in the manner that you describe. OTOH, it might make it less likely > > with low cardinality indexes, since large groups of garbage duplicate > > tuples tend to get concentrated on just a few leaf pages. > > > > > The inner test in the comment "found the item" never tests the item > > > for being dead. So maybe we can add !ItemIdIsDead(iid) to that > > > condition. But there still is a race condition of recording multiple > > > FPIs can happen. Maybe a better solution is to change the lock to > > > exclusive, at least when wal_log_hints = on, so that only one process > > > can run this code -- the reduction in concurrency might be won back by > > > the fact that we don't wal-log the page multiple times. > > > > I like the idea of checking !ItemIdIsDead(iid) as a further condition > > of killing the item -- there is clearly no point in doing work to kill > > an item that is already dead. I don't like the idea of using an > > exclusive buffer lock (even if it's just with wal_log_hints = on), > > though. > > > > Okay. I think only adding the check would also help with reducing the > likelihood. How about the changes for the current HEAD I've attached?
FWIW, looks good to me. > Related to this behavior on btree indexes, this can happen even on > heaps during searching heap tuples. To reduce the likelihood of that > more generally I wonder if we can acquire a lock on buffer descriptor > right before XLogSaveBufferForHint() and set a flag to the buffer > descriptor that indicates that we're about to log FPI for hint bit so > that concurrent process can be aware of that. Makes sense if the lock were acquired just before the "BM_DIRTY | BM_JUST_DIRTIED) check. Could we use double-checking, as similar to the patch for ItemIdIsDead()? > if ((pg_atomic_read_u32(&bufHdr->state) & (BM_DIRTY | BM_JUST_DIRTIED)) != > (BM_DIRTY | BM_JUST_DIRTIED)) > { ... > * essential that CreateCheckpoint waits for virtual transactions > * rather than full transactionids. > */ > /* blah, blah */ > buf_state = LockBufHdr(bufHdr); > > if (buf_state & (BM_ | BM_JUST) != (..)) > { > MyProc->delayChkpt = delayChkpt = true; > lsn = XLogSaveBufferForHint(buffer, buffer_std); > } > } > else > buf_state = LockBuffer(bufHdr); regards. -- Kyotaro Horiguchi NTT Open Source Software Center