Hi Fujii, The WaitEvent sounds good to me, I will submit a separate patch for that when I find time.
Kevin On Wed, May 28, 2025 at 5:06 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote: > > > On 2025/05/27 4:43, Kevin K Biju wrote: > > Hi Fujii, > > > > Thanks for the review. > > So unless there are any objections, I'm planning to commit > the patch with the following commit message. > > --------------- > Make XactLockTableWait() and ConditionalXactLockTableWait() interruptable > more. > > Previously, XactLockTableWait() and ConditionalXactLockTableWait() could > enter > a non-interruptible loop when they successfully acquired a lock on a > transaction > but the transaction still appeared to be running. Since this loop continued > until the transaction completed, it could result in long, uninterruptible > waits. > > Although this scenario is generally unlikely since a transaction lock is > basically acquired only when the transaction is not running, it can occur > in a hot standby. In such cases, the transaction may still appear active > due to > the KnownAssignedXids list, even while no lock on the transaction exists. > For example, this situation can happen when creating a logical replication > slot > on a standby. > > The cause of the non-interruptible loop was the absence of > CHECK_FOR_INTERRUPTS() > within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both > functions, > ensuring they can be interrupted safely. > > Back-patch to all supported branches. > > Author: Kevin K Biju <kevinkb...@gmail.com> > Reviewed-by: Fujii Masao <masao.fu...@gmail.com> > Discussion: > https://postgr.es/m/CAM45KeELdjhS-rGuvN=zlj_asvzacucz9lzwvzh7bgcd12d...@mail.gmail.com > Backpatch-through: 13 > --------------- > > > > > Just an idea: how about calling pgstat_report_wait_start() and _end() > around pg_usleep(), similar to what WaitExceedsMaxStandbyDelay() does? > > > > I was thinking about this as well, but the question is what do we report > the wait event here as? The one that makes sense to me is PG_WAIT_LOCK, but > that wouldn't align with what pg_locks would display since there is no > actual lock grabbed on the standby. > > I couldn't find an existing wait event that fits this case, so how about > adding a new one, like IPC/XactDone, to indicate "Waiting for the > transaction > to commit or abort"? > > > > > Also, would waiting only 1ms per loop cycle be too aggressive? > > > > I agree that it's aggressive for this case. But > XactLockTableWait/ConditionalXactLockTableWait seem to be used in other > places including in heapam so I'm hesitant on changing this behaviour for > all of them. > I agree that we need more time to consider whether this change is safe. > > Regards, > > -- > Fujii Masao > NTT DATA Japan Corporation > >