On 2025/05/27 4:43, Kevin K Biju wrote:
Hi Fujii, Thanks for the review.
So unless there are any objections, I'm planning to commit the patch with the following commit message. --------------- Make XactLockTableWait() and ConditionalXactLockTableWait() interruptable more. Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter a non-interruptible loop when they successfully acquired a lock on a transaction but the transaction still appeared to be running. Since this loop continued until the transaction completed, it could result in long, uninterruptible waits. Although this scenario is generally unlikely since a transaction lock is basically acquired only when the transaction is not running, it can occur in a hot standby. In such cases, the transaction may still appear active due to the KnownAssignedXids list, even while no lock on the transaction exists. For example, this situation can happen when creating a logical replication slot on a standby. The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS() within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions, ensuring they can be interrupted safely. Back-patch to all supported branches. Author: Kevin K Biju <kevinkb...@gmail.com> Reviewed-by: Fujii Masao <masao.fu...@gmail.com> Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=zlj_asvzacucz9lzwvzh7bgcd12d...@mail.gmail.com Backpatch-through: 13 ---------------
> Just an idea: how about calling pgstat_report_wait_start() and _end() around pg_usleep(), similar to what WaitExceedsMaxStandbyDelay() does? I was thinking about this as well, but the question is what do we report the wait event here as? The one that makes sense to me is PG_WAIT_LOCK, but that wouldn't align with what pg_locks would display since there is no actual lock grabbed on the standby.
I couldn't find an existing wait event that fits this case, so how about adding a new one, like IPC/XactDone, to indicate "Waiting for the transaction to commit or abort"?
> Also, would waiting only 1ms per loop cycle be too aggressive? I agree that it's aggressive for this case. But XactLockTableWait/ConditionalXactLockTableWait seem to be used in other places including in heapam so I'm hesitant on changing this behaviour for all of them.
I agree that we need more time to consider whether this change is safe. Regards, -- Fujii Masao NTT DATA Japan Corporation