Hi,

Here's patch version 4.


1. The problem

I conducted further investigation on this issue. The 1ms sleep in
XactLockTableWait that falls back to polling was not problematic in certain
scenarios prior to v16. It became an potential issue after the "Allow
logical decoding on standbys" feature was introduced [1]. After that
commit, we can now create logical replication slots on hot standby servers
and perform many other logical decoding operations. [2]



2. The analysis

Is this issue limited to calling sites related to logical decoding? I
believe so. As we've discussed, the core issue is that primary transactions
are not running locally on the standby, which breaks the assumption of
XactLockTableWait—that there is an xid lock to wait on. Other call stacks
of this function, except those from logical decoding, are unlikely to
encounter this problem because they are unlikely to be invoked from standby
servers.


Analysis of XactLockTableWait calling sites:


Problematic Case (Hot Standby):

- Logical decoding operations in snapbuild.c during snapshot building

- This is the case that can run on hot standby and experience the issue


Non-problematic Cases (Primary Only):

- Heap operations (heapam.c): DML operations not possible on read-only
standby

- B-tree operations (nbtinsert.c): Index modifications impossible on
standby

- Logical apply workers (execReplication.c): Cannot start during recovery
(BgWorkerStart_RecoveryFinished)

- All other cases: Require write operations unavailable on standby



3. The proposed solution

If the above analysis is sound, one potential fix would be to add separate
branching for standby in XactLockTableWait. However, this seems
inconsistent with the function's definition—there's simply no lock entry in
the lock table for waiting. We could implement a new function for this
logic, but it's hard to find a proper place for it to fit well: lmgr.c is
for locks, while standby.c or procarray.c are not that ideal either.
Therefore, I placed the special handling for standby in
SnapBuildWaitSnapshot.


The waiting strategy for this version uses threshold-based sleep. A more
ideal approach might be a continuous sleep that's only woken up when the
transaction completes, but I didn't find the proper infrastructure to
support it.


The CHECK_FOR_INTERRUPTS() calls in the sleep sections have been removed.
With separate handling for primary and standby, unbounded waiting seems
unlikely to occur here.


[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=0fdab27ad68a059a1663fa5ce48d76333f1bd74c

[2]
https://bdrouvot.github.io/2023/04/19/postgres-16-highlight-logical-decoding-on-standby/


Best regards,
Xuneng

>

Attachment: v4-0001-Add-threshold-based-sleep-to-SnapBuildWaitSnapshot.patch
Description: Binary data

Reply via email to