On Sat, Feb 22, 2020 at 6:10 AM Palamadai, Eka <ekana...@amazon.com> wrote: > Thanks a lot for the feedback. Please let me know if you have any further > comments. Meanwhile, I have also added this patch to "Commitfest 2020-03" at > https://commitfest.postgresql.org/27/2464.
Thanks for the excellent reproducer for this obscure bug. You said the problem exists in 9.6-11, but I'm also able to reproduce it in 9.5. That's the oldest supported release, but it probably goes back further. I confirmed that this patch fixes the immediate problem. I've attached a version of your patch with a commit message, to see if anyone has more feedback on this.
From 26740a9a670161f27c07c3c83c1056f8b4373d35 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Wed, 11 Mar 2020 17:02:02 +1300 Subject: [PATCH] Fix nextXid tracking bug on standbys (9.5-11 only). RecordKnownAssignedTransactionIds() should never move nextXid backwards. Before this commit, that could happen if some other code path had advanced it without advancing latestObservedXid. One consequence is that a well timed XLOG_CHECKPOINT_ONLINE could cause hot standby feedback messages to get confused and report an xmin from a future epoch, potentially allowing vacuum to run too soon on the primary. Repair, by making sure RecordKnownAssignedTransactionIds() can only move nextXid forwards. In release 12 and master, this was already done by commit 2fc7af5e, which consolidated similar code and straightened out this bug. Back-patch to supported releases before that. Author: Eka Palamadai Discussion: https://postgr.es/m/98bb4805-d0a2-48e1-96f4-15014313e...@amazon.com --- src/backend/storage/ipc/procarray.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index ddd3461d56..50b9aea683 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -3256,7 +3256,8 @@ RecordKnownAssignedTransactionIds(TransactionId xid) next_expected_xid = latestObservedXid; TransactionIdAdvance(next_expected_xid); LWLockAcquire(XidGenLock, LW_EXCLUSIVE); - ShmemVariableCache->nextXid = next_expected_xid; + if (TransactionIdFollows(next_expected_xid, ShmemVariableCache->nextXid)) + ShmemVariableCache->nextXid = next_expected_xid; LWLockRelease(XidGenLock); } } -- 2.20.1