Hi, While investigating other bugs I noticed that ResolveRecoveryConflictWithLock() wasn't really working. Turns out GetLockConflicts() violates it's API contract which says:
* The result array is palloc'd and is terminated with an invalid VXID. Problem 1: We don't actually put the terminator there. It happens to more or less accidentally work on a master because the array is palloc0()ed there and while a 0 is valid backend id it happens to not be a valid local transaction id. In HS we don't actually allocate the array every time, but it's instead statically allocated. Without zeroing. I have no idea why this doesn't crash ResolveRecoveryConflictWithInterrupt() which does: while (!lock_acquired) { if (++num_attempts < 3) backends = GetLockConflicts(locktag, AccessExclusiveLock); ... ResolveRecoveryConflictWithVirtualXIDs(backends, PROCSIG_RECOVERY_CONFLICT_LOCK); and ResolveRecoveryConflictWithVirtualXIDs does: ... while (VirtualTransactionIdIsValid(*waitlist)) { /* kill kill kill */ /* The virtual transaction is gone now, wait for the next one */ waitlist++; } I guess we just accidentally happen to come across appropriately set memory at some point. I'm really baffled that this hasn't caused significant problems so far. Problem 2: Since bcd8528f001 and 29eedd312274 the "the result array is palloc'd" is wrong because we're now doing: static VirtualTransactionId *vxids; /* * Allocate memory to store results, and fill with InvalidVXID. We only * need enough space for MaxBackends + a terminator, since prepared xacts * don't count. InHotStandby allocate once in TopMemoryContext. */ if (InHotStandby) { if (vxids == NULL) vxids = (VirtualTransactionId *) MemoryContextAlloc(TopMemoryContext, sizeof(VirtualTransactionId) * (MaxBackends + 1)); } else vxids = (VirtualTransactionId *) palloc0(sizeof(VirtualTransactionId) * (MaxBackends + 1)); Obviously that violates the API contract. I'm inclined to rip the HS special case out and add a pfree() to the single HS caller. The commit message introducing it says: Use malloc() in GetLockConflicts() when called InHotStandby to avoid repeated palloc calls. Current code assumed this was already true, so this is a bug fix. But I can't really believe this is relevant. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers