On 2025-Jul-25, Andrey Borodin wrote: > Also I've discovered one more serious problem. > If a backend crashes just before WAL-logging multi, any heap tuple > that uses this multi will become unreadable. Any attempt to read it > will hang forever. > > I've reproduced the problem and now I'm working on scripting this > scenario. Basically, I modify code to hang forever after assigning > multi number 2.
It took me a minute to understand this, and I think your description is slightly incorrect: you mean that the heap tuple that uses the PREVIOUS multixact cannot be read (at least, that's what I understand from your reproducer script). I agree it's a pretty ugly bug! I think it's essentially the same bug as the other problem, so the proposed fix should solve both. Thanks for working on this! Looking at this, /* * We want to avoid edge case 2 in redo, because we cannot wait for * startup process in GetMultiXactIdMembers() without risk of a * deadlock. */ MultiXactId next = multi + 1; int next_pageno; /* Handle wraparound as GetMultiXactIdMembers() does it. */ if (multi < FirstMultiXactId) multi = FirstMultiXactId; Don't you mean to test and change the value 'next' rather than 'multi' here? In this bit, * We do not need to handle race conditions, because this code * is only executed in redo and we hold * MultiXactOffsetSLRULock. I think it'd be good to have an Assert(LWLockHeldByMeInMode(MultiXactOffsetSLRULock, LW_EXCLUSIVE)); just for peace of mind. Also, commit c61678551699 removed ZeroMultiXactOffsetPage(), but since you have 'false' as the second argument, then SimpleLruZeroPage() is enough. (I wondered why isn't WAL-logging necessary ... until I remember that we're in a standby. I think a simple comment here like "no WAL-logging because we're a standby" should suffice.) -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/