On Sat, Sep 10, 2022 at 12:07:30PM +0800, Zhang Mingli wrote:
> That’s interesting, dig into it for a while but not too much progress.
>
> Maybe we could add some logs to print MultiXactMembers’ xid and status if xid
> is 0.
>
> Inside MultiXactIdGetUpdateXid()
>
> ```
>     nmembers = GetMultiXactIdMembers(xmax, &members, false, false);
>
>     if (nmembers > 0)
>     {
>         int i;
>
>         for (i = 0; i < nmembers; i++)
>         {
>             /* Ignore lockers */
>             if (!ISUPDATE_from_mxstatus(members[i].status))
>                 continue;
>
>             /* there can be at most one updater */
>             Assert(update_xact == InvalidTransactionId);
>             update_xact = members[i].xid;
>
> // log here if xid is invalid
> #ifndef USE_ASSERT_CHECKING
>
>             /*
>              * in an assert-enabled build, walk the whole array to ensure
>              * there's no other updater.
>              */
>             break;
> #endif
>         }
>
>         pfree(members);
>     }
> // and here if didn’t update update_xact at all (it shouldn’t happen as
> designed)

Yeah.  I added assertions for the above case inside the loop, and for this one,
and this fails right before "return".

TRAP: FailedAssertion("update_xact != InvalidTransactionId", File: "src/backend/access/heap/heapam.c", Line: 6939, PID: 4743)

It looks like nmembers==2, both of which are lockers and being ignored.

> And could we see multixact reply in logs if db does recover?

Do you mean waldump or ??

BTW, after a number of sigabrt's, I started seeing these during recovery:

< 2022-09-09 19:44:04.180 CDT >LOG:  unexpected pageaddr 1214/AF0FE000 in log segment 0000000100001214000000B4, offset 1040384
< 2022-09-09 23:20:50.830 CDT >LOG:  unexpected pageaddr 1214/CF65C000 in log segment 0000000100001214000000D8, offset 6668288

-- 
Justin
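
PS: to make the failing case above concrete, here is a minimal standalone
sketch of the member loop. It is not the heapam.c code. The types and the
ISUPDATE_from_mxstatus() macro are simplified stand-ins written from memory of
multixact.h and should be checked against the tree, and
sketch_get_update_xid() is a hypothetical name used only for illustration.
The point is that when every member is a locker, the loop skips all of them
and update_xact stays InvalidTransactionId.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL definitions (assumptions). */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef enum
{
    MultiXactStatusForKeyShare = 0x00,
    MultiXactStatusForShare = 0x01,
    MultiXactStatusForNoKeyUpdate = 0x02,
    MultiXactStatusForUpdate = 0x03,
    /* the statuses below are updates, the ones above are lockers */
    MultiXactStatusNoKeyUpdate = 0x04,
    MultiXactStatusUpdate = 0x05
} MultiXactStatus;

#define ISUPDATE_from_mxstatus(status) ((status) > MultiXactStatusForUpdate)

typedef struct MultiXactMember
{
    TransactionId   xid;
    MultiXactStatus status;
} MultiXactMember;

/*
 * Hypothetical helper: scan the members and return the updater's xid, or
 * InvalidTransactionId if every member is only a locker.
 */
static TransactionId
sketch_get_update_xid(const MultiXactMember *members, int nmembers)
{
    TransactionId update_xact = InvalidTransactionId;
    int           i;

    assert(nmembers >= 0);

    for (i = 0; i < nmembers; i++)
    {
        /* Ignore lockers */
        if (!ISUPDATE_from_mxstatus(members[i].status))
            continue;

        /* there can be at most one updater */
        assert(update_xact == InvalidTransactionId);
        update_xact = members[i].xid;
    }

    /* The caller decides whether an invalid result is an error. */
    return update_xact;
}
```

With two members that are both lockers (say ForKeyShare and ForShare), this
returns InvalidTransactionId, which is the state the added assertion trips on.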