On Sat, Sep 10, 2022 at 12:07:30PM +0800, Zhang Mingli wrote:
> That’s interesting, dig into it for a while but not too much progress.
> 
> Maybe we could add some logs to print MultiXactMembers’ xid and status if xid 
> is 0.
> 
> Inside MultiXactIdGetUpdateXid()
> 
> ```
>       nmembers = GetMultiXactIdMembers(xmax, &members, false, false);
> 
>       if (nmembers > 0)
>       {
>           int i;
> 
>           for (i = 0; i < nmembers; i++)
>           {
>               /* Ignore lockers */
>               if (!ISUPDATE_from_mxstatus(members[i].status))
>                   continue;
> 
>               /* there can be at most one updater */
>               Assert(update_xact == InvalidTransactionId);
>               update_xact = members[i].xid;
> 
>               // log here if xid is invalid
> 
> #ifndef USE_ASSERT_CHECKING
> 
>               /*
>                * in an assert-enabled build, walk the whole array to ensure
>                * there's no other updater.
>                */
>               break;
> #endif
>           }
> 
>           pfree(members);
>       }
> // and here if didn’t update update_xact at all (it shouldn’t happen as designed)

Yeah.  I added assertions for both cases: one inside the loop and one
after it.  The second one fails, right before the "return".

TRAP: FailedAssertion("update_xact != InvalidTransactionId", File: "src/backend/access/heap/heapam.c", Line: 6939, PID: 4743)

It looks like nmembers==2, and both members are lockers, so both are
ignored and update_xact is never set.
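
To make that concrete, here is a minimal standalone sketch of the loop's
behavior.  It is not the heapam.c code: every Model* / model_* name is a
hypothetical stand-in, and model_is_update() assumes that
ISUPDATE_from_mxstatus() treats every status above "for update" as an
updater.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL types; not the real definitions. */
typedef uint32_t ModelXid;
#define MODEL_INVALID_XID ((ModelXid) 0)

typedef enum
{
    MODEL_FOR_KEY_SHARE,        /* locker */
    MODEL_FOR_SHARE,            /* locker */
    MODEL_FOR_NO_KEY_UPDATE,    /* locker */
    MODEL_FOR_UPDATE,           /* locker */
    MODEL_NO_KEY_UPDATE,        /* updater */
    MODEL_UPDATE                /* updater */
} ModelStatus;

typedef struct
{
    ModelXid    xid;
    ModelStatus status;
} ModelMember;

/* Assumption: mirrors ISUPDATE_from_mxstatus(), i.e. status > "for update". */
static int
model_is_update(ModelStatus status)
{
    return status > MODEL_FOR_UPDATE;
}

/*
 * Return the updater's xid, or MODEL_INVALID_XID if every member is a
 * locker (or there are no members at all).
 */
static ModelXid
model_get_update_xid(const ModelMember *members, size_t nmembers)
{
    ModelXid    update_xact = MODEL_INVALID_XID;
    size_t      i;

    for (i = 0; i < nmembers; i++)
    {
        /* Ignore lockers */
        if (!model_is_update(members[i].status))
            continue;

        /* there can be at most one updater */
        assert(update_xact == MODEL_INVALID_XID);
        update_xact = members[i].xid;
    }

    return update_xact;
}
```

With two members that are both lockers, nothing ever assigns update_xact,
so the function returns the invalid xid, which is the state the new
assertion trips on.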

> And could we see multixact replay in logs if db does recover?

Do you mean pg_waldump, or something else?

BTW, after a number of sigabrt's, I started seeing these during
recovery:

< 2022-09-09 19:44:04.180 CDT  >LOG:  unexpected pageaddr 1214/AF0FE000 in log segment 0000000100001214000000B4, offset 1040384
< 2022-09-09 23:20:50.830 CDT  >LOG:  unexpected pageaddr 1214/CF65C000 in log segment 0000000100001214000000D8, offset 6668288
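
For reading those messages, here is a rough sketch of the arithmetic that
gives the page address one would expect at a given segment and offset.  It
assumes the default 16MB wal_segment_size and the usual 24-hex-digit
segment naming (timeline, log, seg); expected_page_lsn() is a hypothetical
helper, not a PostgreSQL function.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: default WAL segment size of 16MB. */
#define ASSUMED_WAL_SEGMENT_SIZE ((uint64_t) 16 * 1024 * 1024)

/*
 * Compute the LSN at byte "offset" within the named WAL segment.
 * Returns 0 if the segment name cannot be parsed.
 */
static uint64_t
expected_page_lsn(const char *segname, uint32_t offset)
{
    unsigned int tli;
    unsigned int log;
    unsigned int seg;

    /* Segment names are three 8-digit hex fields: timeline, log, seg. */
    if (sscanf(segname, "%8X%8X%8X", &tli, &log, &seg) != 3)
        return 0;

    return ((uint64_t) log << 32)
        + (uint64_t) seg * ASSUMED_WAL_SEGMENT_SIZE
        + offset;
}
```

Under those assumptions the first message's position works out to
1214/B40FE000 and the second to 1214/D865C000.  The addresses actually
found (1214/AF0FE000 and 1214/CF65C000) have the same in-segment offsets
but belong to earlier segments, which might just be old contents of
recycled segment files.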

-- 
Justin

