Hi,

Is this assertion failure something that is worth fixing?

Thanks,
Jasper Smit

On Wed, Mar 26, 2025 at 4:26 PM Jasper Smit <jbs...@gmail.com> wrote:

> Hi,
>
> My colleague Oleksii Kozlov ran into an assertion failure while testing
> aborted UPDATE commands in subtransactions.
> To reproduce it, run the SQL in the attached script. I tested
> this on 15.10 and 17.4.
>
> Running the script leads to the following assertion failure:
> TRAP: failed Assert("HEAP_XMAX_IS_LOCKED_ONLY(infomask_lock_old_tuple)"),
> File: "/usr/local/postgresql-17.4/debug-build/../src/backend/access/heap/heapam.c",
> Line: 3766, PID: 15604
>
> After analysis with Luc Vlaming, we believe that the problem is caused by
> a stale multixact member belonging to an aborted subtransaction.
>
> At the time of the assertion, we established that the new tuple does not
> fit on the same page as the old tuple. The tuple lock therefore needs to
> be updated while the page lock is temporarily released.
>
> One line above the assertion, compute_new_xmax_infomask() is called, which
> in turn calls MultiXactIdExpand().
> In MultiXactIdExpand() we determine that the requested txid/status is
> already a member of the current multixact, and therefore return early,
> skipping the removal of dead members further below in that function. The
> multixact does in fact include an aborted transaction.
> Because the aborted transaction was not removed, GetMultiXactIdHintBits()
> later does not add HEAP_XMAX_LOCK_ONLY to the infomask.
> The absence of this bit in the infomask eventually leads to the
> assertion failure.
>
> A possible fix is to change MultiXactIdExpand() so that it does not skip
> the removal of dead members. See the proposed patch attached to this email.
> An alternative is to remove the assertion, as I think that the
> transaction statuses of multixact members get checked at the relevant
> places.
>
> Regards,
> Jasper Smit
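
To make the mechanism described in the quoted message easier to follow, here is a small toy model in C. It is only a sketch and is not PostgreSQL source: every type and function name (ToyMulti, toy_expand, and so on) is hypothetical, and it assumes that the stale member of the aborted subtransaction carries an update status, which is what would keep a lock-only hint from being set. The always_prune flag stands in for the idea of the proposed fix; it does not reproduce the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef enum { TOY_IN_PROGRESS, TOY_ABORTED } ToyXactState;
typedef enum { TOY_LOCK, TOY_UPDATE } ToyStatus;

typedef struct {
    int          xid;     /* transaction id of the member */
    ToyStatus    status;  /* what the member did to the tuple */
    ToyXactState state;   /* assumed-known transaction outcome */
} ToyMember;

#define TOY_MAX_MEMBERS 8

typedef struct {
    ToyMember members[TOY_MAX_MEMBERS];
    size_t    nmembers;
} ToyMulti;

/* True if (xid, status) is already recorded in the multixact. */
static bool toy_is_member(const ToyMulti *m, int xid, ToyStatus status)
{
    for (size_t i = 0; i < m->nmembers; i++)
        if (m->members[i].xid == xid && m->members[i].status == status)
            return true;
    return false;
}

/* Drop every member whose transaction has aborted. */
static void toy_prune_dead(ToyMulti *m)
{
    size_t kept = 0;
    for (size_t i = 0; i < m->nmembers; i++)
        if (m->members[i].state != TOY_ABORTED)
            m->members[kept++] = m->members[i];
    m->nmembers = kept;
}

/*
 * Add req to the multixact. With always_prune == false the early return
 * skips pruning (the behaviour described in the message); with
 * always_prune == true dead members are removed on that path as well.
 */
static void toy_expand(ToyMulti *m, ToyMember req, bool always_prune)
{
    if (toy_is_member(m, req.xid, req.status)) {
        if (always_prune)
            toy_prune_dead(m);
        return;
    }
    toy_prune_dead(m);
    assert(m->nmembers < TOY_MAX_MEMBERS);
    m->members[m->nmembers++] = req;
}

/* Lock-only hint: set only if no remaining member is an updater. */
static bool toy_is_locked_only(const ToyMulti *m)
{
    for (size_t i = 0; i < m->nmembers; i++)
        if (m->members[i].status == TOY_UPDATE)
            return false;
    return true;
}

/* A live locker (xid 100) plus an aborted updater (xid 101). */
static ToyMulti toy_example(void)
{
    ToyMulti m = {
        .members = {
            {100, TOY_LOCK,   TOY_IN_PROGRESS},
            {101, TOY_UPDATE, TOY_ABORTED},
        },
        .nmembers = 2,
    };
    return m;
}
```

In this model, calling toy_expand() on toy_example() with the request {100, TOY_LOCK, TOY_IN_PROGRESS} and always_prune set to false leaves the aborted updater in place, so toy_is_locked_only() reports false. With always_prune set to true the stale member is removed and the multixact is lock-only again.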