Hi,

Is this assertion failure worth fixing?

Thanks,
Jasper Smit

On Wed, Mar 26, 2025 at 4:26 PM Jasper Smit <jbs...@gmail.com> wrote:

> Hi,
>
> My colleague Oleksii Kozlov ran into an assertion failure while testing
> aborted UPDATE commands in subtransactions.
> To reproduce it, run the SQL in the attached script. I tested
> this on 15.10 and 17.4.
>
> Running the script triggers the following assertion failure:
> TRAP: failed Assert("HEAP_XMAX_IS_LOCKED_ONLY(infomask_lock_old_tuple)"),
> File:
> "/usr/local/postgresql-17.4/debug-build/../src/backend/access/heap/heapam.c",
> Line: 3766, PID: 15604
>
> After analysis with Luc Vlaming, we believe that the problem is caused by
> a stale multixact member of an aborted subtransaction.
>
> At the time of the assertion failure, we established that the new tuple
> does not fit on the same page as the old tuple, so the tuple lock needs
> to be updated while the page lock is temporarily released.
>
> One line above the assertion, compute_new_xmax_infomask() is called, which
> will in turn call MultiXactIdExpand().
> In MultiXactIdExpand() we determine that the requested txid/status is
> already a member of the current multixact, and therefore skip the
> removal of dead members further below in that function. The multixact
> does in fact include an aborted transaction.
> Because the aborted transaction was not removed, HEAP_XMAX_LOCK_ONLY is
> not added to the infomask later in GetMultiXactIdHintBits().
> The absence of this bit in the infomask eventually leads to the
> assertion failure.
>
> A possible fix is to change MultiXactIdExpand() so that it does not skip
> the removal of dead members. See the proposed patch attached to this email.
> An alternative is to remove the assertion, as I think the transaction
> statuses of multixact members get checked at the relevant places.
>
> Regards,
> Jasper Smit
>
>
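
The mechanism described in the quoted analysis can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the types and functions below (Member, Multi, expand, prune_dead, is_lock_only, demo_lock_only) are hypothetical stand-ins written for illustration, not PostgreSQL's actual code or the attached patch. The model assumes that dead-member removal keeps in-progress members and committed updaters, and that the lock-only hint is set only when no member is an updater.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; these are NOT PostgreSQL's real types. */
typedef enum { MEMBER_LOCK, MEMBER_UPDATE } MemberStatus;
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactState;

typedef struct {
    unsigned     xid;
    MemberStatus status;
    XactState    state;   /* outcome of the member's (sub)transaction */
} Member;

#define MAX_MEMBERS 8
typedef struct {
    Member members[MAX_MEMBERS];
    size_t n;
} Multi;

static bool is_member(const Multi *m, unsigned xid, MemberStatus status)
{
    for (size_t i = 0; i < m->n; i++)
        if (m->members[i].xid == xid && m->members[i].status == status)
            return true;
    return false;
}

/* Keep members that are still running, plus committed updaters. */
static Multi prune_dead(const Multi *m)
{
    Multi out = { .n = 0 };
    for (size_t i = 0; i < m->n; i++) {
        const Member *mem = &m->members[i];
        if (mem->state == XACT_IN_PROGRESS ||
            (mem->status == MEMBER_UPDATE && mem->state == XACT_COMMITTED))
            out.members[out.n++] = *mem;
    }
    return out;
}

/*
 * Model of expanding a multixact. With prune_first == false, an
 * already-present member short-circuits before dead members are removed.
 */
static Multi expand(const Multi *old, Member add, bool prune_first)
{
    if (!prune_first && is_member(old, add.xid, add.status))
        return *old;                    /* dead members survive */

    Multi out = prune_dead(old);
    if (!is_member(&out, add.xid, add.status)) {
        assert(out.n < MAX_MEMBERS);
        out.members[out.n++] = add;
    }
    return out;
}

/* Model of the hint-bit rule: lock-only iff no member is an updater. */
static bool is_lock_only(const Multi *m)
{
    for (size_t i = 0; i < m->n; i++)
        if (m->members[i].status == MEMBER_UPDATE)
            return false;
    return true;
}

/* Scenario: xid 10 holds a lock; xid 11 is an aborted updater. */
bool demo_lock_only(bool prune_first)
{
    Multi m = { .n = 2, .members = {
        { 10, MEMBER_LOCK,   XACT_IN_PROGRESS },
        { 11, MEMBER_UPDATE, XACT_ABORTED },
    } };
    Member again = { 10, MEMBER_LOCK, XACT_IN_PROGRESS };
    Multi result = expand(&m, again, prune_first);
    return is_lock_only(&result);
}
```

In this model, demo_lock_only(false) mirrors the reported behavior: the early return leaves the aborted updater in place, so the result is not lock-only. demo_lock_only(true) mirrors the idea behind the proposed fix: dead members are removed first, and the result is lock-only.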
