Re: snapshot too old issues, first around wraparound and then more.

Andres Freund Wed, 16 Jun 2021 11:28:15 -0700

Hi,

On 2021-06-16 13:04:07 -0400, Tom Lane wrote:
> Yeah, I think this scenario of a few transactions with old snapshots
> and the rest with very new ones could be improved greatly if we exposed
> more info about backends' snapshot state than just "oldest xmin".  But
> that might be expensive to do.


I think it'd be pretty doable now. The snapshot scalability changes
separated out information needed to do vacuuming / pruning (i.e. xmin)
from the information needed to build a snapshot (xid, flags, subxids
etc). Because xmin is not frequently accessed from other backends
anymore, it is not important anymore to touch it as rarely as
possible. From the cross-backend POV I think it'd be practically free to
track a backend's xmax now.

It's not quite as obvious that it'd be essentially free to track a
backend's xmax across all the snapshots it uses. I think we'd basically
need a second pairingheap in snapmgr.c to track the "most advanced"
xmax? That's *probably* fine, but I'm not 100% sure - Heikki wrote a faster
heap implementation for snapmgr.c for a reason, I assume.
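
To make the "second heap" idea concrete, here is a toy model in Python
(not snapmgr.c code). All names are hypothetical, xids are plain integers
with no wraparound handling, and lazy deletion from a binary heap stands in
for the pairing heap's direct node removal.

```python
import heapq
import itertools


class SnapshotTracker:
    """Toy model of one backend's registered snapshots (hypothetical names)."""

    def __init__(self):
        self._ids = itertools.count()
        self._live = {}      # snapshot id -> (xmin, xmax)
        self._by_xmin = []   # min-heap of (xmin, id): finds the oldest xmin
        self._by_xmax = []   # max-heap via negation: (-xmax, id)

    def register(self, xmin, xmax):
        sid = next(self._ids)
        self._live[sid] = (xmin, xmax)
        heapq.heappush(self._by_xmin, (xmin, sid))
        heapq.heappush(self._by_xmax, (-xmax, sid))
        return sid

    def unregister(self, sid):
        # Lazy deletion: stale heap entries are skipped when peeking.
        del self._live[sid]

    def _peek(self, heap):
        while heap and heap[0][1] not in self._live:
            heapq.heappop(heap)
        return heap[0] if heap else None

    def advertised_range(self):
        """Return (oldest xmin, most advanced xmax), or None without snapshots."""
        lo = self._peek(self._by_xmin)
        hi = self._peek(self._by_xmax)
        if lo is None:
            return None
        return (lo[0], -hi[0])
```

The extra cost per snapshot is one more heap insertion and removal, which
is the overhead in question above.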


I think the hard part of this would be much more on the pruning / vacuum
side of things. There are two difficulties:

1) Keeping it cheap to determine whether a tuple can be vacuumed,
   particularly while doing on-access pruning. This likely means that
   we'd only assemble the information to do visibility determination for
   rows above the "dead for everybody" horizon when encountering a
   sufficiently old tuple. And then we need a decent data structure for
   checking whether an xid is in one of the "not needed" xid ranges.

   This seems solvable.

2) Modeling when it is safe to remove row versions. It is easy to remove
   a tuple that was inserted and deleted within one "not needed" xid
   range, but it's far less obvious when it is safe to remove row
   versions where prior/later row versions are outside of such a gap.

   Consider e.g. an update chain where the oldest snapshot can see one
   row version, then there is a chain of rows that could be vacuumed
   except for the old snapshot, and then there's a live version. If the
   old session updates the row version that is visible to it, it needs
   to be able to follow the xid chain.

   This seems hard to solve in general.

   It perhaps is sufficiently effective to remove row version chains
   entirely within one removable xid range. And it'd probably be doable to
   also address the case where a chain is larger than one range, as long
   as all the relevant row versions are within one page: We can fix up
   the ctids of older still visible row versions to point to the
   successor of pruned row versions.

   But I have a hard time seeing a realistic approach to removing chains
   that span xid ranges and multiple pages. The locking and efficiency
   issues seem substantial.
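
A toy Python sketch of both points, under loud assumptions: every name
below is hypothetical, xids are plain integers (no wraparound), all
inserting and deleting xids are taken to be committed, and a "page" is just
a dict. It computes the "not needed" gaps between backends' [xmin, xmax)
ranges, checks whether a row version falls entirely inside one gap, and
does the single-page ctid fix-up for the update chain example from 2).

```python
from bisect import bisect_right


def removable_ranges(backend_ranges):
    """Gaps [lo, hi) between the merged (xmin, xmax) ranges of all backends."""
    merged = []
    for lo, hi in sorted(backend_ranges):
        if merged and lo <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    # A gap runs from one needed range's xmax to the next one's xmin.
    return [(a[1], b[0]) for a, b in zip(merged, merged[1:])]


def gap_index(gaps, xid):
    """Index of the gap containing xid, or -1 (binary search)."""
    i = bisect_right(gaps, (xid, float("inf"))) - 1
    if i >= 0 and xid < gaps[i][1]:
        return i
    return -1


def version_removable(gaps, insert_xid, delete_xid):
    """True if inserter and deleter both fall inside the same gap."""
    i = gap_index(gaps, insert_xid)
    return i >= 0 and i == gap_index(gaps, delete_xid)


def prune_page_chain(page, head, gaps):
    """Prune one update chain on one page.

    page maps offset -> [insert_xid, delete_xid or None, next_offset or None].
    Removed versions are unlinked by re-pointing the predecessor's next
    pointer (the "ctid fix-up"). Returns the new chain head.
    """
    prev, cur = None, head
    while cur is not None:
        ins, dele, nxt = page[cur]
        if dele is not None and version_removable(gaps, ins, dele):
            del page[cur]
            if prev is not None:
                page[prev][2] = nxt
            else:
                head = nxt
        else:
            prev = cur
        cur = nxt
    return head


# One old snapshot, two recent ones: everything in [105, 500) is "not needed".
gaps = removable_ranges([(100, 105), (500, 510), (505, 520)])
page = {1: [90, 200, 2], 2: [200, 300, 3], 3: [300, 400, 4], 4: [400, None, None]}
prune_page_chain(page, 1, gaps)
print(gaps, page)
```

Here the version visible to the old snapshot (offset 1) ends up pointing
directly at the live version (offset 4). The sketch ignores that real chain
following also compares a version's xmin against its predecessor's xmax, and
it says nothing about the multi-page case.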

Greetings,

Andres
