On Wed, Mar 7, 2012 at 2:06 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> I am not thrilled with the design as it stands, but bulk loading is a
>> known and serious pain point for us, so it would be awfully nice to
>> improve it. I'm not sure whether we should only go as far as setting
>> HEAP_XMIN_COMMITTED or whether we should actually try to mark the
>> tuples with FrozenXID. The former has the advantage of (I think) not
>> requiring any other changes to preserve MVCC semantics while the
>> latter is, obviously, a bigger performance improvement.
>
> It's the other way around. Setting to FrozenTransactionId makes the
> test in XidInMVCCSnapshot() pass when accessed by later commands in
> the same transaction. If we just set the hint we need to play around
> to get it accepted. So the frozen route is both best for performance
> and least impact on fastpath visibility code. That part of the code is
> solid.

Your comment is reminding me that there are actually two problems
here, or at least I think there are.

1. Some other transaction might look at the tuples.
2. An older snapshot (e.g. cursor) might look at the tuples.

Case #1 can happen when we create a table, insert some data, and
commit, and then some other transaction that took a snapshot before we
committed reads the table. It's OK if the tuples are marked
HEAP_XMIN_COMMITTED, because if we abort no other transaction will
ever see the new pg_class row as alive, and therefore no other
transaction can examine the table contents. But using FrozenXID as
the tuple xmin would allow those tuples to be seen by a transaction
that took its snapshot before we committed; this is the problem that
relvalidxid is designed to fix, and what I was thinking of when I said
that we need more infrastructure to handle the FrozenXID case.

Case #2 is certainly a problem for FrozenXID as well, because anything
that's marked with FrozenXID is going to look visible to everybody,
including our older snapshots. And I gather you're saying it's also a
problem for HEAP_XMIN_COMMITTED. I had assumed that the way we were
fixing this problem was to disable these optimizations for
transactions that had more than one snapshot floating around. I'm not
sure whether the patch does that or not, but I think it probably needs
to, unless you have some other idea for how to fix this. It doesn't
seem like an important restriction in practice because it's unlikely
that anyone would keep a cursor open across a bulk data load - and if
they do, this isn't the only problem they're going to have.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
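
To make Case #1 concrete, here is a toy model of the visibility check,
written in Python for brevity. It is only a sketch under stated
assumptions, not PostgreSQL's actual visibility code: the Snapshot and
HeapTuple classes and the visible() function are hypothetical names
invented for illustration, and the model ignores command IDs, xmax,
and subtransactions. The only value taken from PostgreSQL is
FrozenTransactionId, which is 2.

```python
from dataclasses import dataclass

FROZEN_XID = 2  # PostgreSQL's FrozenTransactionId


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified MVCC snapshot."""
    xmax: int               # first XID not yet assigned at snapshot time
    in_progress: frozenset  # XIDs still running at snapshot time


@dataclass
class HeapTuple:
    """Hypothetical, simplified heap tuple header."""
    xmin: int
    xmin_committed_hint: bool = False  # stands in for HEAP_XMIN_COMMITTED


def xid_in_snapshot(xid, snap):
    """True if xid had not finished as far as the snapshot is concerned."""
    return xid >= snap.xmax or xid in snap.in_progress


def visible(tup, snap, committed):
    """Toy visibility rule: frozen tuples skip the snapshot check."""
    if tup.xmin == FROZEN_XID:
        return True  # visible to everybody, including older snapshots
    if not (tup.xmin_committed_hint or tup.xmin in committed):
        return False  # inserter aborted or is still running
    # The hint only says "xmin committed"; the snapshot still decides.
    return not xid_in_snapshot(tup.xmin, snap)


# Loader transaction 100 creates the table, loads it, and commits.
# Another transaction took its snapshot while 100 was still running.
old_snap = Snapshot(xmax=101, in_progress=frozenset({100}))
new_snap = Snapshot(xmax=101, in_progress=frozenset())
committed = {100}

hinted = HeapTuple(xmin=100, xmin_committed_hint=True)
frozen = HeapTuple(xmin=FROZEN_XID)

hinted_old = visible(hinted, old_snap, committed)
frozen_old = visible(frozen, old_snap, committed)
hinted_new = visible(hinted, new_snap, committed)
```

In this model the hinted tuple stays hidden from the older snapshot
but is visible to a snapshot taken after the commit, while the frozen
tuple is visible to the older snapshot too. That is the Case #1
problem with FrozenXID described above, and the reason some extra
per-relation check would be needed before the tuple check is trusted.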