On Tue, 2008-09-16 at 15:38 +0100, Simon Riggs wrote:
> On Tue, 2008-09-16 at 17:01 +0300, Heikki Linnakangas wrote:
> > Simon Riggs wrote:
> > > Subtransactions cause a couple of problems for Hot Standby:
> >
> > Do we need to treat subtransactions any differently from normal
> > transactions? Just treat all subtransactions as top-level transactions
> > until commit, and mark them all as committed when you see the commit
> > record for the top-level transaction.
>
> If we do that, snapshots become infinitely sized objects though, which
> then requires us to invent some way of scrolling it to disk. So having
> removed the need for subtrans, I then need to reinvent something similar
> (or at least something like a multitrans entry).

Currently we keep track of whether the whole subxid cache has
overflowed, or not. It seems possible to track overflows of individual
parts of the cache instead. That makes the code path for subxid overflow
in GetSnapshotData() slightly slower, but it's not the common case. We
save time elsewhere in more common cases.

We would be able to avoid making an entry in subtrans for new subxids
unless our local backend has overflowed its cache. That will reduce
subtrans access frequency considerably and greatly reduce the number of
requests that might need to perform I/O, possibly to zero. It would also
avoid the need for generating WAL records for new subxids for standby.

The path through XidInMVCCSnapshot() would then require us to *always*
check the subxid cache, even if it has overflowed. If we find the xid
then we don't need to check subtrans at all. That's quite useful because
searching the subxid cache is cheaper than looking in subtrans, and the
probability that the xid is in the cache rather than in subtrans is
still good, even for overflows of up to 3-5 times the subxid cache. It
would increase the cost of subxid checking slightly when running with
very high numbers of subxids.

For Hot Standby, this would mean we could avoid generating WAL records
for new subxids in most cases - only generate them when our backend's
subxid cache has overflowed. On the standby it then means we can store
xids into a fixed size snapshot without worrying about whether it
overflows, because *either* the xids all fitted in the snapshot on the
master (whose xids we are emulating), *or* we have a WAL record that
tells us the cache overflowed and we make the insert into subtrans
instead. When we use the standby snapshot we look in the subxid cache
first, and if we don't find the xid there then we check in subtrans.

Sounds possible?
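
To make the proposal above a bit more concrete, here is a toy model in Python. It is only a sketch of the idea, not PostgreSQL code: the names `Backend`, `Subtrans`, `assign_subxid`, `take_snapshot`, `xid_in_snapshot` and the tiny `CACHE_SIZE` are all hypothetical. It also assumes each overflowed subxid is recorded directly against its top-level xid, and it leaves out the xmin/xmax range checks a real snapshot test would do.

```python
from dataclasses import dataclass, field

CACHE_SIZE = 4  # hypothetical per-backend subxid cache capacity (tiny for the demo)


@dataclass
class Backend:
    top_xid: int
    subxid_cache: list = field(default_factory=list)
    overflowed: bool = False  # tracked per backend, not for the whole cache


class Subtrans:
    """Toy stand-in for subtrans: maps a subxid to its parent, counts lookups."""

    def __init__(self):
        self.parent = {}
        self.lookups = 0

    def set_parent(self, subxid, parent_xid):
        self.parent[subxid] = parent_xid

    def top_level(self, xid):
        self.lookups += 1
        while xid in self.parent:
            xid = self.parent[xid]
        return xid


def assign_subxid(backend, subxid, subtrans):
    """Record a new subxid. Returns True only if subtrans had to be written,
    which in the proposal is also the only case needing a WAL record."""
    if len(backend.subxid_cache) < CACHE_SIZE:
        backend.subxid_cache.append(subxid)
        return False
    backend.overflowed = True
    subtrans.set_parent(subxid, backend.top_xid)  # assumption: parent is the top xid
    return True


@dataclass
class Snapshot:
    xip: set           # in-progress top-level xids
    subxip: set        # subxids that fitted in the backends' caches
    any_overflow: bool


def take_snapshot(backends):
    subxip = set()
    for b in backends:
        subxip.update(b.subxid_cache)
    return Snapshot({b.top_xid for b in backends}, subxip,
                    any(b.overflowed for b in backends))


def xid_in_snapshot(xid, snap, subtrans):
    """True if xid is still in progress according to the snapshot."""
    if xid in snap.xip:
        return True
    # Always check the cached subxids first, even after an overflow.
    if xid in snap.subxip:
        return True
    # Fall back to subtrans only when some backend actually overflowed.
    if snap.any_overflow:
        return subtrans.top_level(xid) in snap.xip
    return False
```

With one backend on top-level xid 100 and six subxids assigned, the first four stay in the cache and only the last two reach subtrans. A snapshot lookup for a cached subxid is answered without touching subtrans; only a lookup for an overflowed or unknown xid pays for the subtrans access.
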

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)