I'm looking at how to make queries safe during recovery, in the presence of concurrent changes to blocks. In particular, concurrent removal of rows that might be read by queries.
My thinking is:

* We ignore LockRelationForExtension(). Normal queries never request it. All new blocks were created with that lock held and we are replaying changes serially, so we do not need to re-create that lock. We already do this, so no change.

* We re-create the Cleanup lock on blocks when the original operation was performed while a Cleanup lock was held.

So the plan is to introduce a new XLogLockBufferForCleanup() function and then use it in all places where a Cleanup lock was held during the write of the WAL record.

This means we will need to hold a Cleanup lock:

* while RM_HEAP2_ID records are applied (Freeze, Clean, CleanMove)

* while applying an XLOG_BTREE_DELETE that was generated by VACUUM. This record type is not always generated by VACUUM, so split this WAL record into two types, XLOG_BTREE_DELETE and XLOG_BTREE_VACUUM, so we can hold a Cleanup lock while applying XLOG_BTREE_VACUUM. (This may not be required, but I'd rather do the full locking now and relax it later.)

* whenever we apply a backup block that performs the same function as any of the above WAL records.

So we would hold a Cleanup lock when applying WAL records of types:

  all RM_HEAP2_ID types
  XLOG_BTREE_VACUUM

I'm most of the way through implementing the above and will post the patch as a separate item to make it easier to inspect.

Other aspects:

* For GIN indexes, we appear not to hold a Cleanup lock during vacuuming, except on the root page. That stops new scans from starting, but it doesn't prevent progress of concurrent scans. Doesn't look correct to me... so not sure what strength lock to acquire in each case. Probably need to differentiate between WAL record types so we can tell which to acquire a Cleanup lock for and which not.

* GIST doesn't use Cleanup locks at all. So I'm very unclear here.

Teodor has mentioned that it should be OK for GIST/GIN. Can I double check that based upon my inspection of the code?
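
To make the proposed rule concrete, here is a rough, self-contained sketch of the replay-time lock decision: all RM_HEAP2_ID records and XLOG_BTREE_VACUUM take a Cleanup lock, everything else takes an ordinary exclusive lock. This is not backend code. Every SKETCH_* name and sketch_replay_lock_mode() are hypothetical stand-ins invented for illustration, and GIN/GIST are deliberately left out because the right lock strength for them is still an open question.

```c
#include <assert.h>

/* Hypothetical stand-ins for resource manager ids (not the real rmgr ids). */
typedef enum
{
    SKETCH_RM_HEAP,
    SKETCH_RM_HEAP2,
    SKETCH_RM_BTREE
} SketchRmgr;

/* Hypothetical stand-ins for the WAL record types mentioned in the plan. */
typedef enum
{
    SKETCH_HEAP_INSERT,
    SKETCH_HEAP2_FREEZE,
    SKETCH_HEAP2_CLEAN,
    SKETCH_HEAP2_CLEAN_MOVE,
    SKETCH_BTREE_DELETE,    /* deletion not generated by VACUUM */
    SKETCH_BTREE_VACUUM     /* proposed new type: deletion generated by VACUUM */
} SketchInfo;

/* Lock strength to take on the buffer while the record is replayed. */
typedef enum
{
    SKETCH_LOCK_EXCLUSIVE,  /* ordinary exclusive buffer lock */
    SKETCH_LOCK_CLEANUP     /* exclusive lock plus wait for all other pins to go away */
} SketchLockMode;

/*
 * Decide the lock strength from the record type alone. Because the choice
 * depends only on the record type, it applies equally when the record is
 * replayed by restoring a backup block.
 */
SketchLockMode
sketch_replay_lock_mode(SketchRmgr rmid, SketchInfo info)
{
    /* Every RM_HEAP2_ID record (Freeze, Clean, CleanMove) needs a Cleanup lock. */
    if (rmid == SKETCH_RM_HEAP2)
        return SKETCH_LOCK_CLEANUP;

    /* Only the VACUUM-generated btree deletion needs a Cleanup lock. */
    if (rmid == SKETCH_RM_BTREE && info == SKETCH_BTREE_VACUUM)
        return SKETCH_LOCK_CLEANUP;

    /* Everything else keeps the existing behaviour. */
    return SKETCH_LOCK_EXCLUSIVE;
}

/* Small usage example: checks the cases listed in the plan. */
void
sketch_self_check(void)
{
    assert(sketch_replay_lock_mode(SKETCH_RM_HEAP2, SKETCH_HEAP2_FREEZE) == SKETCH_LOCK_CLEANUP);
    assert(sketch_replay_lock_mode(SKETCH_RM_BTREE, SKETCH_BTREE_VACUUM) == SKETCH_LOCK_CLEANUP);
    assert(sketch_replay_lock_mode(SKETCH_RM_BTREE, SKETCH_BTREE_DELETE) == SKETCH_LOCK_EXCLUSIVE);
}
```

In the real redo routines the equivalent choice would presumably be made at each call site, by calling the proposed XLogLockBufferForCleanup() instead of taking the normal exclusive lock, rather than through a central table like this.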
--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support