> Heikki Linnakangas wrote:
> On 03.06.2011 23:44, Kevin Grittner wrote:
>> Heikki Linnakangas wrote:
>>
>>> I think you'll need to just memorize the lock deletion command in
>>> a backend-local list, and perform the deletion in a post-commit
>>> function.
>>
>> Hmm. As mentioned earlier in the thread, cleaning these up doesn't
>> actually have any benefit beyond freeing space in the predicate
>> locking collections. I'm not sure that benefit is enough to
>> justify this much new mechanism. Maybe I should just leave them
>> alone and let them get cleaned up in due course with the rest of
>> the locks. Any opinions on that?
>
> Is there a chance of false positives if oid wraparound happens and
> a new table gets the same oid as the old one? It's also possible
> for a heap to get the OID of an old index or vice versa, will that
> confuse things?

Some additional thoughts after working on the new code...

The risk we're trying to cover is that a table can be dropped while
committed transactions still have predicate locks against it; those
locks won't be cleared until the overlapping transactions which are
still active, and which could still form a dangerous structure with
the committed transactions, have committed. If we just leave those
predicate locks to be cleaned up in the normal course of transaction
completion, a new object could be created with the old OID which, if
a complicated set of conditions is met, could lead to a serialization
failure of a transaction which would otherwise succeed.

We can't clean the predicate locks up immediately when the DROP TABLE
appears to succeed, because the enclosing transaction (or
subtransaction) could still be rolled back. That last bit seems
pretty hard to dodge, because premature predicate lock cleanup could
lead to false negatives -- which would allow data inconsistencies.
So cleanup before commit is definitely out. (A rough sketch of the
deferred, post-commit approach Heikki suggested appears after the
list below.) But some additional time meditating on the dangerous
structure diagram today has me wondering whether we might be OK just
ignoring cleanup prior to what would normally happen as transactions
complete.

Tin - - -> Tpivot - - -> Tout

As mentioned above, this whole issue is based around concern about
false positives when a transaction has read from a table and
committed before the table is dropped. We don't need to worry about
that for Tout -- its position in the dangerous structure is based
only on what it *writes*; it can't be responsible for the
read-and-commit before the table is dropped. Tpivot can't be the
transaction which commits before the DROP TABLE either -- since it
reads what Tout writes and Tout has to be the first to commit, it
cannot read and commit before the DROP TABLE and still have Tout
write afterward. That means the transaction which did the read and
commit would have to be Tin, and Tpivot would need to be the
transaction which later wrote to an object with the same OID as the
dropped table.

For a problem to manifest itself, this sequence must occur:

- The timing of transaction starts doesn't matter (other than for
  the RO optimizations) as long as Tpivot overlaps both Tin and Tout.
- Tout commits first.
- Any time before Tin commits, it reads from relation X.
- Tin commits second.
- Tpivot develops a rw-conflict out to Tout on a relation *other
  than* X, any time after both have started and before Tpivot
  commits.
- Tpivot writes to a relation which has the same OID as relation X,
  but which isn't relation X.
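Just to make the mechanism we're debating concrete, here is the rough
shape of the deferred cleanup Heikki suggested upthread -- purely a
sketch, not what's in my working code. Only RegisterXactCallback, the
XACT_EVENT_* values, and the generic list/memory-context calls are
real; RememberPredicateLockDrop, PredicateLockDropCallback, and
DropRelationPredicateLocks are made-up placeholder names.

/*
 * Rough sketch of deferred predicate lock cleanup at commit.
 * Placeholder names throughout; only the xact callback machinery and
 * the list/memory-context primitives are actual backend APIs.
 */
#include "postgres.h"

#include "access/xact.h"
#include "nodes/pg_list.h"
#include "utils/memutils.h"

/* hypothetical routine which actually releases the locks for one relation */
extern void DropRelationPredicateLocks(Oid relid);

/* OIDs of relations dropped in the current transaction */
static List *pendingPredicateLockDrops = NIL;

/*
 * Called from DROP TABLE processing: just remember the OID for now,
 * because the enclosing transaction (or subtransaction) might still
 * roll back, and cleaning up early could cause false negatives.
 */
void
RememberPredicateLockDrop(Oid relid)
{
    MemoryContext oldcxt = MemoryContextSwitchTo(TopMemoryContext);

    pendingPredicateLockDrops = lappend_oid(pendingPredicateLockDrops, relid);
    MemoryContextSwitchTo(oldcxt);
}

/*
 * Transaction callback, registered once at backend start with
 * RegisterXactCallback(PredicateLockDropCallback, NULL).
 */
static void
PredicateLockDropCallback(XactEvent event, void *arg)
{
    ListCell   *lc;

    if (event == XACT_EVENT_COMMIT)
    {
        /* The drop is now final, so the predicate locks can safely go. */
        foreach(lc, pendingPredicateLockDrops)
            DropRelationPredicateLocks(lfirst_oid(lc));
    }
    else if (event != XACT_EVENT_ABORT)
        return;                 /* e.g. PREPARE: keep the list around */

    /* On commit or abort, forget the pending list. */
    list_free(pendingPredicateLockDrops);
    pendingPredicateLockDrops = NIL;
}

A real version would also need RegisterSubXactCallback handling so
that OIDs remembered in a subtransaction which later rolls back are
forgotten again; the sketch ignores that.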
The last step requires that a new relation be created with the same
OID which had been used by X, and that this new relation be visible
to Tpivot -- which acquired its snapshot *before X was dropped*. Is
this possible? If a transaction gets its snapshot while OID N is
assigned to relation X, can that transaction wind up seeing OID N as
a reference to relation Y? If not, no false positives are possible.

Given the quantity and complexity of code I've needed to generate to
cover this, I'm wondering again whether it is justified *even if*
such a false positive is possible. It certainly seems like it would
be a *very* infrequent occurrence. There's also the question of what
shows up in pg_locks, but I'm wondering whether even that is enough
to justify this much code.

Thoughts? Maybe I should submit a patch without the added complexity
of the scheduled cleanup, and we can discuss that as a separate
patch?

-Kevin