On Thu, Feb 9, 2017 at 5:32 PM, Andres Freund <and...@anarazel.de> wrote:
>> > Again, some parts of the code doing something bad isn't a good
>> > argument for doing it again. Releasing locks early is a bad pattern,
>> > because backend code isn't generally safe against it; we have special
>> > code paths for catalog tables to support it for those.
>>
>> I don't agree with that. The reason we keep relation locks until the
>> end of the transaction is to avoid surprising application code, not
>> because the system can't tolerate it. But here it's more likely that
>> retaining the lock will surprise the user.
>
> It's both. A bunch of code paths rely on early release only happening
> on catalogs. E.g., and that's just the first thing that comes to mind,
> cache invalidations are really only guaranteed to be delivered and
> visible to other cache accesses if the lock is released only after the
> end of the transaction. Object accesses like relation_open() accept
> invalidations *after* acquiring the lock. Combined with lock releases
> happening only *after* transaction commit, that guarantees an
> up-to-date view of the cache; but if others release the lock earlier
> and have cache invalidations pending, that doesn't work anymore. Now
> you can argue that it's possibly harmless here because there's
> presumably no way this can ever trigger cache invals - and you're
> probably right - but this isn't the only place with such an assumption,
> and you'd definitely have to argue *why* it's right.
I agree that code which issues cache invalidations needs to hold a
blocking lock until commit IF it's critical for other backends to see the
update. That's not always the case; for example, if an ALTER TABLE
statement only changes the fillfactor, the worst thing that happens if
some other backend fails to see the invalidation is that some
transactions continue to use the old value for a while. That's why we now
allow fillfactor and some other things to be changed with only
ShareUpdateExclusiveLock. That's not quite the same thing, but the effect
is the same: we issue invalidations that other backends aren't guaranteed
to see in a timely fashion, and it works OK. Yeah, some other session
might not see the change for a while, but we don't view that as a big
problem.

But I think in this case that argument is basically a straw man. A
utility that's called amcheck is obviously not issuing any cache
invalidations. It's probably not even changing anything at all; if it is,
maybe it should be called amrepair. So the fact that CLUSTER has to hold
AccessExclusiveLock until commit time really doesn't have anything to do
with whether amcheck needs to hold AccessShareLock until commit time.

I don't really understand your demand for "an argument why it's right".
My argument is simple: I can't think of a single thing that it would
break, and I don't believe there is any such thing. Furthermore, as Peter
points out, we do this kind of thing all the time in other places and it
indeed does not break things. I think you're trying to invent a coding
rule that doesn't exist, but I can't prove a negative. Your argument so
far is "yeah, well, maybe amcheck doesn't emit sinval traffic (duh?), but
it might do something else that breaks; prove to me that there is no such
thing". But that's just like asking me to prove that it's impossible to
exceed the speed of light.
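For concreteness, the fillfactor case mentioned above looks like this
(a sketch; `some_table` is a hypothetical table name, and the lock-level
behavior is as documented for ALTER TABLE's reduced lock levels):

```sql
-- Changing a storage parameter such as fillfactor takes only
-- ShareUpdateExclusiveLock, not AccessExclusiveLock, so concurrent
-- reads and writes are not blocked.
ALTER TABLE some_table SET (fillfactor = 70);
-- The resulting sinval message is not guaranteed to reach other
-- backends promptly; they may briefly keep using the old fillfactor,
-- which is considered acceptable.
```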
On information and belief, nobody currently knows how to do that, and
there is good scientific evidence that it cannot be done, but there's no
way to prove that someone won't in the future discover a refinement in
the physical laws of the universe as we understand them today which sheds
a different light on the situation.

From my point of view, it's likely to be common to want to run amcheck in
series on a bunch of indexes, like by writing a SELECT query against
pg_class or pg_index and passing the OIDs to amcheck. With the
retain-locks design, that query accumulates locks on a large number of
objects, possibly blowing out the lock table or blocking DDL. I think
that readily foreseeable problem ought to trump concerns about purely
hypothetical issues caused by releasing locks early. If you can come up
with an actual issue with releasing locks early that applies to this
particular case, then that's different, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers