On Tue, Nov 18, 2014 at 4:40 AM, Jeff Davis <pg...@j-davis.com> wrote: >> To reiterate the basic problem here, if we do nothing at all about the >> lock manager, a parallel backend can stall trying to grab an >> AccessExclusiveLock that the user backend alread holds, either because >> the user backend holds an AccessExclusiveLock as well, or because some >> other process is waiting for one, we'll deadlock and not notice. > > My feeling is that we should keep the concept of a group and group > leader from your patch, and improve the deadlock detector to make use of > that information (looking at the code, it looks doable but not trivial). > But unless I am missing something, we should separate out the lock > sharing, and defer that until later.
Doing as you propose solves this problem: 1. Backend A-1 acquires AccessShareLock on relation R. 2. Backend B waits for AccessExclusiveLock on relation R. 3. Backend A-2 waits for AccessShareLock on relation R. Knowing that A-1 and A-2 are related, the deadlock detector can move A-2 ahead of B and grant the lock. All is well. But your proposal does not solve this problem: 1. Backend A-1 acquires AccessExclusiveLock on relation R. 2. Backend A-2 waits for AccessShareLock on relation R. The good news is that the deadlock detector should realize that since A-1 and A-2 are in the same group, this is a deadlock. And it can abort either A-1 or A-2, which will eventually abort them both. The bad news, to borrow a phrase from Peter Geoghegan, is that it's an unprincipled deadlock; a user confronted with the news that her parallel scan has self-deadlocked will be justifiably dismayed. It's been proposed by Andres, and maybe a few other people, that we can detect this situation and just not use parallelism in these cases, but that's actually quite hard to do. Of course, it's pretty easy for a backend A-1 contemplating parallel scan of relation R to notice that it holds a lock that conflicts with the AccessShareLock it expects a prospective parallel backend A-2 to attempt to acquire. I don't think there's a lock manager method for that today, but we could easily add one. However, A-1 might incidentally hold other locks (from earlier statements in the transaction) that conflict with locks that will be sought by A-2, and as discussed *extensively* upthread, it is not easy to predict all of the locks a parellel backend might try to take: even seemingly-trivial code like the equality operators for built-in data types can sometimes take relation locks, and we have no infrastructure for predicting which ones. Even if you could somehow solve that problem, there's also the issue of plan caching. If we make a parallel plan for an SQL statement inside a PL/pgsql procedure because our backend holds no strong locks, and then we acquire a strong lock on a relevant relation, we have to invalidate that plan. I think that will add complexity inside the plan cache machinery. It seems to me that it would be better to handle these issues inside the lock manager rather than letting them creep into other parts of the system. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers