Robert Haas <robertmh...@gmail.com> writes:
> Yeah, you're right. Attached is a draft patch that tries to clean up
> that and a bunch of other things that you raised.
I reviewed this patch.  I don't have any particular comment on the changes
in deadlock.c; I haven't studied the code closely enough to know if those
are right.  I did think that the commentary elsewhere could be improved
some more, so attached is a delta patch on top of yours that shows what I
suggest.

I've not done anything here about removing lockGroupLeaderIdentifier,
although I will be happy to send a patch for that unless you know of some
reason I missed why it's necessary.

			regards, tom lane
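To be concrete about that last point: as far as I can see, the only thing
lockGroupLeaderIdentifier buys us is the recycled-PGPROC check in
BecomeLockGroupMember, and that looks derivable from fields we already
have.  Untested sketch only, so take it for what it's worth:

/*
 * Untested sketch, not part of the attached patch: join the leader's
 * lock group without consulting lockGroupLeaderIdentifier.  The premise
 * is that "leader->pid == pid && leader->lockGroupLeader == leader"
 * identifies a live group leader just as reliably: a recycled PGPROC
 * would have a different pid, or would no longer point lockGroupLeader
 * at itself.  Assumes the surrounding PostgreSQL source, of course.
 */
bool
BecomeLockGroupMember(PGPROC *leader, int pid)
{
	LWLock	   *leader_lwlock;
	bool		ok = false;

	/* Group leader can't become member of a group */
	Assert(MyProc->lockGroupLeader == NULL);

	/* PID must be valid. */
	Assert(pid != 0);

	/*
	 * pgprocno is assigned during shared-memory initialization and never
	 * changes, so this acquires the correct partition lock even if the
	 * leader PGPROC is concurrently being recycled.
	 */
	leader_lwlock = LockHashPartitionLockByProc(leader);
	LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);

	/* Try to join the group */
	if (leader->pid == pid && leader->lockGroupLeader == leader)
	{
		ok = true;
		MyProc->lockGroupLeader = leader;
		dlist_push_tail(&leader->lockGroupMembers, &MyProc->lockGroupLink);
	}
	LWLockRelease(leader_lwlock);

	return ok;
}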
diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index 8eaa91c..5a62c8f 100644
*** a/src/backend/storage/lmgr/README
--- b/src/backend/storage/lmgr/README
*************** acquire an AccessShareLock while the oth
*** 603,609 ****
  This might seem dangerous and could be in some cases (more on that below), but
  if we didn't do this then parallel query would be extremely prone to
  self-deadlock.  For example, a parallel query against a relation on which the
! leader had already AccessExclusiveLock would hang, because the workers would
  try to lock the same relation and be blocked by the leader; yet the leader
  can't finish until it receives completion indications from all workers.  An
  undetected deadlock results.  This is far from the only scenario where such a
--- 603,609 ----
  This might seem dangerous and could be in some cases (more on that below), but
  if we didn't do this then parallel query would be extremely prone to
  self-deadlock.  For example, a parallel query against a relation on which the
! leader already had AccessExclusiveLock would hang, because the workers would
  try to lock the same relation and be blocked by the leader; yet the leader
  can't finish until it receives completion indications from all workers.  An
  undetected deadlock results.  This is far from the only scenario where such a
*************** quickly enough for this interlock to fai
*** 664,690 ****
  A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
  query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field back to point to its own
  PGPROC. When a parallel worker starts up, it points this field at the leader,
  with the above-mentioned interlock. The lockGroupMembers field is only used in
! the leader; it is a list of the workers.  The lockGroupLink field is used to
! link the leader and all workers into the leader's list.  All of these fields
! are protected by the lock manager locks; the lock manager lock that protects
! the lockGroupLeaderIdentifier, lockGroupLeader, and lockGroupMembers fields in
! a given PGPROC is chosen by taking pgprocno modulo the number of lock manager
! partitions.  This unusual arrangement has a major advantage: the deadlock
! detector can count on the fact that no lockGroupLeader field can change while
! the deadlock detector is running, because it knows that it holds all the lock
! manager locks.  A PGPROC's lockGroupLink is protected by the lock manager
! partition lock for the group of which it is a part.  If it's not part of any
! group, this field is unused and can only be examined or modified by the
! process that owns the PGPROC.

  We institute a further coding rule that a process cannot join or leave a lock
  group while owning any PROCLOCK.  Therefore, given a lock manager lock
! sufficient to examine PROCLOCK *proclock, it also safe to examine
! proclock->tag.myProc->lockGroupLeader (but not the other fields mentioned in
! the previous paragraph).

  User Locks (Advisory Locks)
  ---------------------------
--- 664,689 ----
  A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
  query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field to point to its own
  PGPROC. When a parallel worker starts up, it points this field at the leader,
  with the above-mentioned interlock. The lockGroupMembers field is only used in
! the leader; it is a list of the member PGPROCs of the lock group (the leader
! and all workers).  The lockGroupLink field is the list link for this list.
! 
! All four of these fields are considered to be protected by a lock manager
! partition lock.  The partition lock that protects these fields within a given
! lock group is chosen by taking the leader's pgprocno modulo the number of lock
! manager partitions.  This unusual arrangement has a major advantage: the
! deadlock detector can count on the fact that no lockGroupLeader field can
! change while the deadlock detector is running, because it knows that it holds
! all the lock manager locks.  Also, holding this single lock allows safe
! manipulation of the lockGroupMembers list for the lock group.

  We institute a further coding rule that a process cannot join or leave a lock
  group while owning any PROCLOCK.  Therefore, given a lock manager lock
! sufficient to examine a PROCLOCK *proclock, it is also safe to examine
! proclock->tag.myProc->lockGroupLeader (but not the other lock-group-related
! fields of the PGPROC).

  User Locks (Advisory Locks)
  ---------------------------
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index d81746d..21f92dd 100644
*** a/src/backend/storage/lmgr/deadlock.c
--- b/src/backend/storage/lmgr/deadlock.c
***************
*** 40,47 ****
   * is, it will be the leader rather than any other member of the lock group.
   * The group leaders act as representatives of the whole group even though
   * those particular processes need not be waiting at all.  There will be at
!  * least one member of the group on the wait queue for the given lock, maybe
!  * more.
   */
  typedef struct
  {
--- 40,47 ----
   * is, it will be the leader rather than any other member of the lock group.
   * The group leaders act as representatives of the whole group even though
   * those particular processes need not be waiting at all.  There will be at
!  * least one member of the waiter's lock group on the wait queue for the given
!  * lock, maybe more.
   */
  typedef struct
  {
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index cff8c08..deb88c7 100644
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
*************** ProcKill(int code, Datum arg)
*** 826,835 ****
  			leader->lockGroupLeader = NULL;
  			if (leader != MyProc)
  			{
- 				procgloballist = leader->procgloballist;
- 
  				/* Leader exited first; return its PGPROC. */
  				SpinLockAcquire(ProcStructLock);
  				leader->links.next = (SHM_QUEUE *) *procgloballist;
  				*procgloballist = leader;
  				SpinLockRelease(ProcStructLock);
--- 826,834 ----
  			leader->lockGroupLeader = NULL;
  			if (leader != MyProc)
  			{
  				/* Leader exited first; return its PGPROC. */
  				SpinLockAcquire(ProcStructLock);
+ 				procgloballist = leader->procgloballist;
  				leader->links.next = (SHM_QUEUE *) *procgloballist;
  				*procgloballist = leader;
  				SpinLockRelease(ProcStructLock);
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1004,1010 ****
  		int			i;
  
  		/*
! 		 * If group locking is in use, locks held my members of my locking group
  		 * need to be included in myHeldLocks.
  		 */
  		if (leader != NULL)
--- 1003,1009 ----
  		int			i;
  
  		/*
! 		 * If group locking is in use, locks held by members of my locking group
  		 * need to be included in myHeldLocks.
  		 */
  		if (leader != NULL)
*************** BecomeLockGroupMember(PGPROC *leader, in
*** 1802,1810 ****
  	/* PID must be valid. */
  	Assert(pid != 0);
  
! 	/* Try to join the group. */
  	leader_lwlock = LockHashPartitionLockByProc(leader);
  	LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);
  	if (leader->lockGroupLeaderIdentifier == pid)
  	{
  		ok = true;
--- 1801,1817 ----
  	/* PID must be valid. */
  	Assert(pid != 0);
  
! 	/*
! 	 * Get lock protecting the group fields.  Note LockHashPartitionLockByProc
! 	 * accesses leader->pgprocno in a PGPROC that might be free.  This is safe
! 	 * because all PGPROCs' pgprocno fields are set during shared memory
! 	 * initialization and never change thereafter; so we will acquire the
! 	 * correct lock even if the leader PGPROC is in process of being recycled.
! 	 */
  	leader_lwlock = LockHashPartitionLockByProc(leader);
  	LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);
+ 
+ 	/* Try to join the group */
  	if (leader->lockGroupLeaderIdentifier == pid)
  	{
  		ok = true;
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index a3e19e1..2053bfe 100644
*** a/src/include/storage/lock.h
--- b/src/include/storage/lock.h
*************** typedef enum
*** 477,486 ****
   * by one of the lock hash partition locks.  Since the deadlock detector
   * acquires all such locks anyway, this makes it safe for it to access these
   * fields without doing anything extra.  To avoid contention as much as
!  * possible, we map different PGPROCs to different partition locks.
   */
! #define LockHashPartitionLockByProc(p) \
! 	LockHashPartitionLock((p)->pgprocno)
  
  /*
   * function prototypes
--- 477,487 ----
   * by one of the lock hash partition locks.  Since the deadlock detector
   * acquires all such locks anyway, this makes it safe for it to access these
   * fields without doing anything extra.  To avoid contention as much as
!  * possible, we map different PGPROCs to different partition locks.  The lock
!  * used for a given lock group is determined by the group leader's pgprocno.
   */
! #define LockHashPartitionLockByProc(leader_pgproc) \
! 	LockHashPartitionLock((leader_pgproc)->pgprocno)
  
  /*
   * function prototypes
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 81bddbb..a9405ce 100644
*** a/src/include/storage/proc.h
--- b/src/include/storage/proc.h
*************** struct PGPROC
*** 152,158 ****
  	 */
  	TransactionId procArrayGroupMemberXid;
  
! 	/* Per-backend LWLock.  Protects fields below. */
  	LWLock		backendLock;
  
  	/* Lock manager data, recording fast-path locks taken by this backend. */
--- 152,158 ----
  	 */
  	TransactionId procArrayGroupMemberXid;
  
! 	/* Per-backend LWLock.  Protects fields below (but not group fields). */
  	LWLock		backendLock;
  
  	/* Lock manager data, recording fast-path locks taken by this backend. */
*************** struct PGPROC
*** 163,173 ****
  								 * lock */
  
  	/*
! 	 * Support for lock groups.  Use LockHashPartitionLockByProc to get the
! 	 * LWLock protecting these fields.
  	 */
  	int			lockGroupLeaderIdentifier;	/* MyProcPid, if I'm a leader */
! 	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a follower */
  	dlist_head	lockGroupMembers;	/* list of members, if I'm a leader */
  	dlist_node	lockGroupLink;	/* my member link, if I'm a member */
  };
--- 163,173 ----
  								 * lock */
  
  	/*
! 	 * Support for lock groups.  Use LockHashPartitionLockByProc on the group
! 	 * leader to get the LWLock protecting these fields.
  	 */
  	int			lockGroupLeaderIdentifier;	/* MyProcPid, if I'm a leader */
! 	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a member */
  	dlist_head	lockGroupMembers;	/* list of members, if I'm a leader */
  	dlist_node	lockGroupLink;	/* my member link, if I'm a member */
  };
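One further illustrative note for anyone reading along who isn't steeped
in this code: the property the README text leans on is that every member
of a lock group resolves to the same partition lock, because the mapping
uses the leader's pgprocno and pgprocno never changes.  The toy program
below (standalone, not PostgreSQL code; ToyProc and toy_partition_for
are made-up names, and only the modulo rule mirrors the real
LockHashPartition mapping) shows the shape of that invariant:

#include <stdio.h>

#define NUM_LOCK_PARTITIONS 16		/* default in lwlock.h is 1 << 4 */

typedef struct ToyProc
{
	int			pgprocno;		/* fixed at shmem init, never changes */
	struct ToyProc *lockGroupLeader;	/* NULL if not in a lock group */
} ToyProc;

/* Mimics "leader's pgprocno modulo the number of partitions". */
static int
toy_partition_for(const ToyProc *proc)
{
	return proc->lockGroupLeader->pgprocno % NUM_LOCK_PARTITIONS;
}

int
main(void)
{
	ToyProc		leader = {42, NULL};
	ToyProc		worker1 = {7, &leader};
	ToyProc		worker2 = {99, &leader};

	leader.lockGroupLeader = &leader;	/* a leader points at itself */

	/* All three agree, since they all resolve through the leader. */
	printf("leader  -> partition %d\n", toy_partition_for(&leader));
	printf("worker1 -> partition %d\n", toy_partition_for(&worker1));
	printf("worker2 -> partition %d\n", toy_partition_for(&worker2));
	return 0;
}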