Robert Haas <robertmh...@gmail.com> writes:
> Yeah, you're right.  Attached is a draft patch that tries to clean up
> that and a bunch of other things that you raised.

I reviewed this patch.  I don't have any particular comment on the changes
in deadlock.c; I haven't studied the code closely enough to know if those
are right.  I did think that the commentary elsewhere could be improved
some more, so attached is a delta patch on top of yours that shows what
I suggest.

I've not done anything here about removing lockGroupLeaderIdentifier,
although I will be happy to send a patch for that unless you know of
some reason, which I've missed, why that field is necessary.
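
To be concrete, the sort of thing I have in mind (only an untested sketch,
and the exact test may need more thought) is to have BecomeLockGroupMember
verify the leader by checking leader->pid against the passed-in pid and
requiring that the leader's lockGroupLeader still point back at its own
PGPROC, instead of consulting lockGroupLeaderIdentifier:

	/* Try to join the group (sketch, sans lockGroupLeaderIdentifier) */
	if (leader->pid == pid && leader->lockGroupLeader == leader)
	{
		/*
		 * The PGPROC still carries the expected PID and still believes
		 * itself to be a lock group leader, so it cannot have been
		 * recycled since we looked it up.
		 */
		ok = true;
		MyProc->lockGroupLeader = leader;
		dlist_push_tail(&leader->lockGroupMembers, &MyProc->lockGroupLink);
	}

The group-joining steps in the body are just what the existing code already
does; only the test would change.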

                        regards, tom lane

diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index 8eaa91c..5a62c8f 100644
*** a/src/backend/storage/lmgr/README
--- b/src/backend/storage/lmgr/README
*************** acquire an AccessShareLock while the oth
*** 603,609 ****
  This might seem dangerous and could be in some cases (more on that below), but
  if we didn't do this then parallel query would be extremely prone to
  self-deadlock.  For example, a parallel query against a relation on which the
! leader had already AccessExclusiveLock would hang, because the workers would
  try to lock the same relation and be blocked by the leader; yet the leader
  can't finish until it receives completion indications from all workers.  An
  undetected deadlock results.  This is far from the only scenario where such a
--- 603,609 ----
  This might seem dangerous and could be in some cases (more on that below), but
  if we didn't do this then parallel query would be extremely prone to
  self-deadlock.  For example, a parallel query against a relation on which the
! leader already had AccessExclusiveLock would hang, because the workers would
  try to lock the same relation and be blocked by the leader; yet the leader
  can't finish until it receives completion indications from all workers.  An
  undetected deadlock results.  This is far from the only scenario where such a
*************** quickly enough for this interlock to fai
*** 664,690 ****
  
  A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
  query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field back to point to its own
  PGPROC. When a parallel worker starts up, it points this field at the leader,
  with the above-mentioned interlock. The lockGroupMembers field is only used in
! the leader; it is a list of the workers. The lockGroupLink field is used to
! link the leader and all workers into the leader's list. All of these fields
! are protected by the lock manager locks; the lock manager lock that protects
! the lockGroupLeaderIdentifier, lockGroupLeader, and lockGroupMembers fields in
! a given PGPROC is chosen by taking pgprocno modulo the number of lock manager
! partitions. This unusual arrangement has a major advantage: the deadlock
! detector can count on the fact that no lockGroupLeader field can change while
! the deadlock detector is running, because it knows that it holds all the lock
! manager locks.  A PGPROC's lockGroupLink is protected by the lock manager
! partition lock for the group of which it is a part.  If it's not part of any
! group, this field is unused and can only be examined or modified by the
! process that owns the PGPROC.
  
  We institute a further coding rule that a process cannot join or leave a lock
  group while owning any PROCLOCK.  Therefore, given a lock manager lock
! sufficient to examine PROCLOCK *proclock, it also safe to examine
! proclock->tag.myProc->lockGroupLeader (but not the other fields mentioned in
! the previous paragraph).
  
  User Locks (Advisory Locks)
  ---------------------------
--- 664,689 ----
  
  A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
  query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field to point to its own
  PGPROC. When a parallel worker starts up, it points this field at the leader,
  with the above-mentioned interlock. The lockGroupMembers field is only used in
! the leader; it is a list of the member PGPROCs of the lock group (the leader
! and all workers). The lockGroupLink field is the list link for this list.
! 
! All four of these fields are considered to be protected by a lock manager
! partition lock.  The partition lock that protects these fields within a given
! lock group is chosen by taking the leader's pgprocno modulo the number of lock
! manager partitions.  This unusual arrangement has a major advantage: the
! deadlock detector can count on the fact that no lockGroupLeader field can
! change while the deadlock detector is running, because it knows that it holds
! all the lock manager locks.  Also, holding this single lock allows safe
! manipulation of the lockGroupMembers list for the lock group.
  
  We institute a further coding rule that a process cannot join or leave a lock
  group while owning any PROCLOCK.  Therefore, given a lock manager lock
! sufficient to examine a PROCLOCK *proclock, it is also safe to examine
! proclock->tag.myProc->lockGroupLeader (but not the other lock-group-related
! fields of the PGPROC).
  
  User Locks (Advisory Locks)
  ---------------------------
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index d81746d..21f92dd 100644
*** a/src/backend/storage/lmgr/deadlock.c
--- b/src/backend/storage/lmgr/deadlock.c
***************
*** 40,47 ****
   * is, it will be the leader rather than any other member of the lock group.
   * The group leaders act as representatives of the whole group even though
   * those particular processes need not be waiting at all.  There will be at
!  * least one member of the group on the wait queue for the given lock, maybe
!  * more.
   */
  typedef struct
  {
--- 40,47 ----
   * is, it will be the leader rather than any other member of the lock group.
   * The group leaders act as representatives of the whole group even though
   * those particular processes need not be waiting at all.  There will be at
!  * least one member of the waiter's lock group on the wait queue for the given
!  * lock, maybe more.
   */
  typedef struct
  {
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index cff8c08..deb88c7 100644
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
*************** ProcKill(int code, Datum arg)
*** 826,835 ****
  			leader->lockGroupLeader = NULL;
  			if (leader != MyProc)
  			{
- 				procgloballist = leader->procgloballist;
- 
  				/* Leader exited first; return its PGPROC. */
  				SpinLockAcquire(ProcStructLock);
  				leader->links.next = (SHM_QUEUE *) *procgloballist;
  				*procgloballist = leader;
  				SpinLockRelease(ProcStructLock);
--- 826,834 ----
  			leader->lockGroupLeader = NULL;
  			if (leader != MyProc)
  			{
  				/* Leader exited first; return its PGPROC. */
  				SpinLockAcquire(ProcStructLock);
+ 				procgloballist = leader->procgloballist;
  				leader->links.next = (SHM_QUEUE *) *procgloballist;
  				*procgloballist = leader;
  				SpinLockRelease(ProcStructLock);
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1004,1010 ****
  	int			i;
  
  	/*
! 	 * If group locking is in use, locks held my members of my locking group
  	 * need to be included in myHeldLocks.
  	 */
  	if (leader != NULL)
--- 1003,1009 ----
  	int			i;
  
  	/*
! 	 * If group locking is in use, locks held by members of my locking group
  	 * need to be included in myHeldLocks.
  	 */
  	if (leader != NULL)
*************** BecomeLockGroupMember(PGPROC *leader, in
*** 1802,1810 ****
  	/* PID must be valid. */
  	Assert(pid != 0);
  
! 	/* Try to join the group. */
  	leader_lwlock = LockHashPartitionLockByProc(leader);
  	LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);
  	if (leader->lockGroupLeaderIdentifier == pid)
  	{
  		ok = true;
--- 1801,1817 ----
  	/* PID must be valid. */
  	Assert(pid != 0);
  
! 	/*
! 	 * Get lock protecting the group fields.  Note LockHashPartitionLockByProc
! 	 * accesses leader->pgprocno in a PGPROC that might be free.  This is safe
! 	 * because all PGPROCs' pgprocno fields are set during shared memory
! 	 * initialization and never change thereafter; so we will acquire the
! 	 * correct lock even if the leader PGPROC is in process of being recycled.
! 	 */
  	leader_lwlock = LockHashPartitionLockByProc(leader);
  	LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);
+ 
+ 	/* Try to join the group */
  	if (leader->lockGroupLeaderIdentifier == pid)
  	{
  		ok = true;
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index a3e19e1..2053bfe 100644
*** a/src/include/storage/lock.h
--- b/src/include/storage/lock.h
*************** typedef enum
*** 477,486 ****
   * by one of the lock hash partition locks.  Since the deadlock detector
   * acquires all such locks anyway, this makes it safe for it to access these
   * fields without doing anything extra.  To avoid contention as much as
!  * possible, we map different PGPROCs to different partition locks.
   */
! #define LockHashPartitionLockByProc(p) \
! 	LockHashPartitionLock((p)->pgprocno)
  
  /*
   * function prototypes
--- 477,487 ----
   * by one of the lock hash partition locks.  Since the deadlock detector
   * acquires all such locks anyway, this makes it safe for it to access these
   * fields without doing anything extra.  To avoid contention as much as
!  * possible, we map different PGPROCs to different partition locks.  The lock
!  * used for a given lock group is determined by the group leader's pgprocno.
   */
! #define LockHashPartitionLockByProc(leader_pgproc) \
! 	LockHashPartitionLock((leader_pgproc)->pgprocno)
  
  /*
   * function prototypes
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 81bddbb..a9405ce 100644
*** a/src/include/storage/proc.h
--- b/src/include/storage/proc.h
*************** struct PGPROC
*** 152,158 ****
  	 */
  	TransactionId	procArrayGroupMemberXid;
  
! 	/* Per-backend LWLock.  Protects fields below. */
  	LWLock		backendLock;
  
  	/* Lock manager data, recording fast-path locks taken by this backend. */
--- 152,158 ----
  	 */
  	TransactionId	procArrayGroupMemberXid;
  
! 	/* Per-backend LWLock.  Protects fields below (but not group fields). */
  	LWLock		backendLock;
  
  	/* Lock manager data, recording fast-path locks taken by this backend. */
*************** struct PGPROC
*** 163,173 ****
  												 * lock */
  
  	/*
! 	 * Support for lock groups.  Use LockHashPartitionLockByProc to get the
! 	 * LWLock protecting these fields.
  	 */
  	int			lockGroupLeaderIdentifier;	/* MyProcPid, if I'm a leader */
! 	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a follower */
  	dlist_head	lockGroupMembers;	/* list of members, if I'm a leader */
  	dlist_node  lockGroupLink;		/* my member link, if I'm a member */
  };
--- 163,173 ----
  												 * lock */
  
  	/*
! 	 * Support for lock groups.  Use LockHashPartitionLockByProc on the group
! 	 * leader to get the LWLock protecting these fields.
  	 */
  	int			lockGroupLeaderIdentifier;	/* MyProcPid, if I'm a leader */
! 	PGPROC	   *lockGroupLeader;	/* lock group leader, if I'm a member */
  	dlist_head	lockGroupMembers;	/* list of members, if I'm a leader */
  	dlist_node  lockGroupLink;		/* my member link, if I'm a member */
  };