> On 2020-11-17 12:55:01 -0300, Alvaro Herrera wrote:
> > ... ah, but I realize now that this means that we can use shared lock
> > here, not exclusive, which is already an enormous improvement.  That's
> > because ->pgxactoff can only be changed with exclusive lock held; so as
> > long as we hold shared, the array item cannot move.
> 
> Uh, wait a second. The acquisition of this lock hasn't been affected by
> the snapshot scalability changes, and therefore are unrelated to
> ->pgxactoff changing or not.

I'm writing a response trying to thoroughly analyze this, but I also
wanted to report that ProcSleep is being a bit generous with what it
calls "as quickly as possible" here:

            LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

            /*
             * Only do it if the worker is not working to protect against Xid
             * wraparound.
             */
            statusFlags = ProcGlobal->statusFlags[autovac->pgxactoff];
            if ((statusFlags & PROC_IS_AUTOVACUUM) &&
                !(statusFlags & PROC_VACUUM_FOR_WRAPAROUND))
            {
                int            pid = autovac->pid;
                StringInfoData locktagbuf;
                StringInfoData logbuf;    /* errdetail for server log */

                initStringInfo(&locktagbuf);
                initStringInfo(&logbuf);
                DescribeLockTag(&locktagbuf, &lock->tag);
                appendStringInfo(&logbuf,
                                 _("Process %d waits for %s on %s."),
                                 MyProcPid,
                                 GetLockmodeName(lock->tag.locktag_lockmethodid,
                                                 lockmode),
                                 locktagbuf.data);

                /* release lock as quickly as possible */
                LWLockRelease(ProcArrayLock);

The amount of stuff that this is doing with ProcArrayLock held
exclusively seems a bit excessive; it sounds like we could copy the
values we need first, release the lock, and *then* do all that memory
allocation and string printing -- it's a lock of code for such a
contended lock.  Anytime a process sees itself as blocked by autovacuum
and wants to signal it, there's a ProcArrayLock hiccup (I didn't
actually measure it, but it's at least five function calls).  We could
make this more concurrent by copying lock->tag to a local variable,
releasing the lock, then doing all the string formatting and printing.
See attached quickly.patch.

Now, when this code was written (d7318d43d, 2012), this was a LOG
message; it was demoted to DEBUG1 later (d8f15c95bec, 2015).  I think it
would be fair to ... remove the message?  Or go back to Simon's original
formulation from commit acac68b2bca, which had this message as DEBUG2
without any string formatting.

Thoughts?
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index d1738c65f5..caefff41e2 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1325,20 +1325,22 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 				int			pid = autovac->pid;
 				StringInfoData locktagbuf;
 				StringInfoData logbuf;	/* errdetail for server log */
+				LOCKTAG		locktag_copy;
+
+				/* release lock as quickly as possible */
+				locktag_copy = lock->tag;
+				LWLockRelease(ProcArrayLock);
 
 				initStringInfo(&locktagbuf);
 				initStringInfo(&logbuf);
-				DescribeLockTag(&locktagbuf, &lock->tag);
+				DescribeLockTag(&locktagbuf, &locktag_copy);
 				appendStringInfo(&logbuf,
 								 _("Process %d waits for %s on %s."),
 								 MyProcPid,
-								 GetLockmodeName(lock->tag.locktag_lockmethodid,
+								 GetLockmodeName(locktag_copy.locktag_lockmethodid,
 												 lockmode),
 								 locktagbuf.data);
 
-				/* release lock as quickly as possible */
-				LWLockRelease(ProcArrayLock);
-
 				/* send the autovacuum worker Back to Old Kent Road */
 				ereport(DEBUG1,
 						(errmsg("sending cancel to blocking autovacuum PID %d",

Reply via email to