As we know, the deadlock error message isn't the most friendly one. All the
client gets back is process PIDs, transaction IDs, and lock types. You have to
check the server log to retrieve lock details. This is tedious.
In one of my apps I even added a deadlock exception handler on the app side to
query pg_stat_activity for processes involved in the deadlock and include their
application names and queries in the exception message. It is a little racy but
works well enough.
Ideally I'd like to see that data coming from Postgres upon detecting the
deadlock. That's why I made this small change.
The change makes the deadlock error look as follows - the new element is the
application name or "" in its place if the activity
user doesn't match the current user (and the current use isn't a superuser):
postgres=*> SELECT * FROM q WHERE id = 2 FOR UPDATE;
ERROR: deadlock detected
DETAIL: Process 194520 (application_name: ) waits for
ShareLock on transaction 776; blocked by process 194521.
Process 194521 (application_name: woof) waits for ShareLock on transaction 775;
blocked by process 194520.
HINT: See server log for query details.
CONTEXT: while locking tuple (0,2) in relation "q"
I added a new LocalPgBackendCurrentActivity struct combining application name
and query string pointers and a sameProcess boolean. It is returned by value,
since it's small. Performance-wise, this is a a part of the deadlock handler,
if the DB hits it frequently, there are much more serious problems going on.
I could extend it by sending the queries back to the client, with an identical
security check, but this is a potential information exposure of whatever's in
the query plaintext. Another extension is to replace "(application_name:
)" with something better like "(unknown
application_name)", or even nothing.
Attached patch is for master, 2fb7560c. It doesn't contain any tests.
Let me know if you approve of the patch and if it makes sense to continue
working on it.
Best,
KarolineFrom f7d3b3854ece45b2ba3b79643e99c1acbb09abae Mon Sep 17 00:00:00 2001
From: Karoline Pauls
Date: Wed, 8 May 2024 23:49:46 +0100
Subject: [PATCH] call pgstat_get_backend_current_activity once
---
src/backend/storage/lmgr/deadlock.c | 31 +
src/backend/utils/activity/backend_status.c | 10 +++
src/include/utils/backend_status.h | 4 ++-
src/tools/pgindent/typedefs.list| 1 +
4 files changed, 29 insertions(+), 17 deletions(-)
diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index d94f65df..c08f66af 100644
--- a/src/backend/storage/lmgr/deadlock.c
+++ b/src/backend/storage/lmgr/deadlock.c
@@ -1075,17 +1075,22 @@ DeadLockReport(void)
StringInfoData logbuf; /* errdetail for server log */
StringInfoData locktagbuf;
int i;
+ bool isSuperuser = superuser();
+ LocalPgBackendCurrentActivity *activities = \
+ (LocalPgBackendCurrentActivity *) palloc(sizeof(LocalPgBackendCurrentActivity) * nDeadlockDetails);
initStringInfo(&clientbuf);
initStringInfo(&logbuf);
initStringInfo(&locktagbuf);
- /* Generate the "waits for" lines sent to the client */
+ /* Generate the "waits for" lines sent to the client and the log */
for (i = 0; i < nDeadlockDetails; i++)
{
DEADLOCK_INFO *info = &deadlockDetails[i];
int nextpid;
- LocalPgBackendCurrentActivity activity = pgstat_get_backend_current_activity(info->pid, true);
+ const char *lockModeName = GetLockmodeName(info->locktag.locktag_lockmethodid, info->lockmode);
+
+ activities[i] = pgstat_get_backend_current_activity(info->pid);
/* The last proc waits for the first one... */
if (i < nDeadlockDetails - 1)
@@ -1104,29 +1109,33 @@ DeadLockReport(void)
appendStringInfo(&clientbuf,
_("Process %d (application_name: %s) waits for %s on %s; blocked by process %d."),
info->pid,
- activity.st_appname,
- GetLockmodeName(info->locktag.locktag_lockmethodid,
- info->lockmode),
+ (activities[i].sameUser || isSuperuser) ? activities[i].st_appname : "",
+ lockModeName,
locktagbuf.data,
nextpid);
- }
- /* Duplicate all the above for the server ... */
- appendBinaryStringInfo(&logbuf, clientbuf.data, clientbuf.len);
+ /* the difference is that we don't do user checking for the server log */
+ appendStringInfo(&logbuf,
+ _("Process %d (application_name: %s) waits for %s on %s; blocked by process %d."),
+ info->pid,
+ activities[i].st_appname,
+ lockModeName,
+ locktagbuf.data,
+ nextpid);
+ }
/* ... and add info about query strings */
for (i = 0; i < nDeadlockDetails; i++)
{
DEADLOCK_INFO *info = &deadlockDetails[i];
- LocalPgBackendCurrentActivity activi