Hi,

On Mon, Jan 06, 2025 at 10:08:36PM +0100, Guillaume Lelarge wrote:
> Hello,
> 
> Le lun. 16 déc. 2024 à 22:00, Melanie Plageman <melanieplage...@gmail.com>
> a écrit :
> 
> > Hi,
> >
> > Users wishing to debug slow connection establishment have little
> > visibility into which steps take the most time. We don't expose any
> > stats and none of the logging includes durations.
> >
> > The attached patch logs durations for authentication, forking the
> > Postgres backend, and the end-to-end connection establishment duration
> > starting from when the postmaster calls accept() and ending the first
> > time the forked backend is ready for query.
> >
> > As an example of how this could be useful, I artificially slowed down
> > authentication and you can see that that is now visible in logs:
> >
> > LOG:  backend ready for query. pid=86341. socket=10. connection
> > establishment times (ms): total: 8, fork: 0, authentication: 0
> > LOG:  backend ready for query. pid=86794. socket=10. connection
> > establishment times (ms): total: 108, fork: 0, authentication: 100
> >
> > Two notes on implementation:
> >
> > To make this portable, the timestamps captured in the postmaster
> > (socket creation time, fork initiation time) are passed through the
> > ClientSocket and BackendStartupData structures instead of simply saved
> > in backend local memory inherited by the child process.
> >
> > Secondly, I used TimestampTz for the times, but it is pretty unclear
> > to me when instr_time should be used vs TimestampTz for such things.
> > instr_time says it is more portable, but there are all sorts of places
> > that use TimestampTz that would probably need to be portable.
> > Also, INSTR_TIME_SUBTRACT() isn't overflow safe, which must be because
> > the actual data type of ticks is platform dependent and so the caller
> > is responsible for not passing it values that would overflow when
> > subtracted.
> >
> >
> Just a quick note. I like this patch a lot and it would help to diagnose
> some situations our customers may have. I tried the patch on my laptop, and
> it works fine. I'll find some time to read the code as well, but in the
> meantime, this looks like a nice thing to have in PostgreSQL.

Also +1 on the idea.

The patch needed a rebase due to 34486b6092e. I did it in v2 attached (it's
a minor rebase due to the AmRegularBackendProcess() introduction in 
miscadmin.h).

v2 could rely on AmRegularBackendProcess() instead of AmClientBackendProcess()
but I kept it with AmClientBackendProcess() to reduce "my" changes as compared 
to
v1.

Regarding the TimestampTz vs instr_time choice, we have things like:

+ TimestampTz fork_time = ((BackendStartupData *) startup_data)->fork_time;
+ TimestampTz cur_time = GetCurrentTimestamp();
+
+ conn_timing.fork_duration = TimestampDifferenceMicroseconds(fork_time,
+                                                             cur_time);

but looking at TimestampDifferenceMicroseconds():

"
    /* If start_time is in the future or now, no time has elapsed */
    if (start_time >= stop_time)
        return 0;
"

I think that it can happen due to time changes.

So with TimestampTz, we would:

1. return 0 if we moved the time backward
2. provide an inflated duration including the time jump (if the time move
forward).

But with instr_time (and on systems that support CLOCK_MONOTONIC) then
pg_clock_gettime_ns() should not be affected by system time change IIUC.

Though time changes are "rare", given the fact that those metrics could provide
"inaccurate" measurements during that particular moment (time change) then it
might be worth considering instr_time instead for this particular metric.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From 570f72f4bc1eb71216f898246d4c752034422c24 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Tue, 3 Dec 2024 18:02:15 -0500
Subject: [PATCH v2] Add connection establishment duration logging

Add durations for several key parts of connection establishment when
log_connections is enabled.

For a new incoming conneciton, starting from when the postmaster gets a
socket from accept() and ending when the forked child backend is first
ready for query, there are multiple steps that could each take longer
than expected due to external factors. Provide visibility into
authentication and fork duration as well as the end-to-end connection
establishment time with logging.

To make this portable, the timestamps captured in the postmaster (socket
creation time, fork initiation time) are passed through the ClientSocket
and BackendStartupData structures instead of simply saved in backend
local memory inherited by the child process.
---
 src/backend/postmaster/launch_backend.c | 25 +++++++++++++++++++++++++
 src/backend/postmaster/postmaster.c     |  8 ++++++++
 src/backend/tcop/postgres.c             | 23 +++++++++++++++++++++++
 src/backend/utils/adt/timestamp.c       | 13 +++++++++++++
 src/backend/utils/init/globals.c        |  2 ++
 src/backend/utils/init/postinit.c       | 11 +++++++++++
 src/include/libpq/libpq-be.h            |  1 +
 src/include/miscadmin.h                 |  3 +++
 src/include/postmaster/postmaster.h     |  7 +++++++
 src/include/tcop/backend_startup.h      |  3 +++
 src/include/utils/timestamp.h           |  3 +++
 src/tools/pgindent/typedefs.list        |  1 +
 12 files changed, 100 insertions(+)
  34.0% src/backend/postmaster/
  28.4% src/backend/tcop/
  10.5% src/backend/utils/adt/
  13.0% src/backend/utils/init/
   3.3% src/include/postmaster/
   3.4% src/include/utils/
   6.5% src/include/

diff --git a/src/backend/postmaster/launch_backend.c b/src/backend/postmaster/launch_backend.c
index a97a1eda6da..009c4083906 100644
--- a/src/backend/postmaster/launch_backend.c
+++ b/src/backend/postmaster/launch_backend.c
@@ -232,6 +232,10 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 
 	Assert(IsPostmasterEnvironment && !IsUnderPostmaster);
 
+	/* Capture time Postmaster initiates fork for logging */
+	if (child_type == B_BACKEND)
+		((BackendStartupData *) startup_data)->fork_time = GetCurrentTimestamp();
+
 #ifdef EXEC_BACKEND
 	pid = internal_forkexec(child_process_kinds[child_type].name, child_slot,
 							startup_data, startup_data_len, client_sock);
@@ -240,6 +244,16 @@ postmaster_child_launch(BackendType child_type, int child_slot,
 	pid = fork_process();
 	if (pid == 0)				/* child */
 	{
+		/* Calculate total fork duration in child backend for logging */
+		if (child_type == B_BACKEND)
+		{
+			TimestampTz fork_time = ((BackendStartupData *) startup_data)->fork_time;
+			TimestampTz cur_time = GetCurrentTimestamp();
+
+			conn_timing.fork_duration = TimestampDifferenceMicroseconds(fork_time,
+																		cur_time);
+		}
+
 		/* Close the postmaster's sockets */
 		ClosePostmasterPorts(child_type == B_LOGGER);
 
@@ -618,6 +632,17 @@ SubPostmasterMain(int argc, char *argv[])
 	/* Read in the variables file */
 	read_backend_variables(argv[2], &startup_data, &startup_data_len);
 
+	/* Calculate total fork duration in child backend for logging */
+	if (child_type == B_BACKEND)
+	{
+		TimestampTz fork_time = ((BackendStartupData *) startup_data)->fork_time;
+		TimestampTz cur_time = GetCurrentTimestamp();
+
+		conn_timing.fork_duration = TimestampDifferenceMicroseconds(fork_time,
+																	cur_time);
+	}
+
+
 	/* Close the postmaster's sockets (as soon as we know them) */
 	ClosePostmasterPorts(child_type == B_LOGGER);
 
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 5f615d0f605..b60f5d4c7d8 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1676,7 +1676,14 @@ ServerLoop(void)
 				ClientSocket s;
 
 				if (AcceptConnection(events[i].fd, &s) == STATUS_OK)
+				{
+					/*
+					 * Capture time that Postmaster got a socket from accept
+					 * (for logging connection establishment duration)
+					 */
+					s.creation_time = GetCurrentTimestamp();
 					BackendStartup(&s);
+				}
 
 				/* We no longer need the open socket in this process */
 				if (s.sock != PGINVALID_SOCKET)
@@ -3458,6 +3465,7 @@ BackendStartup(ClientSocket *client_sock)
 
 	/* Pass down canAcceptConnections state */
 	startup_data.canAcceptConnections = cac;
+	startup_data.fork_time = 0;
 	bn->rw = NULL;
 
 	/* Hasn't asked to be notified about any bgworkers yet */
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 5655348a2e2..cbd952f2405 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4151,6 +4151,7 @@ PostgresMain(const char *dbname, const char *username)
 	volatile bool send_ready_for_query = true;
 	volatile bool idle_in_transaction_timeout_enabled = false;
 	volatile bool idle_session_timeout_enabled = false;
+	bool		reported_first_ready_for_query = false;
 
 	Assert(dbname != NULL);
 	Assert(username != NULL);
@@ -4605,6 +4606,28 @@ PostgresMain(const char *dbname, const char *username)
 			/* Report any recently-changed GUC options */
 			ReportChangedGUCOptions();
 
+			/*
+			 * The first time this backend is ready for query, log the
+			 * durations of the different components of connection
+			 * establishment.
+			 */
+			if (!reported_first_ready_for_query &&
+				Log_connections && AmClientBackendProcess())
+			{
+				TimestampTz cur_time = GetCurrentTimestamp();
+				TimestampTz total_duration =
+					TimestampDifferenceMicroseconds(MyClientSocket->creation_time, cur_time);
+
+				ereport(LOG,
+						errmsg("backend ready for query. pid=%d. socket=%d. connection establishment times (ms): total: %ld, fork: %ld, authentication: %ld",
+							   MyProcPid,
+							   (int) MyClientSocket->sock,
+							   (long int) total_duration / 1000,
+							   (long int) conn_timing.fork_duration / 1000,
+							   (long int) conn_timing.auth_duration / 1000));
+
+				reported_first_ready_for_query = true;
+			}
 			ReadyForQuery(whereToSendOutput);
 			send_ready_for_query = false;
 		}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ba9bae05069..67827613bfc 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1769,6 +1769,19 @@ TimestampDifferenceMilliseconds(TimestampTz start_time, TimestampTz stop_time)
 		return (long) ((diff + 999) / 1000);
 }
 
+/*
+ * Returns the difference between two timestamps safely as an unsigned int64.
+ */
+uint64
+TimestampDifferenceMicroseconds(TimestampTz start_time, TimestampTz stop_time)
+{
+	/* If start_time is in the future or now, no time has elapsed */
+	if (start_time >= stop_time)
+		return 0;
+
+	return (uint64) stop_time - start_time;
+}
+
 /*
  * TimestampDifferenceExceeds -- report whether the difference between two
  *		timestamps is >= a threshold (expressed in milliseconds)
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index b844f9fdaef..3c7b14dd57d 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -43,6 +43,8 @@ volatile uint32 InterruptHoldoffCount = 0;
 volatile uint32 QueryCancelHoldoffCount = 0;
 volatile uint32 CritSectionCount = 0;
 
+ConnectionTiming conn_timing = {0};
+
 int			MyProcPid;
 pg_time_t	MyStartTime;
 TimestampTz MyStartTimestamp;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 01bb6a410cb..38aba9d3435 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -189,9 +189,15 @@ GetDatabaseTupleByOid(Oid dboid)
 static void
 PerformAuthentication(Port *port)
 {
+	TimestampTz auth_start_time = 0;
+
 	/* This should be set already, but let's make sure */
 	ClientAuthInProgress = true;	/* limit visibility of log messages */
 
+	/* Capture authentication start time for logging */
+	if (AmClientBackendProcess())
+		auth_start_time = GetCurrentTimestamp();
+
 	/*
 	 * In EXEC_BACKEND case, we didn't inherit the contents of pg_hba.conf
 	 * etcetera from the postmaster, and have to load them ourselves.
@@ -250,6 +256,11 @@ PerformAuthentication(Port *port)
 	 */
 	disable_timeout(STATEMENT_TIMEOUT, false);
 
+	/* Calculate authentication duration for logging */
+	if (AmClientBackendProcess())
+		conn_timing.auth_duration = TimestampDifferenceMicroseconds(auth_start_time,
+																	GetCurrentTimestamp());
+
 	if (Log_connections)
 	{
 		StringInfoData logmsg;
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 2f6c29200ba..3898735267c 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -252,6 +252,7 @@ typedef struct ClientSocket
 {
 	pgsocket	sock;			/* File descriptor */
 	SockAddr	raddr;			/* remote addr (client) */
+	TimestampTz creation_time;
 } ClientSocket;
 
 #ifdef USE_SSL
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index d016a9c9248..76dd27de26d 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -105,6 +105,8 @@ extern PGDLLIMPORT volatile uint32 InterruptHoldoffCount;
 extern PGDLLIMPORT volatile uint32 QueryCancelHoldoffCount;
 extern PGDLLIMPORT volatile uint32 CritSectionCount;
 
+extern PGDLLIMPORT struct ConnectionTiming conn_timing;
+
 /* in tcop/postgres.c */
 extern void ProcessInterrupts(void);
 
@@ -379,6 +381,7 @@ extern PGDLLIMPORT BackendType MyBackendType;
 #define AmRegularBackendProcess()	(MyBackendType == B_BACKEND)
 #define AmAutoVacuumLauncherProcess() (MyBackendType == B_AUTOVAC_LAUNCHER)
 #define AmAutoVacuumWorkerProcess()	(MyBackendType == B_AUTOVAC_WORKER)
+#define AmClientBackendProcess() (MyBackendType == B_BACKEND)
 #define AmBackgroundWorkerProcess() (MyBackendType == B_BG_WORKER)
 #define AmWalSenderProcess()        (MyBackendType == B_WAL_SENDER)
 #define AmLogicalSlotSyncWorkerProcess() (MyBackendType == B_SLOTSYNC_WORKER)
diff --git a/src/include/postmaster/postmaster.h b/src/include/postmaster/postmaster.h
index 188a06e2379..6826e1532c3 100644
--- a/src/include/postmaster/postmaster.h
+++ b/src/include/postmaster/postmaster.h
@@ -126,6 +126,12 @@ extern PMChild *AllocDeadEndChild(void);
 extern bool ReleasePostmasterChildSlot(PMChild *pmchild);
 extern PMChild *FindPostmasterChildByPid(int pid);
 
+typedef struct ConnectionTiming
+{
+	uint64		fork_duration;
+	uint64		auth_duration;
+} ConnectionTiming;
+
 /*
  * Note: MAX_BACKENDS is limited to 2^18-1 because that's the width reserved
  * for buffer references in buf_internals.h.  This limitation could be lifted
@@ -155,4 +161,5 @@ typedef enum DispatchOption
 
 extern DispatchOption parse_dispatch_option(const char *name);
 
+
 #endif							/* _POSTMASTER_H */
diff --git a/src/include/tcop/backend_startup.h b/src/include/tcop/backend_startup.h
index 01baf4aad75..b08521c2043 100644
--- a/src/include/tcop/backend_startup.h
+++ b/src/include/tcop/backend_startup.h
@@ -14,6 +14,8 @@
 #ifndef BACKEND_STARTUP_H
 #define BACKEND_STARTUP_H
 
+#include "datatype/timestamp.h"
+
 /* GUCs */
 extern PGDLLIMPORT bool Trace_connection_negotiation;
 
@@ -37,6 +39,7 @@ typedef enum CAC_state
 typedef struct BackendStartupData
 {
 	CAC_state	canAcceptConnections;
+	TimestampTz fork_time;
 } BackendStartupData;
 
 extern void BackendMain(char *startup_data, size_t startup_data_len) pg_attribute_noreturn();
diff --git a/src/include/utils/timestamp.h b/src/include/utils/timestamp.h
index d26f023fb87..90b64b45dc5 100644
--- a/src/include/utils/timestamp.h
+++ b/src/include/utils/timestamp.h
@@ -104,6 +104,9 @@ extern void TimestampDifference(TimestampTz start_time, TimestampTz stop_time,
 								long *secs, int *microsecs);
 extern long TimestampDifferenceMilliseconds(TimestampTz start_time,
 											TimestampTz stop_time);
+
+extern uint64 TimestampDifferenceMicroseconds(TimestampTz start_time,
+											  TimestampTz stop_time);
 extern bool TimestampDifferenceExceeds(TimestampTz start_time,
 									   TimestampTz stop_time,
 									   int msec);
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d5aa5c295ae..29fc1bba01a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -480,6 +480,7 @@ ConnParams
 ConnStatusType
 ConnType
 ConnectionStateEnum
+ConnectionTiming
 ConsiderSplitContext
 Const
 ConstrCheck
-- 
2.34.1

Reply via email to