On Thu, May 5, 2022 at 2:57 PM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Fri, Apr 29, 2022 at 4:02 PM Alvaro Herrera <alvhe...@alvh.no-ip.org> 
> wrote:
> >
> > If I read the code right, this patch emits logs when
> > pg_logical_slot_get_changes and pg_replication_slot_advance SQL
> > functions are called.  Is this desirable/useful, for the use case you
> > stated at start of thread?  I think it is most likely pointless.  If you
> > get rid of those, then the only acquisitions that would log messages are
> > those in StartReplication and StartLogicalReplication.  So I wonder if
> > it would be better to leave the API of ReplicationSlotAcquire() alone,
> > and instead make StartReplication and StartLogicalReplication
> > responsible for those messages.
> >
> > I didn't look at the release-side messages you're adding, but I suppose
> > it should be symmetrical.

Here's the v6 patch, a much simpler one - no changes to any of the
existing function APIs. Please see the sample logs at [1]. There's a
bit of duplicate code in the v6 patch, if the overall approach looks
okay, I can remove that too in the next version of the patch.

Thoughts?

[1]
2022-07-25 12:30:14.847 UTC [152873] LOG:  acquired physical
replication slot "foo"
2022-07-25 12:30:20.878 UTC [152873] LOG:  released physical
replication slot "foo"

2022-07-25 12:49:18.023 UTC [168738] LOG:  acquired logical
replication slot "bar"
2022-07-25 12:49:28.105 UTC [168738] LOG:  released logical
replication slot "bar"

Regards,
Bharath Rupireddy.
From c57cfd8dd27eaf9e0c17c47769cb24b8ab961116 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Mon, 25 Jul 2022 12:51:26 +0000
Subject: [PATCH v6] Add LOG messages when replication slots become active and
 inactive

These logs will be extremely useful on production servers to debug
and analyze inactive replication slot issues.
---
 doc/src/sgml/config.sgml            |  7 +++---
 src/backend/replication/slot.c      | 21 ++++++++++++++++
 src/backend/replication/walsender.c | 39 +++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e2d728e0c4..8bf3c2ab53 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7461,9 +7461,10 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
       </term>
       <listitem>
        <para>
-        Causes each replication command to be logged in the server log.
-        See <xref linkend="protocol-replication"/> for more information about
-        replication command. The default value is <literal>off</literal>.
+        Causes each replication command and related activity to be logged in
+        the server log. See <xref linkend="protocol-replication"/> for more
+        information about replication command. The default value is
+        <literal>off</literal>.
         Only superusers and users with the appropriate <literal>SET</literal>
         privilege can change this setting.
        </para>
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index 850b74936f..43563bec7e 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -179,8 +179,29 @@ ReplicationSlotShmemExit(int code, Datum arg)
 {
 	/* Make sure active replication slots are released */
 	if (MyReplicationSlot != NULL)
+	{
+		bool is_physical;
+		char slotname[NAMEDATALEN] = {0};
+
+		is_physical = SlotIsPhysical(MyReplicationSlot);
+		strcpy(slotname, NameStr(MyReplicationSlot->data.name));
+
 		ReplicationSlotRelease();
 
+		if (is_physical)
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+					(errmsg("released physical replication slot \"%s\"",
+							slotname)));
+		}
+		else
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+					(errmsg("released logical replication slot \"%s\"",
+							slotname)));
+		}
+	}
+
 	/* Also cleanup all the temporary slots. */
 	ReplicationSlotCleanup();
 }
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 3a86786cc3..ebd34316e2 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -322,8 +322,29 @@ WalSndErrorCleanup(void)
 		wal_segment_close(xlogreader);
 
 	if (MyReplicationSlot != NULL)
+	{
+		bool is_physical;
+		char slotname[NAMEDATALEN] = {0};
+
+		is_physical = SlotIsPhysical(MyReplicationSlot);
+		strcpy(slotname, NameStr(MyReplicationSlot->data.name));
+
 		ReplicationSlotRelease();
 
+		if (is_physical)
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+					(errmsg("released physical replication slot \"%s\"",
+							slotname)));
+		}
+		else
+		{
+			ereport(log_replication_commands ? LOG : DEBUG3,
+					(errmsg("released logical replication slot \"%s\"",
+							slotname)));
+		}
+	}
+
 	ReplicationSlotCleanup();
 
 	replication_active = false;
@@ -704,6 +725,10 @@ StartReplication(StartReplicationCmd *cmd)
 					(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 					 errmsg("cannot use a logical replication slot for physical replication")));
 
+		ereport(log_replication_commands ? LOG : DEBUG3,
+				(errmsg("acquired physical replication slot \"%s\"",
+						cmd->slotname)));
+
 		/*
 		 * We don't need to verify the slot's restart_lsn here; instead we
 		 * rely on the caller requesting the starting point to use.  If the
@@ -844,8 +869,14 @@ StartReplication(StartReplicationCmd *cmd)
 	}
 
 	if (cmd->slotname)
+	{
 		ReplicationSlotRelease();
 
+		ereport(log_replication_commands ? LOG : DEBUG3,
+				(errmsg("released physical replication slot \"%s\"",
+						cmd->slotname)));
+	}
+
 	/*
 	 * Copy is finished now. Send a single-row result set indicating the next
 	 * timeline.
@@ -1256,6 +1287,10 @@ StartLogicalReplication(StartReplicationCmd *cmd)
 						cmd->slotname),
 				 errdetail("This slot has been invalidated because it exceeded the maximum reserved size.")));
 
+	ereport(log_replication_commands ? LOG : DEBUG3,
+			(errmsg("acquired logical replication slot \"%s\"",
+					cmd->slotname)));
+
 	/*
 	 * Force a disconnect, so that the decoding code doesn't need to care
 	 * about an eventual switch from running in recovery, to running in a
@@ -1318,6 +1353,10 @@ StartLogicalReplication(StartReplicationCmd *cmd)
 	FreeDecodingContext(logical_decoding_ctx);
 	ReplicationSlotRelease();
 
+	ereport(log_replication_commands ? LOG : DEBUG3,
+			(errmsg("released logical replication slot \"%s\"",
+					cmd->slotname)));
+
 	replication_active = false;
 	if (got_STOPPING)
 		proc_exit(0);
-- 
2.34.1

Reply via email to