On Wed, Oct 21, 2020 at 6:43 PM Magnus Hagander <mag...@hagander.net> wrote:
>
> I think it would be nicer to have a pg_terminate_backend(pid, wait=false), so 
> a function with a second parameter which defaults to the current behaviour of 
> not waiting. And it might be a good idea to also give it a timeout parameter?
>

Done.

>
>> 2. pg_wait_backend() -- which waits for a given backend process. Note that 
>> this function has to be used carefully after pg_terminate_backend(), if used 
>> on a backend that's not ternmited it simply keeps waiting in a loop.
>
> It seems this one also very much would need a timeout value.
>

Done.

>
> And surely we should show some sort of wait event when it's waiting.
>

Added two wait events.

>
>> If the backend is terminated within the user specified timeout then
>> the function returns true, otherwise false.
>
> I’m suggesting an option for the second case to fail instead of returning 
> false.
>

Done.

>
> > I could imagine, in theory at least, wanting to wait for a backend to go 
> > idle as well as for it disappearing.  Scope creep in terms of this patch's 
> > goal but worth at least considering now.
>
> IIUC, do we need a new option, something like pg_wait_backend(pid,
> timeout, waituntil) where "waituntil" if specified "idle" waits until
> the given backend goes to idle mode, or "termination" waits until
> termination?
>

Done.

Attaching a v2 patch herewith.

Thoughts and feedback are welcome.

Below things are still pending, which I plan to work on soon:

1. More testing and addition of test cases into the regression test suite.
2. Addition of the new function information into the docs.


With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
From da1d6d54d877689dac4c2cc5338e15b73e0c6979 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Wed, 28 Oct 2020 06:56:16 +0530
Subject: [PATCH v2] pg_wait_backend() and pg_terminate_backend() with wait and
 timeout options

---
 src/backend/postmaster/pgstat.c       | 175 ++++++++++++++++----------
 src/backend/storage/ipc/signalfuncs.c | 164 ++++++++++++++++++++++++
 src/include/catalog/pg_proc.dat       |   9 ++
 src/include/pgstat.h                  |   6 +-
 4 files changed, 290 insertions(+), 64 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 822f0ebc62..c077f3c072 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -3778,6 +3778,89 @@ pgstat_get_wait_event(uint32 wait_event_info)
 	return event_name;
 }
 
+/* ----------
+ * pgstat_get_backend_status_entry() -
+ *
+ * Return the entry of the backend with the given PID from the backend status
+ * array. This looks directly at the BackendStatusArray, and so will provide
+ * current information regardless of the age of our transaction's snapshot of
+ * the status array.
+ * ----------
+ */
+bool
+pgstat_get_backend_status_entry(int pid, PgBackendStatus *bestatusentry)
+{
+	PgBackendStatus *beentry;
+	int			i;
+
+	beentry = BackendStatusArray;
+	for (i = 1; i <= MaxBackends; i++)
+	{
+		/*
+		 * Although we expect the target backend's entry to be stable, that
+		 * doesn't imply that anyone else's is.  To avoid identifying the
+		 * wrong backend, while we check for a match to the desired PID we
+		 * must follow the protocol of retrying if st_changecount changes
+		 * while we examine the entry, or if it's odd.  (This might be
+		 * unnecessary, since fetching or storing an int is almost certainly
+		 * atomic, but let's play it safe.)  We use a volatile pointer here to
+		 * ensure the compiler doesn't try to get cute.
+		 */
+		volatile PgBackendStatus *vbeentry = beentry;
+		bool		found;
+
+		for (;;)
+		{
+			int			before_changecount;
+			int			after_changecount;
+
+			pgstat_begin_read_activity(vbeentry, before_changecount);
+
+			found = (vbeentry->st_procpid == pid);
+
+			pgstat_end_read_activity(vbeentry, after_changecount);
+
+			if (pgstat_read_activity_complete(before_changecount,
+											  after_changecount))
+				break;
+
+			/* Make sure we can break out of loop if stuck... */
+			CHECK_FOR_INTERRUPTS();
+		}
+
+		if (found)
+		{
+			/* Now it is safe to use the non-volatile pointer */
+			bestatusentry = beentry;
+			return true;
+		}
+
+		beentry++;
+	}
+
+	bestatusentry = NULL;
+	return false;
+}
+
+/* ----------
+ * pgstat_get_backend_state() -
+ *
+ * Return a pointer to the state of the backend with the specified PID.
+ * ----------
+ */
+BackendState *
+pgstat_get_backend_state(int pid)
+{
+	PgBackendStatus *beentry = NULL;
+	bool found = pgstat_get_backend_status_entry(pid, beentry);
+
+	if(found && beentry != NULL)
+		return &beentry->st_state;
+
+	/* PID not found */
+	return NULL;
+}
+
 /* ----------
  * pgstat_get_wait_activity() -
  *
@@ -4021,6 +4104,12 @@ pgstat_get_wait_ipc(WaitEventIPC w)
 		case WAIT_EVENT_XACT_GROUP_UPDATE:
 			event_name = "XactGroupUpdate";
 			break;
+		case WAIT_EVENT_BACKEND_TERMINATION:
+			event_name = "BackendTermination";
+			break;
+		case WAIT_EVENT_BACKEND_IDLE:
+			event_name = "BackendIdle";
+			break;
 			/* no default case, so that compiler will warn */
 	}
 
@@ -4307,78 +4396,38 @@ pgstat_get_wait_io(WaitEventIO w)
 /* ----------
  * pgstat_get_backend_current_activity() -
  *
- *	Return a string representing the current activity of the backend with
- *	the specified PID.  This looks directly at the BackendStatusArray,
- *	and so will provide current information regardless of the age of our
- *	transaction's snapshot of the status array.
- *
- *	It is the caller's responsibility to invoke this only for backends whose
- *	state is expected to remain stable while the result is in use.  The
- *	only current use is in deadlock reporting, where we can expect that
- *	the target backend is blocked on a lock.  (There are corner cases
- *	where the target's wait could get aborted while we are looking at it,
- *	but the very worst consequence is to return a pointer to a string
- *	that's been changed, so we won't worry too much.)
- *
- *	Note: return strings for special cases match pg_stat_get_backend_activity.
+ * Return a string representing the current activity of the backend with
+ * the specified PID.
+ *
+ * It is the caller's responsibility to invoke this only for backends whose
+ * state is expected to remain stable while the result is in use.  The only
+ * current use is in deadlock reporting, where we can expect that the target
+ * backend is blocked on a lock.  (There are corner cases where the target's
+ * wait could get aborted while we are looking at it, but the very worst
+ * consequence is to return a pointer to a string that's been changed, so we
+ * won't worry too much.)
+ *
+ * Note: return strings for special cases match pg_stat_get_backend_activity.
  * ----------
  */
 const char *
 pgstat_get_backend_current_activity(int pid, bool checkUser)
 {
-	PgBackendStatus *beentry;
-	int			i;
+	PgBackendStatus *beentry = NULL;
+	bool found = pgstat_get_backend_status_entry(pid, beentry);
 
-	beentry = BackendStatusArray;
-	for (i = 1; i <= MaxBackends; i++)
+	if (found && beentry != NULL)
 	{
-		/*
-		 * Although we expect the target backend's entry to be stable, that
-		 * doesn't imply that anyone else's is.  To avoid identifying the
-		 * wrong backend, while we check for a match to the desired PID we
-		 * must follow the protocol of retrying if st_changecount changes
-		 * while we examine the entry, or if it's odd.  (This might be
-		 * unnecessary, since fetching or storing an int is almost certainly
-		 * atomic, but let's play it safe.)  We use a volatile pointer here to
-		 * ensure the compiler doesn't try to get cute.
-		 */
-		volatile PgBackendStatus *vbeentry = beentry;
-		bool		found;
-
-		for (;;)
-		{
-			int			before_changecount;
-			int			after_changecount;
-
-			pgstat_begin_read_activity(vbeentry, before_changecount);
-
-			found = (vbeentry->st_procpid == pid);
-
-			pgstat_end_read_activity(vbeentry, after_changecount);
-
-			if (pgstat_read_activity_complete(before_changecount,
-											  after_changecount))
-				break;
-
-			/* Make sure we can break out of loop if stuck... */
-			CHECK_FOR_INTERRUPTS();
-		}
-
-		if (found)
+		/* Now it is safe to use the non-volatile pointer */
+		if (checkUser && !superuser() && beentry->st_userid != GetUserId())
+			return "<insufficient privilege>";
+		else if (*(beentry->st_activity_raw) == '\0')
+			return "<command string not enabled>";
+		else
 		{
-			/* Now it is safe to use the non-volatile pointer */
-			if (checkUser && !superuser() && beentry->st_userid != GetUserId())
-				return "<insufficient privilege>";
-			else if (*(beentry->st_activity_raw) == '\0')
-				return "<command string not enabled>";
-			else
-			{
-				/* this'll leak a bit of memory, but that seems acceptable */
-				return pgstat_clip_activity(beentry->st_activity_raw);
-			}
+			/* this'll leak a bit of memory, but that seems acceptable */
+			return pgstat_clip_activity(beentry->st_activity_raw);
 		}
-
-		beentry++;
 	}
 
 	/* If we get here, caller is in error ... */
diff --git a/src/backend/storage/ipc/signalfuncs.c b/src/backend/storage/ipc/signalfuncs.c
index d822e82cb9..8a9b8ef254 100644
--- a/src/backend/storage/ipc/signalfuncs.c
+++ b/src/backend/storage/ipc/signalfuncs.c
@@ -17,13 +17,16 @@
 #include <signal.h>
 
 #include "catalog/pg_authid.h"
+#include "catalog/pg_collation.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "postmaster/syslogger.h"
 #include "storage/pmsignal.h"
 #include "storage/proc.h"
 #include "storage/procarray.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/formatting.h"
 
 
 /*
@@ -44,6 +47,18 @@
 #define SIGNAL_BACKEND_ERROR 1
 #define SIGNAL_BACKEND_NOPERMISSION 2
 #define SIGNAL_BACKEND_NOSUPERUSER 3
+
+/* ----------
+ * The types of backend wait mode
+ * ----------
+ */
+typedef enum BackendWaitMode
+{
+	PGSTAT_BACKEND_UNDEFINED,
+	PGSTAT_BACKEND_TERMINATE,
+	PGSTAT_BACKEND_IDLE
+} BackendWaitMode;
+
 static int
 pg_signal_backend(int pid, int sig)
 {
@@ -149,6 +164,155 @@ pg_terminate_backend(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(r == SIGNAL_BACKEND_SUCCESS);
 }
 
+/*
+ * Wait until there is no backend process with given PID or wait until timeout.
+ * On timeout, an error is thrown to the user.
+ */
+static void
+pg_wait_until_termination_or_idle(int pid,
+								  float8 timeout,
+								  BackendWaitMode mode)
+{
+	float8	sleeptime = 50000;
+	float8	totalsleeptime = 0;
+	float8	remainingtime = 0;
+
+	if (timeout < 0)
+		ereport(ERROR,
+				(errmsg("timeout cannot be negative: %f", timeout)));
+
+	remainingtime = timeout * 1000L;
+
+	if (mode == PGSTAT_BACKEND_TERMINATE)
+	{
+		while (kill(pid, 0) == 0)
+		{
+			if (remainingtime < sleeptime)
+				sleeptime = remainingtime;
+
+			pgstat_report_wait_start(WAIT_EVENT_BACKEND_TERMINATION);
+			pg_usleep(sleeptime);
+			pgstat_report_wait_end();
+
+			totalsleeptime += sleeptime;
+
+			if (totalsleeptime >= timeout * 1000L)
+				ereport(ERROR,
+						(errmsg("could not check PID %d liveness due to timeout",
+								pid)));
+
+			remainingtime = timeout * 1000L - totalsleeptime;
+		}
+
+		if (errno != ESRCH)
+			ereport(ERROR,
+					(errmsg("could not check PID %d liveness: %m", pid)));
+	}
+	else if (mode == PGSTAT_BACKEND_IDLE)
+	{
+		BackendState *state;
+
+		while (1)
+		{
+			state = pgstat_get_backend_state(pid);
+
+			if (state != NULL &&
+				*state == STATE_IDLE)
+				break;
+
+			if (remainingtime < sleeptime)
+				sleeptime = remainingtime;
+
+			pgstat_report_wait_start(WAIT_EVENT_BACKEND_IDLE);
+			pg_usleep(sleeptime);
+			pgstat_report_wait_end();
+
+			totalsleeptime += sleeptime;
+
+			if (totalsleeptime >= timeout * 1000L)
+				ereport(ERROR,
+						(errmsg("could not wait until backend with PID %d becomes idle due to timeout",
+								pid)));
+
+			remainingtime = timeout * 1000L - totalsleeptime;
+		}
+	}
+}
+
+/*
+ * Signal to terminate a backend process and wait until timeout or no process
+ * has a given PID. If the wait input argument is false, this function
+ * behaviour is same as pg_terminate_backend. On timeout, an error is thrown to
+ * the user. Self termination is allowed but waiting is not. Because once the
+ * backend is self terminated there is no point in waiting for it to go away.
+ */
+Datum
+pg_terminate_backend_and_wait(PG_FUNCTION_ARGS)
+{
+	int 	pid = PG_GETARG_DATUM(0);
+	bool	wait = PG_GETARG_BOOL(1);
+	bool	r;
+
+	/* Self termination is allowed but waiting is not. */
+	if (pid == MyProcPid && wait)
+		wait = false;
+
+	r = DatumGetBool(DirectFunctionCall1(pg_terminate_backend, pid));
+
+	if (wait && r)
+		pg_wait_until_termination_or_idle(pid,
+										  PG_GETARG_FLOAT8(2),
+										  PGSTAT_BACKEND_TERMINATE);
+
+	PG_RETURN_BOOL(r);
+}
+
+/*
+ * This function waits for a backend with given PID to become idle or be
+ * terminated or it gets timed out. If the PID given is of the current backend
+ * that issued this function, then, in wait until idle mode mostly it gets
+ * timed out because the backend never reaches idle state. In wait until
+ * termination mode, either timeout can occur or the function can succeed. It
+ * depends on whether the backend itself got teriminated or someone else
+ * has terminated the backend.
+ */
+Datum
+pg_wait_backend(PG_FUNCTION_ARGS)
+{
+	int				pid = PG_GETARG_INT32(0);
+	PGPROC	   		*proc = BackendPidGetProc(pid);
+	char	   		*wait_until = NULL;
+	char 	   		*wait_until_lower = NULL;
+	BackendWaitMode	mode = PGSTAT_BACKEND_UNDEFINED;
+
+	if (proc == NULL)
+	{
+		ereport(WARNING,
+				(errmsg("PID %d is not a PostgreSQL server process",
+				 pid)));
+		PG_RETURN_BOOL(false);
+	}
+
+	wait_until = text_to_cstring(PG_GETARG_TEXT_PP(1));
+	wait_until_lower = str_tolower(wait_until, strlen(wait_until),
+								 DEFAULT_COLLATION_OID);
+
+	if (strcmp(wait_until_lower, "termination") == 0)
+		mode = PGSTAT_BACKEND_TERMINATE;
+	else if(strcmp(wait_until_lower, "idle") == 0)
+		mode = PGSTAT_BACKEND_IDLE;
+	else
+		ereport(ERROR,
+				(errmsg("specified wait until parameter \"%s\" is invalid",
+						 wait_until)));
+
+	pg_wait_until_termination_or_idle(pid, PG_GETARG_FLOAT8(2), mode);
+
+	pfree(wait_until_lower);
+
+	PG_RETURN_BOOL(true);
+}
+
 /*
  * Signal to reload the database configuration
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bbcac69d48..e04d6b0a9e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6085,6 +6085,15 @@
 { oid => '2096', descr => 'terminate a server process',
   proname => 'pg_terminate_backend', provolatile => 'v', prorettype => 'bool',
   proargtypes => 'int4', prosrc => 'pg_terminate_backend' },
+{ oid => '16386', descr => 'terminate a server process and wait for it\'s exit or until timeout occurs',
+  proname => 'pg_terminate_backend', provolatile => 'v',
+  prorettype => 'bool', proargtypes => 'int4 bool float8',
+  proargnames => '{pid,wait,timeout}',
+  prosrc => 'pg_terminate_backend_and_wait' },
+{ oid => '16387', descr => 'waits for a backend\'s exit or until the backend goes to idle mode or timeout occurs',
+  proname => 'pg_wait_backend', provolatile => 'v', prorettype => 'bool',
+  proargtypes => 'int4 text float8', proargnames => '{pid,wait_until,timeout}',
+  prosrc => 'pg_wait_backend' },
 { oid => '2172', descr => 'prepare for taking an online backup',
   proname => 'pg_start_backup', provolatile => 'v', proparallel => 'r',
   prorettype => 'pg_lsn', proargtypes => 'text bool bool',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index a821ff4f15..fc44aa6173 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -952,7 +952,9 @@ typedef enum
 	WAIT_EVENT_REPLICATION_SLOT_DROP,
 	WAIT_EVENT_SAFE_SNAPSHOT,
 	WAIT_EVENT_SYNC_REP,
-	WAIT_EVENT_XACT_GROUP_UPDATE
+	WAIT_EVENT_XACT_GROUP_UPDATE,
+	WAIT_EVENT_BACKEND_TERMINATION,
+	WAIT_EVENT_BACKEND_IDLE
 } WaitEventIPC;
 
 /* ----------
@@ -1399,6 +1401,8 @@ extern void pgstat_report_appname(const char *appname);
 extern void pgstat_report_xact_timestamp(TimestampTz tstamp);
 extern const char *pgstat_get_wait_event(uint32 wait_event_info);
 extern const char *pgstat_get_wait_event_type(uint32 wait_event_info);
+extern bool pgstat_get_backend_status_entry(int pid, PgBackendStatus *bestatusentry);
+extern BackendState *pgstat_get_backend_state(int pid);
 extern const char *pgstat_get_backend_current_activity(int pid, bool checkUser);
 extern const char *pgstat_get_crashed_backend_activity(int pid, char *buffer,
 													   int buflen);
-- 
2.25.1

Reply via email to