Hi,

On Fri, Sep 08, 2017 at 10:30:20AM -0700, Jeff Janes wrote:
> On Wed, Sep 6, 2017 at 9:22 AM, Peter Eisentraut <
> peter.eisentr...@2ndquadrant.com> wrote:
> 
> > On 8/18/17 05:28, Michael Banck wrote:
> > >>> Rebased, squashed and slighly edited version attached. I've added this
> > >>> to the 2017-07 commitfest now as well:
> > >>>
> > >>> https://commitfest.postgresql.org/14/1112/
> > >> Can you rebase this past some conflicting changes?
> > > Thanks for letting me know, PFA a rebased version.
> >
> > I have reviewed the thread so far.  I think there is general agreement
> > that something like this would be good to have.
> >
> > I have not found any explanation, however, why the "if not exists"
> > behavior is desirable, let alone as the default.  I can only think of
> > two workflows here:  Either you have scripts for previous PG versions
> > that create the slot externally, in which can you omit --create, or you
> > use the new functionality to have pg_basebackup create the slot.  I
> > don't see any use for pg_basebackup to opportunistically use a slot if
> > it happens to exist.  Even if there is one, it should not be the
> > default.  So please change that.
> 
> +1.  I'd rather just get an error and re-run without the --create switch.
> That way you are forced to think about what you are doing.

OK.
 
> Is there a race condition here?  The slot is created after the checkpoint
> is completed.  But you have to start streaming from the LSN where the
> checkpoint started, so shouldn't the slot be created before the checkpoint
> is started?

So my patch only moves the slot creation slightly further forward,
AFAICT.

AIUI, wal streaming always begins at last checkpoint and from my tests
the restart_lsn of the created replication slot is also before that
checkpoint's lsn. However, I hope somebody more familiar with the
WAL/replication slot code could comment on that.  What I dropped in the
refactoring is the RESERVE_WAL that used to be there when the temporary
slot gets created, I have readded that now.

I also added a TAP test case that tries to check that the restart_lsn is
lower than the checkpoint_lsn, which appears to be the case.

If there is still a race condition here, do you have a suggestion in how
to try to trigger it?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
>From fca622f6f7c6ef9e566c5a1b72044b824e7e02cc Mon Sep 17 00:00:00 2001
From: Michael Banck <michael.ba...@credativ.de>
Date: Sun, 10 Sep 2017 18:07:33 +0200
Subject: [PATCH] Add option to create a replication slot in pg_basebackup if
 not yet present.

When requesting a particular replication slot, the new pg_basebackup option
--create-slot tries to create it before starting to replicate from it unless
it exists already.

In order to allow reporting of temporary replication slot creation in --verbose
mode, further refactor the slot creation logic. Add new argument `is_temporary'
to CreateReplicationSlot() in streamutil.c which specifies whether a physical
replication slot is temporary or not. Update all callers. Move the creation of
the temporary replication slot for pg_basebackup from receivelog.c to
pg_basebackup.c and mention it in --verbose mode. At the same time, also create
the temporary slot via CreateReplicationSlot() instead of creating the
temporary slot with an explicit SQL command.
---
 doc/src/sgml/ref/pg_basebackup.sgml          | 12 ++++++
 src/bin/pg_basebackup/pg_basebackup.c        | 56 ++++++++++++++++++++++++++--
 src/bin/pg_basebackup/pg_receivewal.c        |  2 +-
 src/bin/pg_basebackup/pg_recvlogical.c       |  2 +-
 src/bin/pg_basebackup/receivelog.c           | 18 ---------
 src/bin/pg_basebackup/streamutil.c           | 15 +++++---
 src/bin/pg_basebackup/streamutil.h           |  2 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 34 +++++++++++++++--
 8 files changed, 107 insertions(+), 34 deletions(-)

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 2454d35af3..57086295d3 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -269,6 +269,18 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-C</option></term>
+      <term><option>--create-slot</option></term>
+      <listitem>
+       <para>
+        This option can only be used together with the <literal>--slot</literal>
+        option.  It causes the specified slot name to be created unless it
+        exists already.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>-T <replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
       <term><option>--tablespace-mapping=<replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
       <listitem>
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 51509d150e..35ea707384 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -92,6 +92,8 @@ static pg_time_t last_progress_report = 0;
 static int32 maxrate = 0;		/* no limit by default */
 static char *replication_slot = NULL;
 static bool temp_replication_slot = true;
+static bool create_slot = false;
+static bool no_slot = false;
 
 static bool success = false;
 static bool made_new_pgdata = false;
@@ -337,6 +339,7 @@ usage(void)
 			 "                         write recovery.conf for replication\n"));
 	printf(_("  -S, --slot=SLOTNAME    replication slot to use\n"));
 	printf(_("      --no-slot          prevent creation of temporary replication slot\n"));
+	printf(_("  -C, --create-slot      create replication slot if not present already\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 			 "                         relocate tablespace in OLDDIR to NEWDIR\n"));
 	printf(_("  -X, --wal-method=none|fetch|stream\n"
@@ -492,8 +495,6 @@ LogStreamerMain(logstreamer_param *param)
 	stream.partial_suffix = NULL;
 	stream.replication_slot = replication_slot;
 	stream.temp_slot = param->temp_slot;
-	if (stream.temp_slot && !stream.replication_slot)
-		stream.replication_slot = psprintf("pg_basebackup_%d", (int) PQbackendPID(param->bgconn));
 
 	if (format == 'p')
 		stream.walmethod = CreateWalDirectoryMethod(param->xlog, 0, do_sync);
@@ -586,6 +587,27 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
 	else
 		param->temp_slot = temp_replication_slot;
 
+	/*
+	 * Create replication slot if one is needed.
+	 */
+	if (!no_slot && (temp_replication_slot || create_slot))
+	{
+		if (!replication_slot)
+			replication_slot = psprintf("pg_basebackup_%d", (int) PQbackendPID(param->bgconn));
+
+		if (!CreateReplicationSlot(param->bgconn, replication_slot, NULL, true,
+								   temp_replication_slot, false))
+			disconnect_and_exit(1);
+
+		if (verbose)
+			if (temp_replication_slot)
+				fprintf(stderr, _("%s: temporary replication slot \"%s\" created\n"),
+						progname, replication_slot);
+			else
+				fprintf(stderr, _("%s: replication slot \"%s\" created\n"),
+						progname, replication_slot);
+	}
+
 	if (format == 'p')
 	{
 		/*
@@ -2099,12 +2121,12 @@ main(int argc, char **argv)
 		{"progress", no_argument, NULL, 'P'},
 		{"waldir", required_argument, NULL, 1},
 		{"no-slot", no_argument, NULL, 2},
+		{"create-slot", no_argument, NULL, 'C'},
 		{NULL, 0, NULL, 0}
 	};
 	int			c;
 
 	int			option_index;
-	bool		no_slot = false;
 
 	progname = get_progname(argv[0]);
 	set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_basebackup"));
@@ -2126,7 +2148,7 @@ main(int argc, char **argv)
 
 	atexit(cleanup_directories_atexit);
 
-	while ((c = getopt_long(argc, argv, "D:F:r:RT:X:l:nNzZ:d:c:h:p:U:s:S:wWvP",
+	while ((c = getopt_long(argc, argv, "D:F:r:RS:CT:X:l:nNzZ:d:c:h:p:U:s:wWvP",
 							long_options, &option_index)) != -1)
 	{
 		switch (c)
@@ -2162,6 +2184,9 @@ main(int argc, char **argv)
 				replication_slot = pg_strdup(optarg);
 				temp_replication_slot = false;
 				break;
+			case 'C':
+				create_slot = true;
+				break;
 			case 2:
 				no_slot = true;
 				break;
@@ -2347,6 +2372,29 @@ main(int argc, char **argv)
 		temp_replication_slot = false;
 	}
 
+	if (create_slot)
+	{
+		if (!replication_slot)
+		{
+			fprintf(stderr,
+					_("%s: creation of replication slots requires a slot name\n"),
+					progname);
+			fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+					progname);
+			exit(1);
+		}
+
+		if (no_slot)
+		{
+			fprintf(stderr,
+					_("%s: creation of replication slots requires a slot, but --no-slot was requested\n"),
+					progname);
+			fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+					progname);
+			exit(1);
+		}
+	}
+
 	if (xlog_dir)
 	{
 		if (format != 'p')
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 4a1a5658fb..8dfe7fd549 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -700,7 +700,7 @@ main(int argc, char **argv)
 					_("%s: creating replication slot \"%s\"\n"),
 					progname, replication_slot);
 
-		if (!CreateReplicationSlot(conn, replication_slot, NULL, true,
+		if (!CreateReplicationSlot(conn, replication_slot, NULL, true, false,
 								   slot_exists_ok))
 			disconnect_and_exit(1);
 		disconnect_and_exit(0);
diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index 6811a55e76..6499f58e3c 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -979,7 +979,7 @@ main(int argc, char **argv)
 					_("%s: creating replication slot \"%s\"\n"),
 					progname, replication_slot);
 
-		if (!CreateReplicationSlot(conn, replication_slot, plugin,
+		if (!CreateReplicationSlot(conn, replication_slot, plugin, false,
 								   false, slot_exists_ok))
 			disconnect_and_exit(1);
 		startpos = InvalidXLogRecPtr;
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 888458f4a9..1f1e5e1268 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -522,24 +522,6 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream)
 	}
 
 	/*
-	 * Create temporary replication slot if one is needed
-	 */
-	if (stream->temp_slot)
-	{
-		snprintf(query, sizeof(query),
-				 "CREATE_REPLICATION_SLOT \"%s\" TEMPORARY PHYSICAL RESERVE_WAL",
-				 stream->replication_slot);
-		res = PQexec(conn, query);
-		if (PQresultStatus(res) != PGRES_TUPLES_OK)
-		{
-			fprintf(stderr, _("%s: could not create temporary replication slot \"%s\": %s"),
-					progname, stream->replication_slot, PQerrorMessage(conn));
-			PQclear(res);
-			return false;
-		}
-	}
-
-	/*
 	 * initialize flush position to starting point, it's the caller's
 	 * responsibility that that's sane.
 	 */
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 9d40744a34..be3a9786fd 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -322,7 +322,7 @@ RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli,
  */
 bool
 CreateReplicationSlot(PGconn *conn, const char *slot_name, const char *plugin,
-					  bool is_physical, bool slot_exists_ok)
+					  bool is_physical, bool is_temporary, bool slot_exists_ok)
 {
 	PQExpBuffer query;
 	PGresult   *res;
@@ -334,13 +334,18 @@ CreateReplicationSlot(PGconn *conn, const char *slot_name, const char *plugin,
 	Assert(slot_name != NULL);
 
 	/* Build query */
+	appendPQExpBuffer(query, "CREATE_REPLICATION_SLOT \"%s\"", slot_name);
 	if (is_physical)
-		appendPQExpBuffer(query, "CREATE_REPLICATION_SLOT \"%s\" PHYSICAL",
-						  slot_name);
+	{
+		if (is_temporary)
+			appendPQExpBuffer(query, " TEMPORARY");
+		appendPQExpBuffer(query, " PHYSICAL RESERVE_WAL");
+	}
 	else
 	{
-		appendPQExpBuffer(query, "CREATE_REPLICATION_SLOT \"%s\" LOGICAL \"%s\"",
-						  slot_name, plugin);
+		if (is_temporary)
+			appendPQExpBuffer(query, " TEMPORARY");
+		appendPQExpBuffer(query, " LOGICAL \"%s\"", plugin);
 		if (PQserverVersion(conn) >= 100000)
 			/* pg_recvlogical doesn't use an exported snapshot, so suppress */
 			appendPQExpBuffer(query, " NOEXPORT_SNAPSHOT");
diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h
index 6f6878679f..1dffb55b91 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -32,7 +32,7 @@ extern PGconn *GetConnection(void);
 
 /* Replication commands */
 extern bool CreateReplicationSlot(PGconn *conn, const char *slot_name,
-					  const char *plugin, bool is_physical,
+					  const char *plugin, bool is_physical, bool is_temporary,
 					  bool slot_exists_ok);
 extern bool DropReplicationSlot(PGconn *conn, const char *slot_name);
 extern bool RunIdentifySystem(PGconn *conn, char **sysid,
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index a00f7b0e1a..e287645745 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -4,7 +4,7 @@ use Cwd;
 use Config;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 72;
+use Test::More tests => 78;
 
 program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
@@ -256,18 +256,44 @@ $node->command_ok(
 	'pg_basebackup -X stream runs with --no-slot');
 
 $node->command_fails(
-	[ 'pg_basebackup', '-D', "$tempdir/fail", '-S', 'slot1' ],
+	[ 'pg_basebackup', '-D', "$tempdir/fail", '-X', 'none', '-S', 'slot0' ],
 	'pg_basebackup with replication slot fails without -X stream');
 $node->command_fails(
 	[   'pg_basebackup',             '-D',
 		"$tempdir/backupxs_sl_fail", '-X',
 		'stream',                    '-S',
-		'slot1' ],
+		'slot0' ],
 	'pg_basebackup fails with nonexistent replication slot');
 
+$node->command_fails(
+	[   'pg_basebackup', '-D', "$tempdir/backupxs_slot", '-C' ],
+	'pg_basebackup -C fails without slot name');
+
+$node->command_fails(
+	[   'pg_basebackup', '-D', "$tempdir/backupxs_slot", '-C', '-S', 'slot0', '--no-slot' ],
+	'pg_basebackup fails with -C -S --no-slot');
+
+$node->command_ok(
+	[   'pg_basebackup', '-D', "$tempdir/backupxs_slot", '-C', '-S', 'slot0' ],
+	'pg_basebackup -C -S creates previously nonexistent replication slot');
+
+my $lsn = $node->safe_psql('postgres',
+	q{SELECT restart_lsn FROM pg_replication_slots WHERE slot_name = 'slot0'}
+);
+like($lsn, qr!^0/[0-9A-F]{7,8}$!, 'replication slot is present');
+
+my $checkpoint_lsn = $node->safe_psql('postgres',
+	q{SELECT checkpoint_lsn FROM pg_control_checkpoint(), pg_replication_slots WHERE slot_name = 'slot0' AND restart_lsn <= pg_control_checkpoint.checkpoint_lsn}
+);
+like($lsn, qr!^0/[0-9A-F]{7,8}$!, "replication slot LSN $lsn is lower than latest checkpoint LSN $checkpoint_lsn");
+
+$node->command_fails(
+	[   'pg_basebackup', '-D', "$tempdir/backupxs_slot1", '-C', '-S', 'slot0' ],
+	'pg_basebackup fails with -C -S and a previously existing slot');
+
 $node->safe_psql('postgres',
 	q{SELECT * FROM pg_create_physical_replication_slot('slot1')});
-my $lsn = $node->safe_psql('postgres',
+$lsn = $node->safe_psql('postgres',
 	q{SELECT restart_lsn FROM pg_replication_slots WHERE slot_name = 'slot1'}
 );
 is($lsn, '', 'restart LSN of new slot is null');
-- 
2.11.0

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to