Here's a couple of fixups.  0001 is the same as before.  In 0002 I think
CheckTablespaceDirectory ends up easier to read if we split out the test
for validity of the link.  Looking at that again, I think we don't need
to piggyback on ignore_invalid_pages, which is already a stretch, so
let's not -- instead we can use allow_in_place_tablespaces if users need
a workaround.  So that's 0003 (this bit needs more than zero docs,
however.)

0004 is straightforward: let's check for bad directories before logging
about consistent state.

After all this, I'm not sure what to think of dbase_redo.  At line 3102,
is the directory supposed to exist or not?  I'm confused as to what is
the expected state at that point.  I rewrote this, but now I think my
rewrite continues to be confusing, so I'll have to think more about it.

Another aspect are the tests.  Robert described a scenario where the
previously committed version of this patch created trouble.  Do we have
a test case to cover that problematic case?  I think we should strive to
cover it, if possible.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)
>From bdb691c2a86301e5369b3ae7f735fa5f7c29304d Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Wed, 13 Jul 2022 18:14:18 +0200
Subject: [PATCH v25 1/4] Fix replay of create database records on standby

Crash recovery on standby may encounter missing directories when
replaying create database WAL records.  Prior to this patch, the
standby would fail to recover in such a case.  However, the
directories could be legitimately missing.  Consider a sequence of WAL
records as follows:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch allows missing tablespaces to be created during recovery
before reaching consistency.  The tablespaces are created as real
directories that should not exists but will be removed until reaching
consistency. CheckRecoveryConsistency is responsible to make sure they
have disappeared.

Similar to log_invalid_page mechanism, the GUC ignore_invalid_pages
turns into PANIC errors detected by this patch into WARNING, which
allows continueing recovery.

Diagnosed-by: Paul Guo <paul...@gmail.com>
Author: Paul Guo <paul...@gmail.com>
Author: Kyotaro Horiguchi <horikyota....@gmail.com>
Author: Asim R Praveen <aprav...@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27dn5xwj2p9-roh16e4...@mail.gmail.com
---
 doc/src/sgml/config.sgml                    |   5 +-
 src/backend/access/transam/xlogrecovery.c   |  59 ++++++++
 src/backend/commands/dbcommands.c           |  71 +++++++++
 src/backend/commands/tablespace.c           |  28 +---
 src/backend/utils/misc/guc.c                |   8 +-
 src/include/access/xlogutils.h              |   2 +
 src/test/recovery/t/029_replay_tsp_drops.pl | 155 ++++++++++++++++++++
 7 files changed, 296 insertions(+), 32 deletions(-)
 create mode 100644 src/test/recovery/t/029_replay_tsp_drops.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 37fd80388c..1e1c8c1cb7 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -11363,11 +11363,12 @@ LOG:  CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
       <listitem>
        <para>
         If set to <literal>off</literal> (the default), detection of
-        WAL records having references to invalid pages during
+        WAL records having references to invalid pages or
+        WAL records resulting in invalid directory operations during
         recovery causes <productname>PostgreSQL</productname> to
         raise a PANIC-level error, aborting the recovery. Setting
         <varname>ignore_invalid_pages</varname> to <literal>on</literal>
-        causes the system to ignore invalid page references in WAL records
+        causes the system to ignore invalid actions caused by such WAL records
         (but still report a warning), and continue the recovery.
         This behavior may <emphasis>cause crashes, data loss,
         propagate or hide corruption, or other serious problems</emphasis>.
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 5d6f1b5e46..ae81244e06 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2008,6 +2008,57 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI)
 	}
 }
 
+/*
+ * Makes sure that ./pg_tblspc directory doesn't contain a real directory.
+ *
+ * This is intended to be called after reaching consistency.
+ * ignore_invalid_pages=on turns into the PANIC error into WARNING so that
+ * recovery can continue.
+ *
+ * This can't be checked in allow_in_place_tablespaces mode, so skip it in
+ * that case.
+ */
+static void
+CheckTablespaceDirectory(void)
+{
+	char *tblspc_path = "./pg_tblspc";
+	DIR		   *dir;
+	struct dirent *de;
+
+	/* Do not check for this when test tablespaces are in use */
+	if (allow_in_place_tablespaces)
+		return;
+
+	dir = AllocateDir(tblspc_path);
+	while ((de = ReadDir(dir, tblspc_path)) != NULL)
+	{
+		char	path[MAXPGPATH];
+		char   *p;
+#ifndef WIN32
+		struct stat st;
+#endif
+
+		/* Skip entries of non-oid names */
+		for (p = de->d_name; *p && isdigit(*p); p++);
+		if (*p)
+			continue;
+
+		snprintf(path, MAXPGPATH, "%s/%s", tblspc_path, de->d_name);
+
+#ifndef WIN32
+		if (lstat(path, &st) < 0)
+			ereport(ERROR, errcode_for_file_access(),
+					errmsg("could not stat file \"%s\": %m", path));
+
+		if (!S_ISLNK(st.st_mode))
+#else
+		if (!pgwin32_is_junction(path))
+#endif
+			elog(ignore_invalid_pages ? WARNING : PANIC,
+				 "real directory found in pg_tblspc directory: %s", de->d_name);
+	}
+}
+
 /*
  * Checks if recovery has reached a consistent state. When consistency is
  * reached and we have a valid starting standby snapshot, tell postmaster
@@ -2072,6 +2123,14 @@ CheckRecoveryConsistency(void)
 		ereport(LOG,
 				(errmsg("consistent recovery state reached at %X/%X",
 						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
+
+		/*
+		 * Check that pg_tblspc doesn't contain a real
+		 * directory. Database/CREATE_* records may create a tablespace
+		 * directory that should have been removed until consistency is
+		 * reached.
+		 */
+		CheckTablespaceDirectory();
 	}
 
 	/*
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 1901b434c5..0f860030fa 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/xact.h"
 #include "access/xloginsert.h"
+#include "access/xlogrecovery.h"
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
 #include "catalog/dependency.h"
@@ -47,6 +48,7 @@
 #include "commands/defrem.h"
 #include "commands/seclabel.h"
 #include "commands/tablespace.h"
+#include "common/file_perm.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -62,6 +64,7 @@
 #include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
+#include "utils/guc.h"
 #include "utils/pg_locale.h"
 #include "utils/relmapper.h"
 #include "utils/snapmgr.h"
@@ -135,6 +138,7 @@ static void CreateDirAndVersionFile(char *dbpath, Oid dbid, Oid tsid,
 									bool isRedo);
 static void CreateDatabaseUsingFileCopy(Oid src_dboid, Oid dboid, Oid src_tsid,
 										Oid dst_tsid);
+static void maybe_create_directory(char *path);
 
 /*
  * Create a new database using the WAL_LOG strategy.
@@ -3008,6 +3012,43 @@ get_database_name(Oid dbid)
 	return result;
 }
 
+/*
+ * maybe_create_directory()
+ *
+ * During recovery, there's a case where we validly need to recover a missing
+ * tablespace directory so that recovery can continue.  This happens when
+ * recovery wants to create a database but the holding tablespace has been
+ * removed before the server stopped.  Since we expect that the directory will
+ * be gone before reaching recovery consistency, and we have no knowledge about
+ * the tablespace other than its OID here, we create a real directory under
+ * pg_tblspc here instead of restoring the symlink.  ignore_invalid_pages=on
+ * reduces the error level so that recovery can continue.
+ */
+static void
+maybe_create_directory(char *path)
+{
+	struct stat	st;
+
+	Assert(RecoveryInProgress());
+
+	if (stat(path, &st) == 0)
+		return;
+
+	/* XXX: Do we make sure that the path is under pg_tblspc? */
+
+	if (reachedConsistency && !ignore_invalid_pages)
+		ereport(PANIC,
+				errmsg("missing directory \"%s\"", path));
+
+	elog(reachedConsistency ? WARNING : DEBUG1,
+		 "creating missing directory: %s", path);
+
+	if (pg_mkdir_p(path, pg_dir_create_mode) != 0)
+		ereport(PANIC,
+				errmsg("could not create missing directory \"%s\": %m", path));
+}
+
+
 /*
  * DATABASE resource manager's routines
  */
@@ -3044,6 +3085,30 @@ dbase_redo(XLogReaderState *record)
 								dst_path)));
 		}
 
+		if (stat(dst_path, &st) < 0)
+		{
+			char *parent_path;
+
+			if (errno != ENOENT)
+				ereport(FATAL,
+						errmsg("could not stat directory \"%s\": %m",
+							   dst_path));
+
+			/* create the parent directory if needed and valid */
+			parent_path = pstrdup(dst_path);
+			get_parent_directory(parent_path);
+			maybe_create_directory(parent_path);
+		}
+
+		/*
+		 * There's a case where the copy source directory is missing for the
+		 * same reason above.  Create the emtpy source directory so that
+		 * copydir below doesn't fail.  The directory will be dropped soon by
+		 * recovery.
+		 */
+		if (stat(src_path, &st) < 0 && errno == ENOENT)
+			maybe_create_directory(src_path);
+
 		/*
 		 * Force dirty buffers out to disk, to ensure source database is
 		 * up-to-date for the copy.
@@ -3068,9 +3133,15 @@ dbase_redo(XLogReaderState *record)
 		xl_dbase_create_wal_log_rec *xlrec =
 		(xl_dbase_create_wal_log_rec *) XLogRecGetData(record);
 		char	   *dbpath;
+		char	   *parent_path;
 
 		dbpath = GetDatabasePath(xlrec->db_id, xlrec->tablespace_id);
 
+		/* create the parent directory if needed and valid */
+		parent_path = pstrdup(dbpath);
+		get_parent_directory(parent_path);
+		maybe_create_directory(parent_path);
+
 		/* Create the database directory with the version file. */
 		CreateDirAndVersionFile(dbpath, xlrec->db_id, xlrec->tablespace_id,
 								true);
diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index c8bdd9992a..ddb031a83f 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -156,8 +156,6 @@ TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
 				/* Directory creation failed? */
 				if (MakePGDirectory(dir) < 0)
 				{
-					char	   *parentdir;
-
 					/* Failure other than not exists or not in WAL replay? */
 					if (errno != ENOENT || !isRedo)
 						ereport(ERROR,
@@ -170,32 +168,8 @@ TablespaceCreateDbspace(Oid spcOid, Oid dbOid, bool isRedo)
 					 * continue by creating simple parent directories rather
 					 * than a symlink.
 					 */
-
-					/* create two parents up if not exist */
-					parentdir = pstrdup(dir);
-					get_parent_directory(parentdir);
-					get_parent_directory(parentdir);
-					/* Can't create parent and it doesn't already exist? */
-					if (MakePGDirectory(parentdir) < 0 && errno != EEXIST)
-						ereport(ERROR,
-								(errcode_for_file_access(),
-								 errmsg("could not create directory \"%s\": %m",
-										parentdir)));
-					pfree(parentdir);
-
-					/* create one parent up if not exist */
-					parentdir = pstrdup(dir);
-					get_parent_directory(parentdir);
-					/* Can't create parent and it doesn't already exist? */
-					if (MakePGDirectory(parentdir) < 0 && errno != EEXIST)
-						ereport(ERROR,
-								(errcode_for_file_access(),
-								 errmsg("could not create directory \"%s\": %m",
-										parentdir)));
-					pfree(parentdir);
-
 					/* Create database directory */
-					if (MakePGDirectory(dir) < 0)
+					if (pg_mkdir_p(dir, pg_dir_create_mode) < 0)
 						ereport(ERROR,
 								(errcode_for_file_access(),
 								 errmsg("could not create directory \"%s\": %m",
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 0328029d43..cd6fa84e22 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -43,6 +43,7 @@
 #include "access/xlog_internal.h"
 #include "access/xlogprefetcher.h"
 #include "access/xlogrecovery.h"
+#include "access/xlogutils.h"
 #include "catalog/namespace.h"
 #include "catalog/objectaccess.h"
 #include "catalog/pg_authid.h"
@@ -141,7 +142,6 @@ extern int	CommitSiblings;
 extern char *default_tablespace;
 extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
-extern bool ignore_invalid_pages;
 extern bool synchronize_seqscans;
 
 #ifdef TRACE_SYNCSCAN
@@ -1336,10 +1336,12 @@ static struct config_bool ConfigureNamesBool[] =
 		{"ignore_invalid_pages", PGC_POSTMASTER, DEVELOPER_OPTIONS,
 			gettext_noop("Continues recovery after an invalid pages failure."),
 			gettext_noop("Detection of WAL records having references to "
-						 "invalid pages during recovery causes PostgreSQL to "
+						 "invalid pages or WAL records resulting in invalid "
+						 "directory operations during "
+						 "recovery that cause PostgreSQL"
 						 "raise a PANIC-level error, aborting the recovery. "
 						 "Setting ignore_invalid_pages to true causes "
-						 "the system to ignore invalid page references "
+						 "the system to ignore those inconsistencies "
 						 "in WAL records (but still report a warning), "
 						 "and continue recovery. This behavior may cause "
 						 "crashes, data loss, propagate or hide corruption, "
diff --git a/src/include/access/xlogutils.h b/src/include/access/xlogutils.h
index ef182977bf..f203fdf539 100644
--- a/src/include/access/xlogutils.h
+++ b/src/include/access/xlogutils.h
@@ -54,6 +54,8 @@ typedef enum
 
 extern PGDLLIMPORT HotStandbyState standbyState;
 
+extern bool ignore_invalid_pages;
+
 #define InHotStandby (standbyState >= STANDBY_SNAPSHOT_PENDING)
 
 
diff --git a/src/test/recovery/t/029_replay_tsp_drops.pl b/src/test/recovery/t/029_replay_tsp_drops.pl
new file mode 100644
index 0000000000..d537fe0ce5
--- /dev/null
+++ b/src/test/recovery/t/029_replay_tsp_drops.pl
@@ -0,0 +1,155 @@
+
+# Copyright (c) 2021-2022, PostgreSQL Global Development Group
+
+#
+# Tests relating to PostgreSQL crash recovery and redo
+#
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+sub test_tablespace
+{
+	my ($strategy) = @_;
+
+	my $node_primary = PostgreSQL::Test::Cluster->new("primary1_$strategy");
+	$node_primary->init(allows_streaming => 1);
+	$node_primary->start;
+	$node_primary->psql('postgres',
+			qq[
+				SET allow_in_place_tablespaces=on;
+				CREATE TABLESPACE dropme_ts1 LOCATION '';
+				CREATE TABLESPACE dropme_ts2 LOCATION '';
+				CREATE TABLESPACE source_ts  LOCATION '';
+				CREATE TABLESPACE target_ts  LOCATION '';
+				CREATE DATABASE template_db IS_TEMPLATE = true;
+			]);
+	my $backup_name = 'my_backup';
+	$node_primary->backup($backup_name);
+
+	my $node_standby = PostgreSQL::Test::Cluster->new("standby2_$strategy");
+	$node_standby->init_from_backup($node_primary, $backup_name, has_streaming => 1);
+	$node_standby->append_conf('postgresql.conf', "ignore_invalid_pages = on");
+	$node_standby->start;
+
+	# Make sure connection is made
+	$node_primary->poll_query_until(
+		'postgres', 'SELECT count(*) = 1 FROM pg_stat_replication');
+
+	$node_standby->safe_psql('postgres', 'CHECKPOINT');
+
+	# Do immediate shutdown just after a sequence of CREAT DATABASE / DROP
+	# DATABASE / DROP TABLESPACE. This causes CREATE DATABASE WAL records
+	# to be applied to already-removed directories.
+	my $query = q[
+	CREATE DATABASE dropme_db1 WITH TABLESPACE dropme_ts1 STRATEGY=<STRATEGY>;
+	CREATE TABLE t (a int) TABLESPACE dropme_ts2;
+	CREATE DATABASE dropme_db2 WITH TABLESPACE dropme_ts2 STRATEGY=<STRATEGY>;
+	CREATE DATABASE moveme_db TABLESPACE source_ts STRATEGY=<STRATEGY>;
+	ALTER DATABASE moveme_db SET TABLESPACE target_ts;
+	CREATE DATABASE newdb TEMPLATE template_db STRATEGY=<STRATEGY>;
+	ALTER DATABASE template_db IS_TEMPLATE = false;
+	DROP DATABASE dropme_db1;
+	DROP TABLE t;
+	DROP DATABASE dropme_db2; DROP TABLESPACE dropme_ts2;
+	DROP TABLESPACE source_ts;
+	DROP DATABASE template_db;];
+
+	$query =~ s/<STRATEGY>/$strategy/g;
+	$node_primary->safe_psql('postgres', $query);
+	$node_primary->wait_for_catchup($node_standby, 'replay',
+									$node_primary->lsn('replay'));
+
+	# show "create missing directory" log message
+	$node_standby->safe_psql('postgres',
+							 "ALTER SYSTEM SET log_min_messages TO debug1;");
+	$node_standby->stop('immediate');
+	# Should restart ignoring directory creation error.
+	is($node_standby->start(fail_ok => 1), 1);
+	$node_standby->stop('immediate');
+}
+
+test_tablespace("FILE_COPY");
+test_tablespace("WAL_LOG");
+
+# Ensure that a missing tablespace directory during create database
+# replay immediately causes panic if the standby has already reached
+# consistent state (archive recovery is in progress).  This is
+# effective only for CREATE DATABASE WITH STRATEGY=FILE_COPY.
+
+my $node_primary = PostgreSQL::Test::Cluster->new('primary2');
+$node_primary->init(allows_streaming => 1);
+$node_primary->start;
+
+# Create tablespace
+$node_primary->safe_psql('postgres', q[
+						 SET allow_in_place_tablespaces=on;
+						 CREATE TABLESPACE ts1 LOCATION '']);
+$node_primary->safe_psql('postgres', "CREATE DATABASE db1 WITH TABLESPACE ts1 STRATEGY=FILE_COPY");
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_primary->backup($backup_name);
+my $node_standby = PostgreSQL::Test::Cluster->new('standby3');
+$node_standby->init_from_backup($node_primary, $backup_name, has_streaming => 1);
+$node_standby->append_conf('postgresql.conf', "ignore_invalid_pages = on");
+$node_standby->start;
+
+# Make sure standby reached consistency and starts accepting connections
+$node_standby->poll_query_until('postgres', 'SELECT 1', '1');
+
+# Remove standby tablespace directory so it will be missing when
+# replay resumes.
+my $tspoid = $node_standby->safe_psql('postgres',
+									  "SELECT oid FROM pg_tablespace WHERE spcname = 'ts1';");
+my $tspdir = $node_standby->data_dir . "/pg_tblspc/$tspoid";
+File::Path::rmtree($tspdir);
+
+my $logstart = get_log_size($node_standby);
+
+# Create a database in the tablespace and a table in default tablespace
+$node_primary->safe_psql('postgres',
+						q[CREATE TABLE should_not_replay_insertion(a int);
+						  CREATE DATABASE db2 WITH TABLESPACE ts1 STRATEGY=FILE_COPY;
+						  INSERT INTO should_not_replay_insertion VALUES (1);]);
+
+# Standby should fail and should not silently skip replaying the wal
+# In this test, PANIC turns into WARNING by ignore_invalid_pages.
+# Check the log messages instead of confirming standby failure.
+my $max_attempts = $PostgreSQL::Test::Utils::timeout_default;
+while ($max_attempts-- >= 0)
+{
+	last if (find_in_log(
+				 $node_standby,
+				 "WARNING:  creating missing directory: pg_tblspc/",
+				 $logstart));
+	sleep 1;
+}
+ok($max_attempts > 0, "invalid directory creation is detected");
+
+done_testing();
+
+
+# return the size of logfile of $node in bytes
+sub get_log_size
+{
+	my ($node) = @_;
+
+	return (stat $node->logfile)[7];
+}
+
+# find $pat in logfile of $node after $off-th byte
+sub find_in_log
+{
+	my ($node, $pat, $off) = @_;
+
+	$off = 0 unless defined $off;
+	my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile);
+	return 0 if (length($log) <= $off);
+
+	$log = substr($log, $off);
+
+	return $log =~ m/$pat/;
+}
-- 
2.30.2

>From 87d05718d69cf26d0f2017dc30e30f0f62dcbbff Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Thu, 14 Jul 2022 16:43:03 +0200
Subject: [PATCH v25 2/4] split is_path_tslink as a new routine

---
 src/backend/access/transam/xlogrecovery.c | 68 +++++++++++++----------
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index ae81244e06..e04d30cf3e 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2008,54 +2008,66 @@ xlogrecovery_redo(XLogReaderState *record, TimeLineID replayTLI)
 	}
 }
 
+/*
+ * Is the given directory entry a symlink/junction point?  Subroutine for
+ * CheckTablespaceDirectory.
+ */
+static bool
+is_path_tslink(const char *path)
+{
+#ifndef WIN32
+		struct stat st;
+
+		if (lstat(path, &st) < 0)
+			ereport(ERROR, errcode_for_file_access(),
+					errmsg("could not stat file \"%s\": %m", path));
+		return S_ISLNK(st.st_mode);
+#else
+		return pgwin32_is_junction(path);
+#endif
+}
+
 /*
  * Makes sure that ./pg_tblspc directory doesn't contain a real directory.
  *
- * This is intended to be called after reaching consistency.
- * ignore_invalid_pages=on turns into the PANIC error into WARNING so that
- * recovery can continue.
+ * Replay of database creation XLOG records for databases that were later
+ * dropped can create fake directories in pg_tblspc.  By the time consistency
+ * is reached these directories should have been removed; here we verify
+ * that this did indeed happen.  This must be called after reached consistent
+ * state.
  *
- * This can't be checked in allow_in_place_tablespaces mode, so skip it in
- * that case.
+ * ignore_invalid_pages=on turns into the PANIC error into WARNING so that
+ * recovery can continue.  XXX piggybacking on this particular GUC sounds like
+ * a bad idea.  Why not just advise to use allow_in_place_tablespaces?
  */
 static void
 CheckTablespaceDirectory(void)
 {
-	char *tblspc_path = "./pg_tblspc";
 	DIR		   *dir;
 	struct dirent *de;
 
-	/* Do not check for this when test tablespaces are in use */
+	/*
+	 * In allow_in_place_tablespaces mode, it is valid to have non-symlink
+	 * directories in pg_tblspc, so we cannot run this check.  Give up.
+	 */
 	if (allow_in_place_tablespaces)
 		return;
 
-	dir = AllocateDir(tblspc_path);
-	while ((de = ReadDir(dir, tblspc_path)) != NULL)
+	dir = AllocateDir("pg_tblspc");
+	while ((de = ReadDir(dir, "pg_tblspc")) != NULL)
 	{
-		char	path[MAXPGPATH];
-		char   *p;
-#ifndef WIN32
-		struct stat st;
-#endif
+		char	path[MAXPGPATH + 10];
 
 		/* Skip entries of non-oid names */
-		for (p = de->d_name; *p && isdigit(*p); p++);
-		if (*p)
+		if (strspn(de->d_name, "0123456789") != strlen(de->d_name))
 			continue;
 
-		snprintf(path, MAXPGPATH, "%s/%s", tblspc_path, de->d_name);
+		snprintf(path, sizeof(path), "pg_tblspc/%s", de->d_name);
 
-#ifndef WIN32
-		if (lstat(path, &st) < 0)
-			ereport(ERROR, errcode_for_file_access(),
-					errmsg("could not stat file \"%s\": %m", path));
-
-		if (!S_ISLNK(st.st_mode))
-#else
-		if (!pgwin32_is_junction(path))
-#endif
-			elog(ignore_invalid_pages ? WARNING : PANIC,
-				 "real directory found in pg_tblspc directory: %s", de->d_name);
+		if (!is_path_tslink(path))
+			ereport(ignore_invalid_pages ? WARNING : PANIC,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("real directory found in pg_tblspc directory: %s", de->d_name)));
 	}
 }
 
-- 
2.30.2

>From 0e03c82ee07569b869b382fac83d98d0b5d5d870 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Thu, 14 Jul 2022 23:39:08 +0200
Subject: [PATCH v25 3/4] Don't rely on ignore_invalid_pages at all

Since a workaround with allow_in_place_tablespaces is possible,
there doesn't seem to be a need for the ignore_invalid_pages one.
Remove it.
---
 src/backend/access/transam/xlogrecovery.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index e04d30cf3e..b0ae63fbac 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2035,10 +2035,6 @@ is_path_tslink(const char *path)
  * is reached these directories should have been removed; here we verify
  * that this did indeed happen.  This must be called after reached consistent
  * state.
- *
- * ignore_invalid_pages=on turns into the PANIC error into WARNING so that
- * recovery can continue.  XXX piggybacking on this particular GUC sounds like
- * a bad idea.  Why not just advise to use allow_in_place_tablespaces?
  */
 static void
 CheckTablespaceDirectory(void)
@@ -2065,7 +2061,7 @@ CheckTablespaceDirectory(void)
 		snprintf(path, sizeof(path), "pg_tblspc/%s", de->d_name);
 
 		if (!is_path_tslink(path))
-			ereport(ignore_invalid_pages ? WARNING : PANIC,
+			ereport(PANIC,
 					(errcode(ERRCODE_DATA_CORRUPTED),
 					 errmsg("real directory found in pg_tblspc directory: %s", de->d_name)));
 	}
-- 
2.30.2

>From 427186297c41e38c61ebb73d07e962400c7ccc22 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Thu, 14 Jul 2022 16:43:18 +0200
Subject: [PATCH v25 4/4] do CheckTablespaceDirectory first

---
 src/backend/access/transam/xlogrecovery.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index b0ae63fbac..c457c36daa 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2127,18 +2127,18 @@ CheckRecoveryConsistency(void)
 		 */
 		XLogCheckInvalidPages();
 
+		/*
+		 * Check that pg_tblspc doesn't contain any real directories.
+		 * Replay of Database/CREATE_* records may have created ficticious
+		 * tablespace directories that should have been removed by the time
+		 * consistency was reached.
+		 */
+		CheckTablespaceDirectory();
+
 		reachedConsistency = true;
 		ereport(LOG,
 				(errmsg("consistent recovery state reached at %X/%X",
 						LSN_FORMAT_ARGS(lastReplayedEndRecPtr))));
-
-		/*
-		 * Check that pg_tblspc doesn't contain a real
-		 * directory. Database/CREATE_* records may create a tablespace
-		 * directory that should have been removed until consistency is
-		 * reached.
-		 */
-		CheckTablespaceDirectory();
 	}
 
 	/*
-- 
2.30.2

Reply via email to