On Sat, Feb 27, 2021 at 4:14 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> Here's a new version.  The condition variable patch 0001 fixes a bug:
> CleanupProcSignalState() also needs to broadcast.  The hunk that
> allows the interrupt handlers to use CVs while you're already waiting
> on a CV is now in a separate patch 0002.  I'm thinking of going ahead
> and committing those two.

Done.  Of course nothing in the tree reaches any of this code yet.
It'll be exercised by cfbot in this thread and (I assume) Amul's
"ALTER SYSTEM READ { ONLY | WRITE }" thread.

> The 0003 patch to achieve $SUBJECT needs
> more discussion.

Rebased.

The more I think about it, the more I think that this approach is good
enough for an initial solution to the problem.  It only affects
Windows, dropping tablespaces is hopefully rare, and it's currently
broken on that OS.  That said, it's complex enough, and I guess more
to the point, enough of a compromise, that I'm hoping to get some
explicit consensus about that.

A better solution would probably have to be based on the sinval queue,
somehow.  Perhaps with a new theory or rule making it safe to process
at every CFI(), or by deciding that we're prepared have the operation
wait arbitrarily long until backends reach a point where it is known
to be safe (probably near ProcessClientReadInterrupt()'s call to
ProcessCatchupInterrupt()), or by inventing a new kind of lightweight
"sinval peek" that is safe to run at every CFI() point, being based on
reading (but not consuming!) the sinval queue and performing just
vfd-close of referenced smgr relations in this case.  The more I think
about all that complexity for a super rare event on only one OS, the
more I want to just do it the stupid way and close all the fds.
Robert opined similarly in an off-list chat about this problem.
From 5daf2244894bf5a399aa8022efbf71b5428335c7 Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmu...@postgresql.org>
Date: Mon, 1 Mar 2021 17:14:11 +1300
Subject: [PATCH v5] Use a global barrier to fix DROP TABLESPACE on Windows.

Previously, it was possible for DROP TABLESPACE to fail because a table
had been dropped, but other backends still had open file handles.
Windows won't allow a directory containing unlinked-but-still-open files
to be unlinked.  Tackle this problem by forcing all backends to close
all fds.  No change for Unix systems, which didn't suffer from the
problem, but simulate the need to do it in assertion builds just to
exercise the code.

Reviewed-by: Andres Freund <and...@anarazel.de>
Discussion: https://postgr.es/m/ca+hukgldemy2gbm80kz20gte6hnvwoere8kwcjk6-u56ost...@mail.gmail.com
---
 src/backend/commands/tablespace.c    | 22 +++++++++++++++-------
 src/backend/storage/ipc/procsignal.c | 22 ++++++----------------
 src/backend/storage/smgr/md.c        |  6 ++++++
 src/backend/storage/smgr/smgr.c      | 12 ++++++++++++
 src/include/storage/md.h             |  1 +
 src/include/storage/procsignal.h     |  7 +------
 src/include/storage/smgr.h           |  1 +
 7 files changed, 42 insertions(+), 29 deletions(-)

diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index 69ea155d50..8853f1b658 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -520,15 +520,23 @@ DropTableSpace(DropTableSpaceStmt *stmt)
 		 * but we can't tell them apart from important data files that we
 		 * mustn't delete.  So instead, we force a checkpoint which will clean
 		 * out any lingering files, and try again.
-		 *
-		 * XXX On Windows, an unlinked file persists in the directory listing
-		 * until no process retains an open handle for the file.  The DDL
-		 * commands that schedule files for unlink send invalidation messages
-		 * directing other PostgreSQL processes to close the files.  DROP
-		 * TABLESPACE should not give up on the tablespace becoming empty
-		 * until all relevant invalidation processing is complete.
 		 */
 		RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
+		/*
+		 * On Windows, an unlinked file persists in the directory listing until
+		 * no process retains an open handle for the file.  The DDL
+		 * commands that schedule files for unlink send invalidation messages
+		 * directing other PostgreSQL processes to close the files, but nothing
+		 * guarantees they'll be processed in time.  So, we'll also use a
+		 * global barrier to ask all backends to close all files, and wait
+		 * until they're finished.
+		 */
+#if defined(WIN32) || defined(USE_ASSERT_CHECKING)
+		LWLockRelease(TablespaceCreateLock);
+		WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
+		LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
+#endif
+		/* And now try again. */
 		if (!destroy_tablespace_directories(tablespaceoid, false))
 		{
 			/* Still not empty, the files must be important then */
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 8e5ee49fbd..236f6edd60 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -28,6 +28,7 @@
 #include "storage/latch.h"
 #include "storage/proc.h"
 #include "storage/shmem.h"
+#include "storage/smgr.h"
 #include "storage/sinval.h"
 #include "tcop/tcopprot.h"
 
@@ -100,7 +101,7 @@ static ProcSignalSlot *MyProcSignalSlot = NULL;
 static bool CheckProcSignal(ProcSignalReason reason);
 static void CleanupProcSignalState(int status, Datum arg);
 static void ResetProcSignalBarrierBits(uint32 flags);
-static bool ProcessBarrierPlaceholder(void);
+static bool ProcessBarrierSmgrRelease(void);
 
 /*
  * ProcSignalShmemSize
@@ -531,8 +532,8 @@ ProcessProcSignalBarrier(void)
 				type = (ProcSignalBarrierType) pg_rightmost_one_pos32(flags);
 				switch (type)
 				{
-					case PROCSIGNAL_BARRIER_PLACEHOLDER:
-						processed = ProcessBarrierPlaceholder();
+					case PROCSIGNAL_BARRIER_SMGRRELEASE:
+						processed = ProcessBarrierSmgrRelease();
 						break;
 				}
 
@@ -599,20 +600,9 @@ ResetProcSignalBarrierBits(uint32 flags)
 }
 
 static bool
-ProcessBarrierPlaceholder(void)
+ProcessBarrierSmgrRelease(void)
 {
-	/*
-	 * XXX. This is just a placeholder until the first real user of this
-	 * machinery gets committed. Rename PROCSIGNAL_BARRIER_PLACEHOLDER to
-	 * PROCSIGNAL_BARRIER_SOMETHING_ELSE where SOMETHING_ELSE is something
-	 * appropriately descriptive. Get rid of this function and instead have
-	 * ProcessBarrierSomethingElse. Most likely, that function should live in
-	 * the file pertaining to that subsystem, rather than here.
-	 *
-	 * The return value should be 'true' if the barrier was successfully
-	 * absorbed and 'false' if not. Note that returning 'false' can lead to
-	 * very frequent retries, so try hard to make that an uncommon case.
-	 */
+	smgrrelease();
 	return true;
 }
 
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 1e12cfad8e..1b751205b1 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -548,6 +548,12 @@ mdclose(SMgrRelation reln, ForkNumber forknum)
 	}
 }
 
+void
+mdrelease(void)
+{
+	closeAllVfds();
+}
+
 /*
  *	mdprefetch() -- Initiate asynchronous read of the specified block of a relation
  */
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 4dc24649df..b9acff73d2 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -41,6 +41,7 @@ typedef struct f_smgr
 {
 	void		(*smgr_init) (void);	/* may be NULL */
 	void		(*smgr_shutdown) (void);	/* may be NULL */
+	void		(*smgr_release) (void);	/* may be NULL */
 	void		(*smgr_open) (SMgrRelation reln);
 	void		(*smgr_close) (SMgrRelation reln, ForkNumber forknum);
 	void		(*smgr_create) (SMgrRelation reln, ForkNumber forknum,
@@ -69,6 +70,7 @@ static const f_smgr smgrsw[] = {
 	{
 		.smgr_init = mdinit,
 		.smgr_shutdown = NULL,
+		.smgr_release = mdrelease,
 		.smgr_open = mdopen,
 		.smgr_close = mdclose,
 		.smgr_create = mdcreate,
@@ -298,6 +300,16 @@ smgrcloseall(void)
 		smgrclose(reln);
 }
 
+void
+smgrrelease(void)
+{
+	for (int i = 0; i < NSmgr; i++)
+	{
+		if (smgrsw[i].smgr_release)
+			smgrsw[i].smgr_release();
+	}
+}
+
 /*
  *	smgrclosenode() -- Close SMgrRelation object for given RelFileNode,
  *					   if one exists.
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index 752b440864..7685367b97 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -23,6 +23,7 @@
 extern void mdinit(void);
 extern void mdopen(SMgrRelation reln);
 extern void mdclose(SMgrRelation reln, ForkNumber forknum);
+extern void mdrelease(void);
 extern void mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
 extern bool mdexists(SMgrRelation reln, ForkNumber forknum);
 extern void mdunlink(RelFileNodeBackend rnode, ForkNumber forknum, bool isRedo);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 4ae7dc33b8..aaaa8ed07c 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -48,12 +48,7 @@ typedef enum
 
 typedef enum
 {
-	/*
-	 * XXX. PROCSIGNAL_BARRIER_PLACEHOLDER should be replaced when the first
-	 * real user of the ProcSignalBarrier mechanism is added. It's just here
-	 * for now because we can't have an empty enum.
-	 */
-	PROCSIGNAL_BARRIER_PLACEHOLDER = 0
+	PROCSIGNAL_BARRIER_SMGRRELEASE	/* ask smgr to close files */
 } ProcSignalBarrierType;
 
 /*
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index a6fbf7b6a6..1a21bd9e88 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -84,6 +84,7 @@ extern void smgrsetowner(SMgrRelation *owner, SMgrRelation reln);
 extern void smgrclearowner(SMgrRelation *owner, SMgrRelation reln);
 extern void smgrclose(SMgrRelation reln);
 extern void smgrcloseall(void);
+extern void smgrrelease(void);
 extern void smgrclosenode(RelFileNodeBackend rnode);
 extern void smgrcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo);
 extern void smgrdosyncall(SMgrRelation *rels, int nrels);
-- 
2.30.0

Reply via email to