On Sun, Jul 31, 2022 at 8:36 PM mahendrakar s
<mahendrakarfo...@gmail.com> wrote:
>
>> On Mon, 25 Jul 2022 at 16:42, Bharath Rupireddy 
>> <bharath.rupireddyforpostg...@gmail.com> wrote:
>> Here's the v3 patch after rebasing.
>>
>> I just would like to reiterate the issue the patch is trying to solve:
>> At times (when no space  is left on the device or when the process is
>> taken down forcibly (VM/container crash)), there can be leftover
>> uninitialized .partial files (note that padding i.e. filling 16MB WAL
>> files with all zeros is done in non-compression cases) due to which
>> pg_receivewal fails to come up after the crash. To address this, the
>> proposed patch makes the padding 16MB WAL files atomic (first write a
>> .temp file, pad it and durably rename it to the original .partial
>> file, ready to be filled with received WAL records). This approach is
>> similar to what core postgres achieves atomicity while creating new
>> WAL file (see how XLogFileInitInternal() creates xlogtemp.%d file
>> first and then how InstallXLogFileSegment() durably renames it to
>> original WAL file).
>>
>> Thoughts?
>
> Hi Bharath,
>
> Idea to atomically allocate WAL file by creating tmp file and renaming it is 
> nice.

Thanks for reviewing it.

> I have one question though:
> How is partially temp file created will be cleaned if the VM crashes or out 
> of disk space cases?  Does it endup creating multiple files for every VM 
> crash/disk space during process of pg_receivewal?
>
> Thoughts?

It is handled in the patch, see [1].

Attaching v4 patch herewith which now uses the temporary file suffix
'.tmp' as opposed to v3 patch '.temp'. This is just to be in sync with
other atomic file write codes in the core - autoprewarm,
pg_stat_statement, slot, basebacup, replorigin, snapbuild, receivelog
and so on.

Please review the v4 patch.

[1]
+ /*
+ * Actual file doesn't exist. Now, create a temporary file pad it
+ * and rename to the target file. The temporary file may exist from
+ * the last failed attempt (failed during partial padding or
+ * renaming or some other). If it exists, let's play safe and
+ * delete it before creating a new one.
+ */
+ snprintf(tmpsuffixpath, sizeof(tmpsuffixpath), "%s.%s",
+ targetpath, "temp");
+
+ if (dir_existsfile(tmpsuffixpath))
+ {
+ if (unlink(tmpsuffixpath) != 0)
+ {
+ dir_data->lasterrno = errno;

+ return NULL;
+ }
+ }

--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/
From ca3df5ae70baae7141a3e321a0fa5f6f0afd5d9a Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Thu, 4 Aug 2022 05:09:09 +0000
Subject: [PATCH v4] Avoid partially-padded files under
 pg_receivewal/pg_basebackup dirs

pg_receivewal and pg_basebackup can have partially-padded files
in their target directories when a failure occurs while padding.
The failure can be because of no space left on device (which the
user can later increase/clean up to make some free disk space) or
host VM/server crashed.

To address the above problem, first create a temporary file, pad
it and then rename it to the target file.
---
 src/bin/pg_basebackup/walmethods.c | 98 +++++++++++++++++++++++++++---
 1 file changed, 89 insertions(+), 9 deletions(-)

diff --git a/src/bin/pg_basebackup/walmethods.c b/src/bin/pg_basebackup/walmethods.c
index e90aa0ba37..8c775d0da0 100644
--- a/src/bin/pg_basebackup/walmethods.c
+++ b/src/bin/pg_basebackup/walmethods.c
@@ -82,6 +82,8 @@ typedef struct DirectoryMethodFile
 #define dir_set_error(msg) \
 	(dir_data->lasterrstring = _(msg))
 
+static bool dir_existsfile(const char *pathname);
+
 static const char *
 dir_getlasterror(void)
 {
@@ -107,7 +109,10 @@ dir_get_file_name(const char *pathname, const char *temp_suffix)
 static Walfile
 dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_size)
 {
-	char		tmppath[MAXPGPATH];
+	char	targetpath[MAXPGPATH];
+	char 	tmpsuffixpath[MAXPGPATH];
+	bool	shouldcreatetempfile = false;
+	int			flags;
 	char	   *filename;
 	int			fd;
 	DirectoryMethodFile *f;
@@ -123,7 +128,7 @@ dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_
 	dir_clear_error();
 
 	filename = dir_get_file_name(pathname, temp_suffix);
-	snprintf(tmppath, sizeof(tmppath), "%s/%s",
+	snprintf(targetpath, sizeof(targetpath), "%s/%s",
 			 dir_data->basedir, filename);
 	pg_free(filename);
 
@@ -132,12 +137,57 @@ dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_
 	 * the file descriptor is important for dir_sync() method as gzflush()
 	 * does not do any system calls to fsync() to make changes permanent on
 	 * disk.
+	 *
+	 * Create a temporary file for padding and no compression cases to ensure
+	 * a fully initialized file is created.
 	 */
-	fd = open(tmppath, O_WRONLY | O_CREAT | PG_BINARY, pg_file_create_mode);
+	if (pad_to_size > 0 &&
+		dir_data->compression_algorithm == PG_COMPRESSION_NONE)
+	{
+		shouldcreatetempfile = true;
+		flags = O_WRONLY | PG_BINARY;
+	}
+	else
+	{
+		shouldcreatetempfile = false;
+		flags = O_WRONLY | O_CREAT | PG_BINARY;
+	}
+
+	fd = open(targetpath, flags, pg_file_create_mode);
 	if (fd < 0)
 	{
-		dir_data->lasterrno = errno;
-		return NULL;
+		if (errno != ENOENT || !shouldcreatetempfile)
+		{
+			dir_data->lasterrno = errno;
+			return NULL;
+		}
+		else if (shouldcreatetempfile)
+		{
+			/*
+			 * Actual file doesn't exist. Now, create a temporary file pad it
+			 * and rename to the target file. The temporary file may exist from
+			 * the last failed attempt (failed during partial padding or
+			 * renaming or some other). If it exists, let's play safe and
+			 * delete it before creating a new one.
+			 */
+			snprintf(tmpsuffixpath, MAXPGPATH, "%s.tmp", targetpath);
+
+			if (dir_existsfile(tmpsuffixpath))
+			{
+				if (unlink(tmpsuffixpath) != 0)
+				{
+					dir_data->lasterrno = errno;
+					return NULL;
+				}
+			}
+
+			fd = open(tmpsuffixpath, flags | O_CREAT, pg_file_create_mode);
+			if (fd < 0)
+			{
+				dir_data->lasterrno = errno;
+				return NULL;
+			}
+		}
 	}
 
 #ifdef HAVE_LIBZ
@@ -233,16 +283,46 @@ dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_
 		}
 	}
 
+	/* Rename the temporary file to target file */
+	if (shouldcreatetempfile)
+	{
+		int rc;
+
+		if (dir_data->sync)
+			rc = durable_rename(tmpsuffixpath, targetpath);
+		else
+			rc = rename(tmpsuffixpath, targetpath);
+
+		if (rc != 0)
+		{
+			dir_data->lasterrno = errno;
+			close(fd);
+
+			/*
+			 * Removing the temporary file may as well fail here, but we are
+			 * not concerned about it right now as we are already returning an
+			 * error while renaming. However, the leftover temporary file gets
+			 * deleted during the next padding cycle.
+			 */
+			unlink(tmpsuffixpath);
+
+			return NULL;
+		}
+	}
+
 	/*
 	 * fsync WAL file and containing directory, to ensure the file is
 	 * persistently created and zeroed (if padded). That's particularly
 	 * important when using synchronous mode, where the file is modified and
 	 * fsynced in-place, without a directory fsync.
+	 *
+	 * For padding/temporary file case, durable_rename would have already
+	 * fsynced file and containing directory.
 	 */
-	if (dir_data->sync)
+	if (!shouldcreatetempfile && dir_data->sync)
 	{
-		if (fsync_fname(tmppath, false) != 0 ||
-			fsync_parent_path(tmppath) != 0)
+		if (fsync_fname(targetpath, false) != 0 ||
+			fsync_parent_path(targetpath) != 0)
 		{
 			dir_data->lasterrno = errno;
 #ifdef HAVE_LIBZ
@@ -282,7 +362,7 @@ dir_open_for_write(const char *pathname, const char *temp_suffix, size_t pad_to_
 	f->fd = fd;
 	f->currpos = 0;
 	f->pathname = pg_strdup(pathname);
-	f->fullpath = pg_strdup(tmppath);
+	f->fullpath = pg_strdup(targetpath);
 	if (temp_suffix)
 		f->temp_suffix = pg_strdup(temp_suffix);
 
-- 
2.34.1

Reply via email to