The commit message explains prettty well, and it seems to work in simple testing, and yeah commit c6f2f016 was not a work of art. pg_basebackeup --format=plain "worked", but your way is better. I guess we should add a test of -Fp too, to keep it working? Here's one of those.
I know it's not your patch's fault, but I wonder if this might break something. We have this strange beast ti->rpath (commit b168c5ef in 2014): + /* + * Relpath holds the relative path of the tablespace directory + * when it's located within PGDATA, or NULL if it's located + * elsewhere. + */ That's pretty confusing, because relative paths have been banned since the birth of tablespaces (commit 2467394ee15, 2004): + /* + * Allowing relative paths seems risky + * + * this also helps us ensure that location is not empty or whitespace + */ + if (!is_absolute_path(location)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("tablespace location must be an absolute path"))); The discussion that produced the contradiction: https://www.postgresql.org/message-id/flat/m2ob3vl3et.fsf%402ndQuadrant.fr I guess what I'm wondering here is if there is a hazard where we confuse these outlawed tablespaces that supposedly roam the plains somewhere in your code that is assuming that "relative implies in-place". Or if not now, maybe in future changes. I wonder if these "semi-supported-but-don't-tell-anyone" relative symlinks are worthy of a defensive test (or is it in there somewhere already?). Originally when trying to implement allow_in_place_tablespaces, I instead tried allowing limited relative tablespaces, so you could use LOCATION 'pg_tblspc/my_directory', and then it would still create a symlink 1234 -> my_directory, which probably would have All Just Worked™ given the existing ti->rpath stuff, and possibly made the users that thread was about happy, but I had to give that up because it didn't work on Windows (no relative symlinks). Oh well.
From 4d2a8fac9c5ecbbbee5c55c5da997a7e6cb03952 Mon Sep 17 00:00:00 2001 From: Robert Haas <rh...@postgresql.org> Date: Thu, 9 Mar 2023 16:00:02 -0500 Subject: [PATCH 1/2] Fix pg_basebackup with in-place tablespaces some more. Commit c6f2f01611d4f2c412e92eb7893f76fa590818e8 purported to make this work, but problems remained. In a plain-format backup, the files from an in-place tablespace got included in the tar file for the main tablespace, which is wrong but it's not clear that it has any user-visible consequences. In a tar-format backup, the TABLESPACE_MAP option is used, and so we never iterated over pg_tblspc and thus never backed up the in-place tablespaces anywhere at all. To fix this, reverse the changes in that commit, so that when we scan pg_tblspc during a backup, we create tablespaceinfo objects even for in-place tablespaces. We set the field that would normally contain the absolute pathname to the relative path pg_tblspc/${TSOID}, and that's good enough to make basebackup.c happy without any further changes. However, pg_basebackup needs a couple of adjustments to make it work. First, it needs to understand that a relative path for a tablespace means it's an in-place tablespace. Second, it needs to tolerate the situation where restoring the main tablespace tries to create pg_tblspc or a subdirectory and finds that it already exists, because we restore user-defined tablespaces before the main tablespace. --- src/backend/access/transam/xlog.c | 110 ++++++++++-------- src/bin/pg_basebackup/bbstreamer_file.c | 53 ++++++--- src/bin/pg_basebackup/pg_basebackup.c | 22 +++- .../t/011_in_place_tablespace.pl | 40 +++++++ 4 files changed, 159 insertions(+), 66 deletions(-) create mode 100644 src/bin/pg_basebackup/t/011_in_place_tablespace.pl diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 543d4d897a..89529ac158 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -8431,9 +8431,8 @@ do_pg_backup_start(const char *backupidstr, bool fast, List **tablespaces, char fullpath[MAXPGPATH + 10]; char linkpath[MAXPGPATH]; char *relpath = NULL; - int rllen; - StringInfoData escapedpath; char *s; + PGFileType de_type; /* Skip anything that doesn't look like a tablespace */ if (strspn(de->d_name, "0123456789") != strlen(de->d_name)) @@ -8441,66 +8440,83 @@ do_pg_backup_start(const char *backupidstr, bool fast, List **tablespaces, snprintf(fullpath, sizeof(fullpath), "pg_tblspc/%s", de->d_name); - /* - * Skip anything that isn't a symlink/junction. For testing only, - * we sometimes use allow_in_place_tablespaces to create - * directories directly under pg_tblspc, which would fail below. - */ - if (get_dirent_type(fullpath, de, false, ERROR) != PGFILETYPE_LNK) - continue; + de_type = get_dirent_type(fullpath, de, false, ERROR); - rllen = readlink(fullpath, linkpath, sizeof(linkpath)); - if (rllen < 0) + if (de_type == PGFILETYPE_LNK) { - ereport(WARNING, - (errmsg("could not read symbolic link \"%s\": %m", - fullpath))); - continue; + StringInfoData escapedpath; + int rllen; + + rllen = readlink(fullpath, linkpath, sizeof(linkpath)); + if (rllen < 0) + { + ereport(WARNING, + (errmsg("could not read symbolic link \"%s\": %m", + fullpath))); + continue; + } + else if (rllen >= sizeof(linkpath)) + { + ereport(WARNING, + (errmsg("symbolic link \"%s\" target is too long", + fullpath))); + continue; + } + linkpath[rllen] = '\0'; + + /* + * Relpath holds the relative path of the tablespace directory + * when it's located within PGDATA, or NULL if it's located + * elsewhere. + */ + if (rllen > datadirpathlen && + strncmp(linkpath, DataDir, datadirpathlen) == 0 && + IS_DIR_SEP(linkpath[datadirpathlen])) + relpath = pstrdup(linkpath + datadirpathlen + 1); + + /* + * Add a backslash-escaped version of the link path to the + * tablespace map file. + */ + initStringInfo(&escapedpath); + for (s = linkpath; *s; s++) + { + if (*s == '\n' || *s == '\r' || *s == '\\') + appendStringInfoChar(&escapedpath, '\\'); + appendStringInfoChar(&escapedpath, *s); + } + appendStringInfo(tblspcmapfile, "%s %s\n", + de->d_name, escapedpath.data); + pfree(escapedpath.data); } - else if (rllen >= sizeof(linkpath)) + else if (de_type == PGFILETYPE_DIR) { - ereport(WARNING, - (errmsg("symbolic link \"%s\" target is too long", - fullpath))); - continue; + /* + * It's possible to use allow_in_place_tablespaces to create + * directories directly under pg_tblspc, for testing purposes + * only. + * + * In this case, we store a relative path rather than an + * absolute path into the tablespaceinfo. + */ + snprintf(linkpath, sizeof(linkpath), "pg_tblspc/%s", + de->d_name); + relpath = pstrdup(linkpath); } - linkpath[rllen] = '\0'; - - /* - * Build a backslash-escaped version of the link path to include - * in the tablespace map file. - */ - initStringInfo(&escapedpath); - for (s = linkpath; *s; s++) + else { - if (*s == '\n' || *s == '\r' || *s == '\\') - appendStringInfoChar(&escapedpath, '\\'); - appendStringInfoChar(&escapedpath, *s); + /* Skip any other file type that appears here. */ + continue; } - /* - * Relpath holds the relative path of the tablespace directory - * when it's located within PGDATA, or NULL if it's located - * elsewhere. - */ - if (rllen > datadirpathlen && - strncmp(linkpath, DataDir, datadirpathlen) == 0 && - IS_DIR_SEP(linkpath[datadirpathlen])) - relpath = linkpath + datadirpathlen + 1; - ti = palloc(sizeof(tablespaceinfo)); ti->oid = pstrdup(de->d_name); ti->path = pstrdup(linkpath); - ti->rpath = relpath ? pstrdup(relpath) : NULL; + ti->rpath = relpath; ti->size = -1; if (tablespaces) *tablespaces = lappend(*tablespaces, ti); - - appendStringInfo(tblspcmapfile, "%s %s\n", - ti->oid, escapedpath.data); - - pfree(escapedpath.data); } FreeDir(tblspcdir); diff --git a/src/bin/pg_basebackup/bbstreamer_file.c b/src/bin/pg_basebackup/bbstreamer_file.c index df4fb864fe..45f32974ff 100644 --- a/src/bin/pg_basebackup/bbstreamer_file.c +++ b/src/bin/pg_basebackup/bbstreamer_file.c @@ -276,28 +276,49 @@ bbstreamer_extractor_content(bbstreamer *streamer, bbstreamer_member *member, } } +/* + * Should we tolerate an already-existing directory? + * + * When streaming WAL, pg_wal (or pg_xlog for pre-9.6 clusters) will have been + * created by the wal receiver process. Also, when the WAL directory location + * was specified, pg_wal (or pg_xlog) has already been created as a symbolic + * link before starting the actual backup. So just ignore creation failures + * on related directories. + * + * If in-place tablespaces are used, pg_tblspc and subdirectories may already + * exist when we get here. So tolerate that case, too. + */ +static bool +should_allow_existing_directory(const char *pathname) +{ + const char *filename = last_dir_separator(pathname) + 1; + + if (strcmp(filename, "pg_wal") == 0 || + strcmp(filename, "pg_xlog") == 0 || + strcmp(filename, "archive_status") == 0 || + strcmp(filename, "pg_tblspc") == 0) + return true; + + if (strspn(filename, "0123456789") == strlen(filename)) + { + const char *pg_tblspc = strstr(pathname, "/pg_tblspc/"); + + return pg_tblspc != NULL && pg_tblspc + 11 == filename; + } + + return false; +} + /* * Create a directory. */ static void extract_directory(const char *filename, mode_t mode) { - if (mkdir(filename, pg_dir_create_mode) != 0) - { - /* - * When streaming WAL, pg_wal (or pg_xlog for pre-9.6 clusters) will - * have been created by the wal receiver process. Also, when the WAL - * directory location was specified, pg_wal (or pg_xlog) has already - * been created as a symbolic link before starting the actual backup. - * So just ignore creation failures on related directories. - */ - if (!((pg_str_endswith(filename, "/pg_wal") || - pg_str_endswith(filename, "/pg_xlog") || - pg_str_endswith(filename, "/archive_status")) && - errno == EEXIST)) - pg_fatal("could not create directory \"%s\": %m", - filename); - } + if (mkdir(filename, pg_dir_create_mode) != 0 && + (errno != EEXIST || !should_allow_existing_directory(filename))) + pg_fatal("could not create directory \"%s\": %m", + filename); #ifndef WIN32 if (chmod(filename, mode)) diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 948b859b56..ba471f898c 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -1122,9 +1122,17 @@ CreateBackupStreamer(char *archive_name, char *spclocation, * other tablespaces will be written to the directory where they're * located on the server, after applying any user-specified tablespace * mappings. + * + * In the case of an in-place tablespace, spclocation will be a + * relative path. We just convert it to an absolute path by prepending + * basedir. */ - directory = spclocation == NULL ? basedir - : get_tablespace_mapping(spclocation); + if (spclocation == NULL) + directory = basedir; + else if (!is_absolute_path(spclocation)) + directory = psprintf("%s/%s", basedir, spclocation); + else + directory = get_tablespace_mapping(spclocation); streamer = bbstreamer_extractor_new(directory, get_tablespace_mapping, progress_update_filename); @@ -1955,7 +1963,15 @@ BaseBackup(char *compression_algorithm, char *compression_detail, */ if (backup_target == NULL && format == 'p' && !PQgetisnull(res, i, 1)) { - char *path = unconstify(char *, get_tablespace_mapping(PQgetvalue(res, i, 1))); + char *path = PQgetvalue(res, i, 1); + + if (is_absolute_path(path)) + path = unconstify(char *, get_tablespace_mapping(path)); + else + { + /* This is an in-place tablespace, so prepend basedir. */ + path = psprintf("%s/%s", basedir, path); + } verify_dir_is_empty_or_create(path, &made_tablespace_dirs, &found_tablespace_dirs); } diff --git a/src/bin/pg_basebackup/t/011_in_place_tablespace.pl b/src/bin/pg_basebackup/t/011_in_place_tablespace.pl new file mode 100644 index 0000000000..d58696e8f9 --- /dev/null +++ b/src/bin/pg_basebackup/t/011_in_place_tablespace.pl @@ -0,0 +1,40 @@ +# Copyright (c) 2021-2023, PostgreSQL Global Development Group + +use strict; +use warnings; +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + +my $tempdir = PostgreSQL::Test::Utils::tempdir; + +# For nearly all pg_basebackup invocations some options should be specified, +# to keep test times reasonable. Using @pg_basebackup_defs as the first +# element of the array passed to IPC::Run interpolate the array (as it is +# not a reference to an array)... +my @pg_basebackup_defs = ('pg_basebackup', '--no-sync', '-cfast'); + +# Set up an instance. +my $node = PostgreSQL::Test::Cluster->new('main'); +$node->init('allows_streaming' => 1); +$node->start(); + +# Create an in-place tablespace. +$node->safe_psql('postgres', <<EOM); +SET allow_in_place_tablespaces = on; +CREATE TABLESPACE inplace LOCATION ''; +EOM + +# Back it up. +my $backupdir = $tempdir . '/backup'; +$node->command_ok( + [ @pg_basebackup_defs, '-D', $backupdir, '-Ft', '-X', 'none' ], + 'pg_basebackup runs'); + +# Make sure we got base.tar and one tablespace. +ok(-f "$backupdir/base.tar", 'backup tar was created'); +my @tblspc_tars = glob "$backupdir/[0-9]*.tar"; +is(scalar(@tblspc_tars), 1, 'one tablespace tar was created'); + +# All good. +done_testing(); -- 2.39.2
From 92fc68d23d9e53c1b3a92de776a01945e0a687c5 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Tue, 28 Mar 2023 15:12:56 +1300 Subject: [PATCH 2/2] fixup --- src/bin/pg_basebackup/meson.build | 1 + .../pg_basebackup/t/011_in_place_tablespace.pl | 17 ++++++++++++++--- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/src/bin/pg_basebackup/meson.build b/src/bin/pg_basebackup/meson.build index c684622bfb..c426173db3 100644 --- a/src/bin/pg_basebackup/meson.build +++ b/src/bin/pg_basebackup/meson.build @@ -86,6 +86,7 @@ tests += { }, 'tests': [ 't/010_pg_basebackup.pl', + 't/011_in_place_tablespace.pl', 't/020_pg_receivewal.pl', 't/030_pg_recvlogical.pl', ], diff --git a/src/bin/pg_basebackup/t/011_in_place_tablespace.pl b/src/bin/pg_basebackup/t/011_in_place_tablespace.pl index d58696e8f9..2ea70cbd3c 100644 --- a/src/bin/pg_basebackup/t/011_in_place_tablespace.pl +++ b/src/bin/pg_basebackup/t/011_in_place_tablespace.pl @@ -25,16 +25,27 @@ SET allow_in_place_tablespaces = on; CREATE TABLESPACE inplace LOCATION ''; EOM -# Back it up. -my $backupdir = $tempdir . '/backup'; +# Back it up with tar format. +my $backupdir = $tempdir . '/backup-tar'; $node->command_ok( [ @pg_basebackup_defs, '-D', $backupdir, '-Ft', '-X', 'none' ], - 'pg_basebackup runs'); + 'pg_basebackup runs with tar format'); # Make sure we got base.tar and one tablespace. ok(-f "$backupdir/base.tar", 'backup tar was created'); my @tblspc_tars = glob "$backupdir/[0-9]*.tar"; is(scalar(@tblspc_tars), 1, 'one tablespace tar was created'); +# Back it up with plain format. +$backupdir = $tempdir . '/backup-plain'; +$node->command_ok( + [ @pg_basebackup_defs, '-D', $backupdir, '-Fp', '-X', 'none' ], + 'pg_basebackup runs with plain format'); + +# Make sure we got an in-place tablespace directly in the PGDATA. +my @tblspc_dirs = glob "$backupdir/pg_tblspc/[0-9]*"; +is(scalar(@tblspc_dirs), 1, 'one tablespace was created'); +ok(-d $tblspc_dirs[0], 'it is a directory'); + # All good. done_testing(); -- 2.39.2