Hi,

I was looking to commit this, but the changes I made ended up being
pretty large.

Here's what I changed in the attached:
- split GUC_UNIT_BYTE into a separate commit, squashed rest
- renamed GUC_UNIT_BYT to GUC_UNIT_BYTE, don't see why we'd have such a
  weird abbreviation?
- bumped control file version, otherwise things wouldn't work correctly
- wal_segment_size text still said "Shows the number of pages per write
  ahead log segment."
- I still feel strongly that exporting XLogSegSize, which previously was
  a macro and now is an integer variable, is a bad idea. Hence I've
  renamed it to wal_segment_size.
- There still were comments referencing XLOG_SEG_SIZE
- IsPowerOf2 regarded 0 as a valid power of two
- ConvertToXSegs() depended on a variable not passed as arg, bad idea.
- As previously mentioned, I don't think it's ok to rely on vars like
  XLogSegSize to be defined both in backend and frontend code.
- I don't think XLogReader can rely on XLogSegSize, needs to be
  parametrized.
- pg_rewind exported another copy of extern int XLogSegSize
- streamutil.h had an extern uint32 WalSegsz; but used
  RetrieveXlogSegSize, that seems needlessly different
- moved wal_segment_size (aka XLogSegSize) to xlog.h
- pg_standby included xlogreader, not sure why?
- MaxSegmentsPerLogFile still had a conflicting naming scheme
- you'd included "sys/stat.h", that's not really appropriate for system
  headers, should be <sys/stat.h> (and then grouped w/ rest)
- pg_controldata's warning about an invalid segsize missed newlines

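To make two of the items above concrete (the IsPowerOf2 fix and passing the segment size to ConvertToXSegs() explicitly), here is a minimal standalone sketch. It is not code from the attached patch: the names is_power_of_2, is_valid_wal_seg_size and convert_mb_to_segs are hypothetical stand-ins, and the 1MB to 1GB bounds are taken from the error message pg_standby prints in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true only for 1, 2, 4, 8, ...
 * Without the x > 0 test, 0 would pass, because 0 & (0 - 1) == 0.
 */
static inline bool
is_power_of_2(unsigned int x)
{
	return x > 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical validity check: a power of two between 1MB and 1GB,
 * matching the range named in the pg_standby error message.
 */
static inline bool
is_valid_wal_seg_size(unsigned int size)
{
	return is_power_of_2(size) &&
		size >= 1024u * 1024u &&
		size <= 1024u * 1024u * 1024u;
}

/*
 * Hypothetical function form of a size-in-MB to segment-count
 * conversion. The segment size is an explicit argument, so the result
 * never depends on a global that the caller did not pass in.
 */
static inline int
convert_mb_to_segs(int size_mb, int segsize_bytes)
{
	assert(is_valid_wal_seg_size((unsigned int) segsize_bytes));
	return size_mb / (segsize_bytes / (1024 * 1024));
}
```

With a 16MB segment size, convert_mb_to_segs(1024, 16 * 1024 * 1024) gives 64 segments; a segment size of 0 or 3MB is rejected by is_valid_wal_seg_size().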
Unresolved:
- this needs some new performance tests, the number of added
  instructions isn't trivial. Don't think there's anything, but ...
- read through it again, check long lines
- pg_standby's RetrieveWALSegSize() does too much for its name. It seems
  quite weird that a function named that way has the section below
  "/* check if clean up is necessary */"
- the way you redid the ReadControlFile() invocation doesn't quite seem
  right. Consider what happens if XLOGbuffers isn't -1 - then we
  wouldn't read the control file, but you unconditionally copy it in
  XLOGShmemInit(). I think we instead should introduce something like
  XLOGPreShmemInit() that reads the control file unless in bootstrap
  mode. Then get rid of the second ReadControlFile() already present.
- In pg_resetwal.c:ReadControlFile() we ignore the file contents if
  there's an invalid segment size, but accept the contents as guessed if
  there's a crc failure - that seems a bit weird?
- verify EXEC_BACKEND does the right thing
- not this commit/patch, but XLogReadDetermineTimeline() could really
  use some simplifying of repetitive expressions
- XLOGShmemInit shouldn't memcpy to temp_cfile and such, why not just
  save previous pointer in a local variable?
- could you fill in the Reviewed-By: line in the commit message?

Running out of concentration / time now.

- Andres

From 15d16b8e2146b0491e8b64e780227424162dd784 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 5 Sep 2017 13:26:55 -0700
Subject: [PATCH 1/2] Introduce BYTES unit for GUCs.

This is already useful for track_activity_query_size, and will further
be used in followup commits.

Author: Beena Emerson
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/caog9apeu8bxvwbxkoo9j7zpm76task_vfmeeicejwhmmsli...@mail.gmail.com
---
 src/backend/utils/misc/guc.c | 14 +++++++++-----
 src/include/utils/guc.h | 1 +
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 246fea8693..25da06fffc 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -722,6 +722,11 @@ static const char *memory_units_hint = gettext_noop("Valid units for this parame
 
 static const unit_conversion memory_unit_conversion_table[] =
 {
+	{"GB", GUC_UNIT_BYTE, 1024 * 1024 * 1024},
+	{"MB", GUC_UNIT_BYTE, 1024 * 1024},
+	{"kB", GUC_UNIT_BYTE, 1024},
+	{"B", GUC_UNIT_BYTE, 1},
+
 	{"TB", GUC_UNIT_KB, 1024 * 1024 * 1024},
 	{"GB", GUC_UNIT_KB, 1024 * 1024},
 	{"MB", GUC_UNIT_KB, 1024},
@@ -2863,11 +2868,7 @@ static struct config_int ConfigureNamesInt[] =
 		{"track_activity_query_size", PGC_POSTMASTER, RESOURCES_MEM,
 			gettext_noop("Sets the size reserved for pg_stat_activity.query, in bytes."),
 			NULL,
-
-			/*
-			 * There is no _bytes_ unit, so the user can't supply units for
-			 * this.
-			 */
+			GUC_UNIT_BYTE
 		},
 		&pgstat_track_activity_query_size,
 		1024, 100, 102400,
@@ -8113,6 +8114,9 @@ GetConfigOptionByNum(int varnum, const char **values, bool *noshow)
 				{
 					switch (conf->flags & (GUC_UNIT_MEMORY | GUC_UNIT_TIME))
 					{
+						case GUC_UNIT_BYTE:
+							values[2] = "B";
+							break;
 						case GUC_UNIT_KB:
 							values[2] = "kB";
 							break;
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index c1870d2130..467125a09d 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -219,6 +219,7 @@ typedef enum
 #define GUC_UNIT_BLOCKS   0x2000 /* value is in blocks */
 #define GUC_UNIT_XBLOCKS  0x3000 /* value is in xlog blocks */
 #define GUC_UNIT_MB       0x4000 /* value is in megabytes */
+#define GUC_UNIT_BYTE     0x8000 /* value is in bytes */
 #define GUC_UNIT_MEMORY   0xF000 /* mask for size-related units */
 
 #define GUC_UNIT_MS      0x10000 /* value is in milliseconds */
-- 
2.14.1.2.g4274c698f4.dirty

From 1706c8dc7c038b732a39bc947d0eee95c34291ae Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 5 Sep 2017 19:03:48 -0700
Subject: [PATCH 2/2] Make wal segment size configurable at initdb time.

Author: Beena Emerson
Reviewed-By: ..., Andres Freund, ...
Discussion: https://postgr.es/m/caog9apeacq--1iekbhfzxsqpw_ylmepaa4hndny5+zulpt8...@mail.gmail.com
---
 configure | 54 -----
 configure.in | 31 ---
 contrib/pg_standby/pg_standby.c | 90 ++++++--
 doc/src/sgml/backup.sgml | 2 +-
 doc/src/sgml/installation.sgml | 14 --
 doc/src/sgml/ref/initdb.sgml | 15 ++
 doc/src/sgml/wal.sgml | 13 +-
 src/backend/access/transam/twophase.c | 3 +-
 src/backend/access/transam/xlog.c | 288 +++++++++++++++---------
 src/backend/access/transam/xlogarchive.c | 14 +-
 src/backend/access/transam/xlogfuncs.c | 10 +-
 src/backend/access/transam/xlogreader.c | 32 +--
 src/backend/access/transam/xlogutils.c | 35 +--
 src/backend/bootstrap/bootstrap.c | 10 +-
 src/backend/postmaster/checkpointer.c | 5 +-
 src/backend/postmaster/postmaster.c | 4 +
 src/backend/replication/basebackup.c | 32 +--
 src/backend/replication/logical/logical.c | 2 +-
 src/backend/replication/logical/reorderbuffer.c | 19 +-
 src/backend/replication/slot.c | 2 +-
 src/backend/replication/walreceiver.c | 14 +-
 src/backend/replication/walreceiverfuncs.c | 4 +-
 src/backend/replication/walsender.c | 16 +-
 src/backend/utils/misc/guc.c | 20 +-
 src/backend/utils/misc/pg_controldata.c | 5 +-
 src/backend/utils/misc/postgresql.conf.sample | 2 +-
 src/bin/initdb/initdb.c | 61 ++++-
 src/bin/pg_basebackup/pg_basebackup.c | 7 +-
 src/bin/pg_basebackup/pg_receivewal.c | 16 +-
 src/bin/pg_basebackup/receivelog.c | 35 +--
 src/bin/pg_basebackup/streamutil.c | 76 +++++++
 src/bin/pg_basebackup/streamutil.h | 2 +
 src/bin/pg_controldata/pg_controldata.c | 15 +-
 src/bin/pg_resetwal/pg_resetwal.c | 54 +++--
 src/bin/pg_rewind/parsexlog.c | 24 +-
 src/bin/pg_rewind/pg_rewind.c | 12 +-
 src/bin/pg_rewind/pg_rewind.h | 1 +
src/bin/pg_test_fsync/pg_test_fsync.c | 6 +- src/bin/pg_waldump/pg_waldump.c | 250 ++++++++++++++------ src/include/access/xlog.h | 3 + src/include/access/xlog_internal.h | 76 ++++--- src/include/access/xlogreader.h | 8 +- src/include/catalog/pg_control.h | 2 +- src/include/pg_config.h.in | 5 - src/include/pg_config_manual.h | 6 + src/tools/msvc/Solution.pm | 2 - 46 files changed, 893 insertions(+), 504 deletions(-) diff --git a/configure b/configure index 0d76e5ea42..5c38149a3d 100755 --- a/configure +++ b/configure @@ -821,7 +821,6 @@ enable_tap_tests with_blocksize with_segsize with_wal_blocksize -with_wal_segsize with_CC enable_depend enable_cassert @@ -1518,8 +1517,6 @@ Optional Packages: --with-segsize=SEGSIZE set table segment size in GB [1] --with-wal-blocksize=BLOCKSIZE set WAL block size in kB [8] - --with-wal-segsize=SEGSIZE - set WAL segment size in MB [16] --with-CC=CMD set compiler (deprecated) --with-icu build with ICU support --with-tcl build Tcl modules (PL/Tcl) @@ -3733,57 +3730,6 @@ cat >>confdefs.h <<_ACEOF _ACEOF -# -# WAL segment size -# -{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for WAL segment size" >&5 -$as_echo_n "checking for WAL segment size... " >&6; } - - - -# Check whether --with-wal-segsize was given. -if test "${with_wal_segsize+set}" = set; then : - withval=$with_wal_segsize; - case $withval in - yes) - as_fn_error $? "argument required for --with-wal-segsize option" "$LINENO" 5 - ;; - no) - as_fn_error $? "argument required for --with-wal-segsize option" "$LINENO" 5 - ;; - *) - wal_segsize=$withval - ;; - esac - -else - wal_segsize=16 -fi - - -case ${wal_segsize} in - 1) ;; - 2) ;; - 4) ;; - 8) ;; - 16) ;; - 32) ;; - 64) ;; - 128) ;; - 256) ;; - 512) ;; - 1024) ;; - *) as_fn_error $? "Invalid WAL segment size. Allowed values are 1,2,4,8,16,32,64,128,256,512,1024." 
"$LINENO" 5 -esac -{ $as_echo "$as_me:${as_lineno-$LINENO}: result: ${wal_segsize}MB" >&5 -$as_echo "${wal_segsize}MB" >&6; } - - -cat >>confdefs.h <<_ACEOF -#define XLOG_SEG_SIZE (${wal_segsize} * 1024 * 1024) -_ACEOF - - # # C compiler # diff --git a/configure.in b/configure.in index bdc41b071f..176b29a792 100644 --- a/configure.in +++ b/configure.in @@ -343,37 +343,6 @@ AC_DEFINE_UNQUOTED([XLOG_BLCKSZ], ${XLOG_BLCKSZ}, [ Changing XLOG_BLCKSZ requires an initdb. ]) -# -# WAL segment size -# -AC_MSG_CHECKING([for WAL segment size]) -PGAC_ARG_REQ(with, wal-segsize, [SEGSIZE], [set WAL segment size in MB [16]], - [wal_segsize=$withval], - [wal_segsize=16]) -case ${wal_segsize} in - 1) ;; - 2) ;; - 4) ;; - 8) ;; - 16) ;; - 32) ;; - 64) ;; - 128) ;; - 256) ;; - 512) ;; - 1024) ;; - *) AC_MSG_ERROR([Invalid WAL segment size. Allowed values are 1,2,4,8,16,32,64,128,256,512,1024.]) -esac -AC_MSG_RESULT([${wal_segsize}MB]) - -AC_DEFINE_UNQUOTED([XLOG_SEG_SIZE], [(${wal_segsize} * 1024 * 1024)], [ - XLOG_SEG_SIZE is the size of a single WAL file. This must be a power of 2 - and larger than XLOG_BLCKSZ (preferably, a great deal larger than - XLOG_BLCKSZ). - - Changing XLOG_SEG_SIZE requires an initdb. -]) - # # C compiler # diff --git a/contrib/pg_standby/pg_standby.c b/contrib/pg_standby/pg_standby.c index d7fa2a80c6..0f90075990 100644 --- a/contrib/pg_standby/pg_standby.c +++ b/contrib/pg_standby/pg_standby.c @@ -36,6 +36,8 @@ const char *progname; +int WalSegsz; + /* Options and defaults */ int sleeptime = 5; /* amount of time to sleep between file checks */ int waittime = -1; /* how long we have been waiting, -1 no wait @@ -100,6 +102,72 @@ int nextWALFileType; struct stat stat_buf; +static bool SetWALFileNameForCleanup(void); + +/* Set wal segment size from the WAL file specified by WALFilePath */ +static bool +RetrieveWALSegSize(void) +{ + bool ret_val = false; + int fd; + char *buf = (char *) malloc(XLOG_BLCKSZ); + + /* Already set a valid WalSegsz? 
*/ + if (IsValidWalSegSize(WalSegsz)) + return true; + + if ((fd = open(WALFilePath, O_RDWR, 0)) < 0) + { + fprintf(stderr, "%s: couldn't open WAL file \"%s\"\n", + progname, WALFilePath); + return false; + } + if (read(fd, buf, XLOG_BLCKSZ) == XLOG_BLCKSZ) + { + XLogPageHeader hdr = (XLogPageHeader) buf; + XLogLongPageHeader longhdr = (XLogLongPageHeader) hdr; + + WalSegsz = longhdr->xlp_seg_size; + + if (IsValidWalSegSize(WalSegsz)) + { + /* successfully retrieved WAL segment size */ + ret_val = true; + + /* check if clean up is necessary */ + need_cleanup = SetWALFileNameForCleanup(); + if (debug) + { + fprintf(stderr, + _("WAL segment size: %d \n"), WalSegsz); + fprintf(stderr, "Keep archive history: "); + + if (need_cleanup) + fprintf(stderr, "%s and later\n", + exclusiveCleanupFileName); + else + fprintf(stderr, "no cleanup required\n"); + } + } + else + fprintf(stderr, + "%s: WAL segment size must be a power of two between 1MB and 1GB, but the WAL file header specifies %d bytes\n", + progname, WalSegsz); + close(fd); + } + else + { + if (errno != 0) + fprintf(stderr, "could not read file \"%s\": %s", + WALFilePath, strerror(errno)); + else + fprintf(stderr, "not enough data in file \"%s\"", WALFilePath); + } + + fflush(stderr); + return ret_val; +} + /* ===================================================================== * * Customizable section @@ -184,7 +252,9 @@ CustomizableNextWALFileReady(void) nextWALFileType = XLOG_BACKUP_LABEL; return true; } - else if (stat_buf.st_size == XLOG_SEG_SIZE) + else if (!RetrieveWALSegSize()) + return false; + else if (stat_buf.st_size == WalSegsz) { #ifdef WIN32 @@ -204,7 +274,7 @@ CustomizableNextWALFileReady(void) /* * If still too small, wait until it is the correct size */ - if (stat_buf.st_size > XLOG_SEG_SIZE) + if (stat_buf.st_size > WalSegsz) { if (debug) { @@ -218,8 +288,6 @@ CustomizableNextWALFileReady(void) return false; } -#define MaxSegmentsPerLogFile ( 0xFFFFFFFF / XLOG_SEG_SIZE ) - static void 
CustomizableCleanupPriorWALFiles(void) { @@ -315,6 +383,7 @@ SetWALFileNameForCleanup(void) uint32 log_diff = 0, seg_diff = 0; bool cleanup = false; + int max_segments_per_logfile = (0xFFFFFFFF / WalSegsz); if (restartWALFileName) { @@ -336,12 +405,12 @@ SetWALFileNameForCleanup(void) sscanf(nextWALFileName, "%08X%08X%08X", &tli, &log, &seg); if (tli > 0 && seg > 0) { - log_diff = keepfiles / MaxSegmentsPerLogFile; - seg_diff = keepfiles % MaxSegmentsPerLogFile; + log_diff = keepfiles / max_segments_per_logfile; + seg_diff = keepfiles % max_segments_per_logfile; if (seg_diff > seg) { log_diff++; - seg = MaxSegmentsPerLogFile - (seg_diff - seg); + seg = max_segments_per_logfile - (seg_diff - seg); } else seg -= seg_diff; @@ -708,8 +777,6 @@ main(int argc, char **argv) CustomizableInitialize(); - need_cleanup = SetWALFileNameForCleanup(); - if (debug) { fprintf(stderr, "Trigger file: %s\n", triggerPath ? triggerPath : "<not set>"); @@ -721,11 +788,6 @@ main(int argc, char **argv) fprintf(stderr, "Max wait interval: %d %s\n", maxwaittime, (maxwaittime > 0 ? "seconds" : "forever")); fprintf(stderr, "Command for restore: %s\n", restoreCommand); - fprintf(stderr, "Keep archive history: "); - if (need_cleanup) - fprintf(stderr, "%s and later\n", exclusiveCleanupFileName); - else - fprintf(stderr, "no cleanup required\n"); fflush(stderr); } diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml index 95aeb35507..bd55e8bb77 100644 --- a/doc/src/sgml/backup.sgml +++ b/doc/src/sgml/backup.sgml @@ -562,7 +562,7 @@ tar -cf backup.tar /usr/local/pgsql/data produces an indefinitely long sequence of WAL records. The system physically divides this sequence into WAL <firstterm>segment files</>, which are normally 16MB apiece (although the segment size - can be altered when building <productname>PostgreSQL</>). The segment + can be altered during <application>initdb</>). The segment files are given numeric names that reflect their position in the abstract WAL sequence. 
When not using WAL archiving, the system normally creates just a few segment files and then diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index 12866b4bf7..0f512baafa 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -1058,20 +1058,6 @@ su - postgres </listitem> </varlistentry> - <varlistentry> - <term><option>--with-wal-segsize=<replaceable>SEGSIZE</replaceable></option></term> - <listitem> - <para> - Set the <firstterm>WAL segment size</>, in megabytes. This is - the size of each individual file in the WAL log. It may be useful - to adjust this size to control the granularity of WAL log shipping. - The default size is 16 megabytes. - The value must be a power of 2 between 1 and 1024 (megabytes). - Note that changing this value requires an initdb. - </para> - </listitem> - </varlistentry> - <varlistentry> <term><option>--with-wal-blocksize=<replaceable>BLOCKSIZE</replaceable></option></term> <listitem> diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml index 6efb2e442d..732fecab8e 100644 --- a/doc/src/sgml/ref/initdb.sgml +++ b/doc/src/sgml/ref/initdb.sgml @@ -316,6 +316,21 @@ PostgreSQL documentation </varlistentry> <varlistentry> + <term><option>--wal-segsize=<replaceable>SEGSIZE</replaceable></option></term> + <listitem> + <para> + Set the <firstterm>WAL segment size</>, in megabytes. This is + the size of each individual file in the WAL log. It may be useful + to adjust this size to control the granularity of WAL log shipping. + This option can only be set during initialization, and cannot be + changed later. + The default size is 16 megabytes. + The value must be a power of 2 between 1 and 1024 (megabytes). 
+ </para> + </listitem> + </varlistentry> + + <varlistentry> <term><option>-X <replaceable class="parameter">directory</replaceable></option></term> <term><option>--waldir=<replaceable class="parameter">directory</replaceable></option></term> <listitem> diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml index 940c37b21a..58c13d3ddf 100644 --- a/doc/src/sgml/wal.sgml +++ b/doc/src/sgml/wal.sgml @@ -752,13 +752,12 @@ <acronym>WAL</acronym> logs are stored in the directory <filename>pg_wal</filename> under the data directory, as a set of segment files, normally each 16 MB in size (but the size can be changed - by altering the <option>--with-wal-segsize</> configure option when - building the server). Each segment is divided into pages, normally - 8 kB each (this size can be changed via the <option>--with-wal-blocksize</> - configure option). The log record headers are described in - <filename>access/xlogrecord.h</filename>; the record content is dependent - on the type of event that is being logged. Segment files are given - ever-increasing numbers as names, starting at + by altering the <option>--wal-segsize</> initdb option). Each segment is + divided into pages, normally 8 kB each (this size can be changed via the + <option>--with-wal-blocksize</> configure option). The log record headers + are described in <filename>access/xlogrecord.h</filename>; the record content + is dependent on the type of event that is being logged. Segment files are + given ever-increasing numbers as names, starting at <filename>000000010000000000000000</filename>. The numbers do not wrap, but it will take a very, very long time to exhaust the available stock of numbers. 
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index ba03d9687e..cd3384b88f 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -1299,7 +1299,8 @@ XlogReadTwoPhaseData(XLogRecPtr lsn, char **buf, int *len) XLogReaderState *xlogreader; char *errormsg; - xlogreader = XLogReaderAllocate(&read_local_xlog_page, NULL); + xlogreader = XLogReaderAllocate(wal_segment_size, &read_local_xlog_page, + NULL); if (!xlogreader) ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index df4843f409..b8de8b63f8 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -110,6 +110,8 @@ int wal_retrieve_retry_interval = 5000; bool XLOG_DEBUG = false; #endif +int wal_segment_size = DEFAULT_XLOG_SEG_SIZE; + /* * Number of WAL insertion locks to use. A higher value allows more insertions * to happen concurrently, but adds some CPU overhead to flushing the WAL, @@ -731,14 +733,16 @@ static ControlFileData *ControlFile = NULL; (((recptr) / XLOG_BLCKSZ) % (XLogCtl->XLogCacheBlck + 1)) /* - * These are the number of bytes in a WAL page and segment usable for WAL data. + * These are the number of bytes in a WAL page usable for WAL data. */ #define UsableBytesInPage (XLOG_BLCKSZ - SizeOfXLogShortPHD) -#define UsableBytesInSegment ((XLOG_SEG_SIZE / XLOG_BLCKSZ) * UsableBytesInPage - (SizeOfXLogLongPHD - SizeOfXLogShortPHD)) /* Convert min_wal_size_mb and max wal_size_mb to equivalent segment count */ -#define ConvertToXSegs(x) \ - (x / (XLOG_SEG_SIZE / (1024 * 1024))) +#define ConvertToXSegs(x, segsize) \ + (x / ((segsize) / (1024 * 1024))) + +/* The number of bytes in a WAL segment usable for WAL data. */ +static int UsableBytesInSegment; /* * Private, possibly out-of-date copy of shared LogwrtResult. 
@@ -1137,7 +1141,8 @@ XLogInsertRecord(XLogRecData *rdata, EndPos = StartPos + SizeOfXLogRecord; if (StartPos / XLOG_BLCKSZ != EndPos / XLOG_BLCKSZ) { - if (EndPos % XLOG_SEG_SIZE == EndPos % XLOG_BLCKSZ) + uint64 offset = XLogSegmentOffset(EndPos, wal_segment_size); + if (offset == EndPos % XLOG_BLCKSZ) EndPos += SizeOfXLogLongPHD; else EndPos += SizeOfXLogShortPHD; @@ -1170,7 +1175,7 @@ XLogInsertRecord(XLogRecData *rdata, appendBinaryStringInfo(&recordBuf, rdata->data, rdata->len); if (!debug_reader) - debug_reader = XLogReaderAllocate(NULL, NULL); + debug_reader = XLogReaderAllocate(wal_segment_size, NULL, NULL); if (!debug_reader) { @@ -1296,7 +1301,7 @@ ReserveXLogSwitch(XLogRecPtr *StartPos, XLogRecPtr *EndPos, XLogRecPtr *PrevPtr) startbytepos = Insert->CurrBytePos; ptr = XLogBytePosToEndRecPtr(startbytepos); - if (ptr % XLOG_SEG_SIZE == 0) + if (XLogSegmentOffset(ptr, wal_segment_size) == 0) { SpinLockRelease(&Insert->insertpos_lck); *EndPos = *StartPos = ptr; @@ -1309,8 +1314,8 @@ ReserveXLogSwitch(XLogRecPtr *StartPos, XLogRecPtr *EndPos, XLogRecPtr *PrevPtr) *StartPos = XLogBytePosToRecPtr(startbytepos); *EndPos = XLogBytePosToEndRecPtr(endbytepos); - segleft = XLOG_SEG_SIZE - ((*EndPos) % XLOG_SEG_SIZE); - if (segleft != XLOG_SEG_SIZE) + segleft = wal_segment_size - XLogSegmentOffset(*EndPos, wal_segment_size); + if (segleft != wal_segment_size) { /* consume the rest of the segment */ *EndPos += segleft; @@ -1323,7 +1328,7 @@ ReserveXLogSwitch(XLogRecPtr *StartPos, XLogRecPtr *EndPos, XLogRecPtr *PrevPtr) *PrevPtr = XLogBytePosToRecPtr(prevbytepos); - Assert((*EndPos) % XLOG_SEG_SIZE == 0); + Assert(XLogSegmentOffset(*EndPos, wal_segment_size) == 0); Assert(XLogRecPtrToBytePos(*EndPos) == endbytepos); Assert(XLogRecPtrToBytePos(*StartPos) == startbytepos); Assert(XLogRecPtrToBytePos(*PrevPtr) == prevbytepos); @@ -1501,7 +1506,7 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata, pagehdr->xlp_info |= XLP_FIRST_IS_CONTRECORD; /* 
skip over the page header */ - if (CurrPos % XLogSegSize == 0) + if (XLogSegmentOffset(CurrPos, wal_segment_size) == 0) { CurrPos += SizeOfXLogLongPHD; currpos += SizeOfXLogLongPHD; @@ -1532,16 +1537,16 @@ CopyXLogRecordToWAL(int write_len, bool isLogSwitch, XLogRecData *rdata, * allocated and zeroed in the WAL buffers so that when the caller (or * someone else) does XLogWrite(), it can really write out all the zeros. */ - if (isLogSwitch && CurrPos % XLOG_SEG_SIZE != 0) + if (isLogSwitch && XLogSegmentOffset(CurrPos, wal_segment_size) != 0) { /* An xlog-switch record doesn't contain any data besides the header */ Assert(write_len == SizeOfXLogRecord); /* * We do this one page at a time, to make sure we don't deadlock - * against ourselves if wal_buffers < XLOG_SEG_SIZE. + * against ourselves if wal_buffers < wal_segment_size. */ - Assert(EndPos % XLogSegSize == 0); + Assert(XLogSegmentOffset(EndPos, wal_segment_size) == 0); /* Use up all the remaining space on the first page */ CurrPos += freespace; @@ -1866,10 +1871,10 @@ GetXLogBuffer(XLogRecPtr ptr) * the page header. 
*/ if (ptr % XLOG_BLCKSZ == SizeOfXLogShortPHD && - ptr % XLOG_SEG_SIZE > XLOG_BLCKSZ) + XLogSegmentOffset(ptr, wal_segment_size) > XLOG_BLCKSZ) initializedUpto = ptr - SizeOfXLogShortPHD; else if (ptr % XLOG_BLCKSZ == SizeOfXLogLongPHD && - ptr % XLOG_SEG_SIZE < XLOG_BLCKSZ) + XLogSegmentOffset(ptr, wal_segment_size) < XLOG_BLCKSZ) initializedUpto = ptr - SizeOfXLogLongPHD; else initializedUpto = ptr; @@ -1939,7 +1944,7 @@ XLogBytePosToRecPtr(uint64 bytepos) seg_offset += fullpages * XLOG_BLCKSZ + bytesleft + SizeOfXLogShortPHD; } - XLogSegNoOffsetToRecPtr(fullsegs, seg_offset, result); + XLogSegNoOffsetToRecPtr(fullsegs, seg_offset, result, wal_segment_size); return result; } @@ -1985,7 +1990,7 @@ XLogBytePosToEndRecPtr(uint64 bytepos) seg_offset += fullpages * XLOG_BLCKSZ + bytesleft + SizeOfXLogShortPHD; } - XLogSegNoOffsetToRecPtr(fullsegs, seg_offset, result); + XLogSegNoOffsetToRecPtr(fullsegs, seg_offset, result, wal_segment_size); return result; } @@ -2001,9 +2006,9 @@ XLogRecPtrToBytePos(XLogRecPtr ptr) uint32 offset; uint64 result; - XLByteToSeg(ptr, fullsegs); + XLByteToSeg(ptr, fullsegs, wal_segment_size); - fullpages = (ptr % XLOG_SEG_SIZE) / XLOG_BLCKSZ; + fullpages = (XLogSegmentOffset(ptr, wal_segment_size)) / XLOG_BLCKSZ; offset = ptr % XLOG_BLCKSZ; if (fullpages == 0) @@ -2168,12 +2173,12 @@ AdvanceXLInsertBuffer(XLogRecPtr upto, bool opportunistic) /* * If first page of an XLOG segment file, make it a long header. */ - if ((NewPage->xlp_pageaddr % XLogSegSize) == 0) + if ((XLogSegmentOffset(NewPage->xlp_pageaddr, wal_segment_size)) == 0) { XLogLongPageHeader NewLongPage = (XLogLongPageHeader) NewPage; NewLongPage->xlp_sysid = ControlFile->system_identifier; - NewLongPage->xlp_seg_size = XLogSegSize; + NewLongPage->xlp_seg_size = wal_segment_size; NewLongPage->xlp_xlog_blcksz = XLOG_BLCKSZ; NewPage->xlp_info |= XLP_LONG_HEADER; } @@ -2220,7 +2225,8 @@ CalculateCheckpointSegments(void) * number of segments consumed between checkpoints. 
*------- */ - target = (double) ConvertToXSegs(max_wal_size_mb) / (2.0 + CheckPointCompletionTarget); + target = (double) ConvertToXSegs(max_wal_size_mb, wal_segment_size) / + (2.0 + CheckPointCompletionTarget); /* round down */ CheckPointSegments = (int) target; @@ -2260,8 +2266,10 @@ XLOGfileslop(XLogRecPtr PriorRedoPtr) * correspond to. Always recycle enough segments to meet the minimum, and * remove enough segments to stay below the maximum. */ - minSegNo = PriorRedoPtr / XLOG_SEG_SIZE + ConvertToXSegs(min_wal_size_mb) - 1; - maxSegNo = PriorRedoPtr / XLOG_SEG_SIZE + ConvertToXSegs(max_wal_size_mb) - 1; + minSegNo = PriorRedoPtr / wal_segment_size + + ConvertToXSegs(min_wal_size_mb, wal_segment_size) - 1; + maxSegNo = PriorRedoPtr / wal_segment_size + + ConvertToXSegs(max_wal_size_mb, wal_segment_size) - 1; /* * Between those limits, recycle enough segments to get us through to the @@ -2290,7 +2298,8 @@ XLOGfileslop(XLogRecPtr PriorRedoPtr) /* add 10% for good measure. */ distance *= 1.10; - recycleSegNo = (XLogSegNo) ceil(((double) PriorRedoPtr + distance) / XLOG_SEG_SIZE); + recycleSegNo = (XLogSegNo) ceil(((double) PriorRedoPtr + distance) / + wal_segment_size); if (recycleSegNo < minSegNo) recycleSegNo = minSegNo; @@ -2314,7 +2323,7 @@ XLogCheckpointNeeded(XLogSegNo new_segno) { XLogSegNo old_segno; - XLByteToSeg(RedoRecPtr, old_segno); + XLByteToSeg(RedoRecPtr, old_segno, wal_segment_size); if (new_segno >= old_segno + (uint64) (CheckPointSegments - 1)) return true; @@ -2392,7 +2401,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible) LogwrtResult.Write = EndPtr; ispartialpage = WriteRqst.Write < LogwrtResult.Write; - if (!XLByteInPrevSeg(LogwrtResult.Write, openLogSegNo)) + if (!XLByteInPrevSeg(LogwrtResult.Write, openLogSegNo, + wal_segment_size)) { /* * Switch to new logfile segment. 
We cannot have any pending @@ -2401,7 +2411,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible) Assert(npages == 0); if (openLogFile >= 0) XLogFileClose(); - XLByteToPrevSeg(LogwrtResult.Write, openLogSegNo); + XLByteToPrevSeg(LogwrtResult.Write, openLogSegNo, + wal_segment_size); /* create/use new log file */ use_existent = true; @@ -2412,7 +2423,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible) /* Make sure we have the current logfile open */ if (openLogFile < 0) { - XLByteToPrevSeg(LogwrtResult.Write, openLogSegNo); + XLByteToPrevSeg(LogwrtResult.Write, openLogSegNo, + wal_segment_size); openLogFile = XLogFileOpen(openLogSegNo); openLogOff = 0; } @@ -2422,7 +2434,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible) { /* first of group */ startidx = curridx; - startoffset = (LogwrtResult.Write - XLOG_BLCKSZ) % XLogSegSize; + startoffset = XLogSegmentOffset(LogwrtResult.Write - XLOG_BLCKSZ, + wal_segment_size); } npages++; @@ -2435,7 +2448,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible) last_iteration = WriteRqst.Write <= LogwrtResult.Write; finishing_seg = !ispartialpage && - (startoffset + npages * XLOG_BLCKSZ) >= XLogSegSize; + (startoffset + npages * XLOG_BLCKSZ) >= wal_segment_size; if (last_iteration || curridx == XLogCtl->XLogCacheBlck || @@ -2562,11 +2575,13 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible) sync_method != SYNC_METHOD_OPEN_DSYNC) { if (openLogFile >= 0 && - !XLByteInPrevSeg(LogwrtResult.Write, openLogSegNo)) + !XLByteInPrevSeg(LogwrtResult.Write, openLogSegNo, + wal_segment_size)) XLogFileClose(); if (openLogFile < 0) { - XLByteToPrevSeg(LogwrtResult.Write, openLogSegNo); + XLByteToPrevSeg(LogwrtResult.Write, openLogSegNo, + wal_segment_size); openLogFile = XLogFileOpen(openLogSegNo); openLogOff = 0; } @@ -2982,7 +2997,8 @@ XLogBackgroundFlush(void) { if (openLogFile >= 0) { - if (!XLByteInPrevSeg(LogwrtResult.Write, openLogSegNo)) + if (!XLByteInPrevSeg(LogwrtResult.Write, openLogSegNo, + wal_segment_size)) { XLogFileClose(); } 
@@ -3161,7 +3177,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock) int fd; int nbytes; - XLogFilePath(path, ThisTimeLineID, logsegno); + XLogFilePath(path, ThisTimeLineID, logsegno, wal_segment_size); /* * Try to use existent file (checkpoint maker may have created it already) @@ -3215,7 +3231,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock) */ zbuffer = (char *) MAXALIGN(zbuffer_raw); memset(zbuffer, 0, XLOG_BLCKSZ); - for (nbytes = 0; nbytes < XLogSegSize; nbytes += XLOG_BLCKSZ) + for (nbytes = 0; nbytes < wal_segment_size; nbytes += XLOG_BLCKSZ) { errno = 0; pgstat_report_wait_start(WAIT_EVENT_WAL_INIT_WRITE); @@ -3332,7 +3348,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno, /* * Open the source file */ - XLogFilePath(path, srcTLI, srcsegno); + XLogFilePath(path, srcTLI, srcsegno, wal_segment_size); srcfd = OpenTransientFile(path, O_RDONLY | PG_BINARY, 0); if (srcfd < 0) ereport(ERROR, @@ -3357,7 +3373,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno, /* * Do the data copying. */ - for (nbytes = 0; nbytes < XLogSegSize; nbytes += sizeof(buffer)) + for (nbytes = 0; nbytes < wal_segment_size; nbytes += sizeof(buffer)) { int nread; @@ -3467,7 +3483,7 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath, char path[MAXPGPATH]; struct stat stat_buf; - XLogFilePath(path, ThisTimeLineID, *segno); + XLogFilePath(path, ThisTimeLineID, *segno, wal_segment_size); /* * We want to be sure that only one process does this at a time. 
@@ -3493,7 +3509,7 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath, return false; } (*segno)++; - XLogFilePath(path, ThisTimeLineID, *segno); + XLogFilePath(path, ThisTimeLineID, *segno, wal_segment_size); } } @@ -3524,7 +3540,7 @@ XLogFileOpen(XLogSegNo segno) char path[MAXPGPATH]; int fd; - XLogFilePath(path, ThisTimeLineID, segno); + XLogFilePath(path, ThisTimeLineID, segno, wal_segment_size); fd = BasicOpenFile(path, O_RDWR | PG_BINARY | get_sync_bit(sync_method), S_IRUSR | S_IWUSR); @@ -3551,7 +3567,7 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli, char path[MAXPGPATH]; int fd; - XLogFileName(xlogfname, tli, segno); + XLogFileName(xlogfname, tli, segno, wal_segment_size); switch (source) { @@ -3563,7 +3579,7 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli, restoredFromArchive = RestoreArchivedFile(path, xlogfname, "RECOVERYXLOG", - XLogSegSize, + wal_segment_size, InRedo); if (!restoredFromArchive) return -1; @@ -3571,7 +3587,7 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli, case XLOG_FROM_PG_WAL: case XLOG_FROM_STREAM: - XLogFilePath(path, tli, segno); + XLogFilePath(path, tli, segno, wal_segment_size); restoredFromArchive = false; break; @@ -3690,7 +3706,7 @@ XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source) } /* Couldn't find it. 
For simplicity, complain about front timeline */ - XLogFilePath(path, recoveryTargetTLI, segno); + XLogFilePath(path, recoveryTargetTLI, segno, wal_segment_size); errno = ENOENT; ereport(emode, (errcode_for_file_access(), @@ -3741,9 +3757,11 @@ PreallocXlogFiles(XLogRecPtr endptr) XLogSegNo _logSegNo; int lf; bool use_existent; + uint64 offset; - XLByteToPrevSeg(endptr, _logSegNo); - if ((endptr - 1) % XLogSegSize >= (uint32) (0.75 * XLogSegSize)) + XLByteToPrevSeg(endptr, _logSegNo, wal_segment_size); + offset = XLogSegmentOffset(endptr - 1, wal_segment_size); + if (offset >= (uint32) (0.75 * wal_segment_size)) { _logSegNo++; use_existent = true; @@ -3774,7 +3792,7 @@ CheckXLogRemoved(XLogSegNo segno, TimeLineID tli) { char filename[MAXFNAMELEN]; - XLogFileName(filename, tli, segno); + XLogFileName(filename, tli, segno, wal_segment_size); ereport(ERROR, (errcode_for_file_access(), errmsg("requested WAL segment %s has already been removed", @@ -3811,7 +3829,7 @@ UpdateLastRemovedPtr(char *filename) uint32 tli; XLogSegNo segno; - XLogFromFileName(filename, &tli, &segno); + XLogFromFileName(filename, &tli, &segno, wal_segment_size); SpinLockAcquire(&XLogCtl->info_lck); if (segno > XLogCtl->lastRemovedSegNo) @@ -3845,7 +3863,7 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) * doesn't matter, we ignore that in the comparison. (During recovery, * ThisTimeLineID isn't set, so we can't use that.) 
*/ - XLogFileName(lastoff, 0, segno); + XLogFileName(lastoff, 0, segno, wal_segment_size); elog(DEBUG2, "attempting to remove WAL segments older than log file %s", lastoff); @@ -3906,7 +3924,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) char switchseg[MAXFNAMELEN]; XLogSegNo endLogSegNo; - XLByteToPrevSeg(switchpoint, endLogSegNo); + XLByteToPrevSeg(switchpoint, endLogSegNo, wal_segment_size); xldir = AllocateDir(XLOGDIR); if (xldir == NULL) @@ -3918,7 +3936,7 @@ RemoveNonParentXlogFiles(XLogRecPtr switchpoint, TimeLineID newTLI) /* * Construct a filename of the last segment to be kept. */ - XLogFileName(switchseg, newTLI, endLogSegNo); + XLogFileName(switchseg, newTLI, endLogSegNo, wal_segment_size); elog(DEBUG2, "attempting to remove WAL segments newer than log file %s", switchseg); @@ -3974,7 +3992,7 @@ RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) /* * Initialize info about where to try to recycle to. */ - XLByteToSeg(endptr, endlogSegNo); + XLByteToSeg(endptr, endlogSegNo, wal_segment_size); if (PriorRedoPtr == InvalidXLogRecPtr) recycleSegNo = endlogSegNo + 10; else @@ -4192,9 +4210,11 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode, XLogSegNo segno; int32 offset; - XLByteToSeg(xlogreader->latestPagePtr, segno); - offset = xlogreader->latestPagePtr % XLogSegSize; - XLogFileName(fname, xlogreader->readPageTLI, segno); + XLByteToSeg(xlogreader->latestPagePtr, segno, wal_segment_size); + offset = XLogSegmentOffset(xlogreader->latestPagePtr, + wal_segment_size); + XLogFileName(fname, xlogreader->readPageTLI, segno, + wal_segment_size); ereport(emode_for_corrupt_record(emode, RecPtr ? 
RecPtr : EndRecPtr), (errmsg("unexpected timeline ID %u in log segment %s, offset %u", @@ -4399,7 +4419,7 @@ WriteControlFile(void) ControlFile->blcksz = BLCKSZ; ControlFile->relseg_size = RELSEG_SIZE; ControlFile->xlog_blcksz = XLOG_BLCKSZ; - ControlFile->xlog_seg_size = XLOG_SEG_SIZE; + ControlFile->xlog_seg_size = wal_segment_size; ControlFile->nameDataLen = NAMEDATALEN; ControlFile->indexMaxKeys = INDEX_MAX_KEYS; @@ -4467,6 +4487,7 @@ ReadControlFile(void) { pg_crc32c crc; int fd; + static char wal_segsz_str[20]; /* * Read data... @@ -4569,13 +4590,6 @@ ReadControlFile(void) " but the server was compiled with XLOG_BLCKSZ %d.", ControlFile->xlog_blcksz, XLOG_BLCKSZ), errhint("It looks like you need to recompile or initdb."))); - if (ControlFile->xlog_seg_size != XLOG_SEG_SIZE) - ereport(FATAL, - (errmsg("database files are incompatible with server"), - errdetail("The database cluster was initialized with XLOG_SEG_SIZE %d," - " but the server was compiled with XLOG_SEG_SIZE %d.", - ControlFile->xlog_seg_size, XLOG_SEG_SIZE), - errhint("It looks like you need to recompile or initdb."))); if (ControlFile->nameDataLen != NAMEDATALEN) ereport(FATAL, (errmsg("database files are incompatible with server"), @@ -4637,6 +4651,29 @@ ReadControlFile(void) errhint("It looks like you need to recompile or initdb."))); #endif + wal_segment_size = ControlFile->xlog_seg_size; + + if (!IsValidWalSegSize(wal_segment_size)) + ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("WAL segment size must be a power of two between 1MB and 1GB, but the control file specifies %d bytes", + wal_segment_size))); + + snprintf(wal_segsz_str, sizeof(wal_segsz_str), "%d", wal_segment_size); + SetConfigOption("wal_segment_size", wal_segsz_str, PGC_INTERNAL, + PGC_S_OVERRIDE); + + /* check and update variables dependent on wal_segment_size */ + if (ConvertToXSegs(min_wal_size_mb, wal_segment_size) < 2) + ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + 
errmsg("\"min_wal_size\" must be at least twice \"wal_segment_size\"."))); + + if (ConvertToXSegs(max_wal_size_mb, wal_segment_size) < 2) + ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("\"max_wal_size\" must be at least twice \"wal_segment_size\"."))); + + CalculateUsableBytesInSegment(); + CalculateCheckpointSegments(); + /* Make the initdb settings visible as GUC variables, too */ SetConfigOption("data_checksums", DataChecksumsEnabled() ? "yes" : "no", PGC_INTERNAL, PGC_S_OVERRIDE); @@ -4757,8 +4794,8 @@ XLOGChooseNumBuffers(void) int xbuffers; xbuffers = NBuffers / 32; - if (xbuffers > XLOG_SEG_SIZE / XLOG_BLCKSZ) - xbuffers = XLOG_SEG_SIZE / XLOG_BLCKSZ; + if (xbuffers > (wal_segment_size / XLOG_BLCKSZ)) + xbuffers = (wal_segment_size / XLOG_BLCKSZ); if (xbuffers < 8) xbuffers = 8; return xbuffers; @@ -4817,6 +4854,17 @@ XLOGShmemSize(void) { char buf[32]; + /* + * The calculation of XLOGbuffers requires the run-time parameter + * wal_segment_size which is set from the control file. This value is + * required to create the shared memory segment. Hence, temporarily + * allocate space for reading the control file. + */ + if (!IsBootstrapProcessingMode()) + { + ControlFile = palloc(sizeof(ControlFileData)); + ReadControlFile(); + } snprintf(buf, sizeof(buf), "%d", XLOGChooseNumBuffers()); SetConfigOption("wal_buffers", buf, PGC_POSTMASTER, PGC_S_OVERRIDE); } @@ -4850,6 +4898,7 @@ XLOGShmemInit(void) foundXLog; char *allocptr; int i; + ControlFileData *temp_cfile; #ifdef WAL_DEBUG @@ -4867,6 +4916,16 @@ XLOGShmemInit(void) } #endif + /* + * If we are not in bootstrap mode, ControlFile is already read. Copy it + * to a temporary variable before reading it into shared memory later. 
+ */ + if (!IsBootstrapProcessingMode()) + { + temp_cfile = palloc(sizeof(ControlFileData)); + memcpy(temp_cfile, ControlFile, sizeof(ControlFileData)); + } + ControlFile = (ControlFileData *) ShmemInitStruct("Control File", sizeof(ControlFileData), &foundCFile); XLogCtl = (XLogCtlData *) @@ -4935,12 +4994,11 @@ XLOGShmemInit(void) InitSharedLatch(&XLogCtl->recoveryWakeupLatch); /* - * If we are not in bootstrap mode, pg_control should already exist. Read - * and validate it immediately (see comments in ReadControlFile() for the - * reasons why). + * If we are not in bootstrap mode, copy the control file data from + * temporary variable into the shared memory. */ if (!IsBootstrapProcessingMode()) - ReadControlFile(); + memcpy(ControlFile, temp_cfile, sizeof(ControlFileData)); } /* @@ -5005,7 +5063,7 @@ BootStrapXLOG(void) * segment with logid=0 logseg=1. The very first WAL segment, 0/0, is not * used, so that we can use 0/0 to mean "before any valid WAL segment". */ - checkPoint.redo = XLogSegSize + SizeOfXLogLongPHD; + checkPoint.redo = wal_segment_size + SizeOfXLogLongPHD; checkPoint.ThisTimeLineID = ThisTimeLineID; checkPoint.PrevTimeLineID = ThisTimeLineID; checkPoint.fullPageWrites = fullPageWrites; @@ -5036,10 +5094,10 @@ BootStrapXLOG(void) page->xlp_magic = XLOG_PAGE_MAGIC; page->xlp_info = XLP_LONG_HEADER; page->xlp_tli = ThisTimeLineID; - page->xlp_pageaddr = XLogSegSize; + page->xlp_pageaddr = wal_segment_size; longpage = (XLogLongPageHeader) page; longpage->xlp_sysid = sysidentifier; - longpage->xlp_seg_size = XLogSegSize; + longpage->xlp_seg_size = wal_segment_size; longpage->xlp_xlog_blcksz = XLOG_BLCKSZ; /* Insert the initial checkpoint record */ @@ -5503,8 +5561,8 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog) * they are the same, but if the switch happens exactly at a segment * boundary, startLogSegNo will be endLogSegNo + 1. 
*/ - XLByteToPrevSeg(endOfLog, endLogSegNo); - XLByteToSeg(endOfLog, startLogSegNo); + XLByteToPrevSeg(endOfLog, endLogSegNo, wal_segment_size); + XLByteToSeg(endOfLog, startLogSegNo, wal_segment_size); /* * Initialize the starting WAL segment for the new timeline. If the switch @@ -5522,7 +5580,7 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog) * avoid emplacing a bogus file. */ XLogFileCopy(endLogSegNo, endTLI, endLogSegNo, - endOfLog % XLOG_SEG_SIZE); + XLogSegmentOffset(endOfLog, wal_segment_size)); } else { @@ -5546,7 +5604,7 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog) * Let's just make real sure there are not .ready or .done flags posted * for the new segment. */ - XLogFileName(xlogfname, ThisTimeLineID, startLogSegNo); + XLogFileName(xlogfname, ThisTimeLineID, startLogSegNo, wal_segment_size); XLogArchiveCleanup(xlogfname); /* @@ -6187,6 +6245,16 @@ CheckRequiredParameterValues(void) } } +/* + * Calculate UsableBytesInSegment based on wal_segment_size + */ +void +CalculateUsableBytesInSegment(void) +{ + UsableBytesInSegment = (wal_segment_size / XLOG_BLCKSZ * UsableBytesInPage) - + (SizeOfXLogLongPHD - SizeOfXLogShortPHD); +} + /* * This must be called ONCE during postmaster or standalone-backend startup */ @@ -6348,7 +6416,7 @@ StartupXLOG(void) /* Set up XLOG reader facility */ MemSet(&private, 0, sizeof(XLogPageReadPrivate)); - xlogreader = XLogReaderAllocate(&XLogPageRead, &private); + xlogreader = XLogReaderAllocate(wal_segment_size, &XLogPageRead, &private); if (!xlogreader) ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), @@ -7481,7 +7549,7 @@ StartupXLOG(void) XLogRecPtr pageBeginPtr; pageBeginPtr = EndOfLog - (EndOfLog % XLOG_BLCKSZ); - Assert(readOff == pageBeginPtr % XLogSegSize); + Assert(readOff == XLogSegmentOffset(pageBeginPtr, wal_segment_size)); firstIdx = XLogRecPtrToBufIdx(EndOfLog); @@ -7630,13 +7698,13 @@ StartupXLOG(void) * restored from the archive to begin with, it's expected to have a * .done file). 
*/ - if (EndOfLog % XLOG_SEG_SIZE != 0 && XLogArchivingActive()) + if (XLogSegmentOffset(EndOfLog, wal_segment_size) != 0 && XLogArchivingActive()) { char origfname[MAXFNAMELEN]; XLogSegNo endLogSegNo; - XLByteToPrevSeg(EndOfLog, endLogSegNo); - XLogFileName(origfname, EndOfLogTLI, endLogSegNo); + XLByteToPrevSeg(EndOfLog, endLogSegNo, wal_segment_size); + XLogFileName(origfname, EndOfLogTLI, endLogSegNo, wal_segment_size); if (!XLogArchiveIsReadyOrDone(origfname)) { @@ -7644,7 +7712,7 @@ StartupXLOG(void) char partialfname[MAXFNAMELEN]; char partialpath[MAXPGPATH]; - XLogFilePath(origpath, EndOfLogTLI, endLogSegNo); + XLogFilePath(origpath, EndOfLogTLI, endLogSegNo, wal_segment_size); snprintf(partialfname, MAXFNAMELEN, "%s.partial", origfname); snprintf(partialpath, MAXPGPATH, "%s.partial", origpath); @@ -8150,6 +8218,9 @@ InitXLOGAccess(void) ThisTimeLineID = XLogCtl->ThisTimeLineID; Assert(ThisTimeLineID != 0 || IsBootstrapProcessingMode()); + /* set wal_segment_size */ + wal_segment_size = ControlFile->xlog_seg_size; + /* Use GetRedoRecPtr to copy the RedoRecPtr safely */ (void) GetRedoRecPtr(); /* Also update our copy of doPageWrites. */ @@ -8480,7 +8551,7 @@ UpdateCheckPointDistanceEstimate(uint64 nbytes) * more. * * When checkpoints are triggered by max_wal_size, this should converge to - * CheckpointSegments * XLOG_SEG_SIZE, + * CheckpointSegments * wal_segment_size, * * Note: This doesn't pay any attention to what caused the checkpoint. * Checkpoints triggered manually with CHECKPOINT command, or by e.g. @@ -8679,7 +8750,7 @@ CreateCheckPoint(int flags) freespace = INSERT_FREESPACE(curInsert); if (freespace == 0) { - if (curInsert % XLogSegSize == 0) + if (XLogSegmentOffset(curInsert, wal_segment_size) == 0) curInsert += SizeOfXLogLongPHD; else curInsert += SizeOfXLogShortPHD; @@ -8913,7 +8984,7 @@ CreateCheckPoint(int flags) /* Update the average distance between checkpoints. 
*/ UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr); - XLByteToSeg(PriorRedoPtr, _logSegNo); + XLByteToSeg(PriorRedoPtr, _logSegNo, wal_segment_size); KeepLogSeg(recptr, &_logSegNo); _logSegNo--; RemoveOldXlogFiles(_logSegNo, PriorRedoPtr, recptr); @@ -9241,7 +9312,7 @@ CreateRestartPoint(int flags) /* Update the average distance between checkpoints/restartpoints. */ UpdateCheckPointDistanceEstimate(RedoRecPtr - PriorRedoPtr); - XLByteToSeg(PriorRedoPtr, _logSegNo); + XLByteToSeg(PriorRedoPtr, _logSegNo, wal_segment_size); /* * Get the current end of xlog replayed or received, whichever is @@ -9336,7 +9407,7 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo) XLogSegNo segno; XLogRecPtr keep; - XLByteToSeg(recptr, segno); + XLByteToSeg(recptr, segno, wal_segment_size); keep = XLogGetReplicationSlotMinimumLSN(); /* compute limit for wal_keep_segments first */ @@ -9354,7 +9425,7 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo) { XLogSegNo slotSegNo; - XLByteToSeg(keep, slotSegNo); + XLByteToSeg(keep, slotSegNo, wal_segment_size); if (slotSegNo <= 0) segno = 1; @@ -10137,7 +10208,7 @@ XLogFileNameP(TimeLineID tli, XLogSegNo segno) { char *result = palloc(MAXFNAMELEN); - XLogFileName(result, tli, segno); + XLogFileName(result, tli, segno, wal_segment_size); return result; } @@ -10391,8 +10462,8 @@ do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p, WALInsertLockRelease(); } while (!gotUniqueStartpoint); - XLByteToSeg(startpoint, _logSegNo); - XLogFileName(xlogfilename, starttli, _logSegNo); + XLByteToSeg(startpoint, _logSegNo, wal_segment_size); + XLogFileName(xlogfilename, starttli, _logSegNo, wal_segment_size); /* * Construct tablespace_map file @@ -10943,8 +11014,8 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p) */ RequestXLogSwitch(false); - XLByteToPrevSeg(stoppoint, _logSegNo); - XLogFileName(stopxlogfilename, stoptli, _logSegNo); + XLByteToPrevSeg(stoppoint, _logSegNo, 
wal_segment_size); + XLogFileName(stopxlogfilename, stoptli, _logSegNo, wal_segment_size); /* Use the log timezone here, not the session timezone */ stamp_time = (pg_time_t) time(NULL); @@ -10955,9 +11026,9 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p) /* * Write the backup history file */ - XLByteToSeg(startpoint, _logSegNo); + XLByteToSeg(startpoint, _logSegNo, wal_segment_size); BackupHistoryFilePath(histfilepath, stoptli, _logSegNo, - (uint32) (startpoint % XLogSegSize)); + startpoint, wal_segment_size); fp = AllocateFile(histfilepath, "w"); if (!fp) ereport(ERROR, @@ -11011,12 +11082,12 @@ do_pg_stop_backup(char *labelfile, bool waitforarchive, TimeLineID *stoptli_p) ((!backup_started_in_recovery && XLogArchivingActive()) || (backup_started_in_recovery && XLogArchivingAlways()))) { - XLByteToPrevSeg(stoppoint, _logSegNo); - XLogFileName(lastxlogfilename, stoptli, _logSegNo); + XLByteToPrevSeg(stoppoint, _logSegNo, wal_segment_size); + XLogFileName(lastxlogfilename, stoptli, _logSegNo, wal_segment_size); - XLByteToSeg(startpoint, _logSegNo); + XLByteToSeg(startpoint, _logSegNo, wal_segment_size); BackupHistoryFileName(histfilename, stoptli, _logSegNo, - (uint32) (startpoint % XLogSegSize)); + startpoint, wal_segment_size); seconds_before_warning = 60; waits = 0; @@ -11459,14 +11530,14 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen, uint32 targetPageOff; XLogSegNo targetSegNo PG_USED_FOR_ASSERTS_ONLY; - XLByteToSeg(targetPagePtr, targetSegNo); - targetPageOff = targetPagePtr % XLogSegSize; + XLByteToSeg(targetPagePtr, targetSegNo, wal_segment_size); + targetPageOff = XLogSegmentOffset(targetPagePtr, wal_segment_size); /* * See if we need to switch to a new segment because the requested record * is not in the currently open one. 
*/ - if (readFile >= 0 && !XLByteInSeg(targetPagePtr, readSegNo)) + if (readFile >= 0 && !XLByteInSeg(targetPagePtr, readSegNo, wal_segment_size)) { /* * Request a restartpoint if we've replayed too much xlog since the @@ -11487,7 +11558,7 @@ XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen, readSource = 0; } - XLByteToSeg(targetPagePtr, readSegNo); + XLByteToSeg(targetPagePtr, readSegNo, wal_segment_size); retry: /* See if we need to retrieve more data */ @@ -11527,7 +11598,8 @@ retry: if (((targetPagePtr) / XLOG_BLCKSZ) != (receivedUpto / XLOG_BLCKSZ)) readLen = XLOG_BLCKSZ; else - readLen = receivedUpto % XLogSegSize - targetPageOff; + readLen = XLogSegmentOffset(receivedUpto, wal_segment_size) + - targetPageOff; } else readLen = XLOG_BLCKSZ; @@ -11538,7 +11610,7 @@ retry: { char fname[MAXFNAMELEN]; - XLogFileName(fname, curFileTLI, readSegNo); + XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size); ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen), (errcode_for_file_access(), errmsg("could not seek in log segment %s to offset %u: %m", @@ -11552,7 +11624,7 @@ retry: char fname[MAXFNAMELEN]; pgstat_report_wait_end(); - XLogFileName(fname, curFileTLI, readSegNo); + XLogFileName(fname, curFileTLI, readSegNo, wal_segment_size); ereport(emode_for_corrupt_record(emode, targetPagePtr + reqLen), (errcode_for_file_access(), errmsg("could not read from log segment %s, offset %u: %m", diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c index 7afb73579b..c723c931d8 100644 --- a/src/backend/access/transam/xlogarchive.c +++ b/src/backend/access/transam/xlogarchive.c @@ -134,13 +134,14 @@ RestoreArchivedFile(char *path, const char *xlogfname, if (cleanupEnabled) { GetOldestRestartPoint(&restartRedoPtr, &restartTli); - XLByteToSeg(restartRedoPtr, restartSegNo); - XLogFileName(lastRestartPointFname, restartTli, restartSegNo); + XLByteToSeg(restartRedoPtr, restartSegNo, 
wal_segment_size); + XLogFileName(lastRestartPointFname, restartTli, restartSegNo, + wal_segment_size); /* we shouldn't need anything earlier than last restart point */ Assert(strcmp(lastRestartPointFname, xlogfname) <= 0); } else - XLogFileName(lastRestartPointFname, 0, 0L); + XLogFileName(lastRestartPointFname, 0, 0L, wal_segment_size); /* * construct the command to be executed @@ -347,8 +348,9 @@ ExecuteRecoveryCommand(char *command, char *commandName, bool failOnSignal) * archive, though there is no requirement to do so. */ GetOldestRestartPoint(&restartRedoPtr, &restartTli); - XLByteToSeg(restartRedoPtr, restartSegNo); - XLogFileName(lastRestartPointFname, restartTli, restartSegNo); + XLByteToSeg(restartRedoPtr, restartSegNo, wal_segment_size); + XLogFileName(lastRestartPointFname, restartTli, restartSegNo, + wal_segment_size); /* * construct the command to be executed @@ -547,7 +549,7 @@ XLogArchiveNotifySeg(XLogSegNo segno) { char xlog[MAXFNAMELEN]; - XLogFileName(xlog, ThisTimeLineID, segno); + XLogFileName(xlog, ThisTimeLineID, segno, wal_segment_size); XLogArchiveNotify(xlog); } diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c index f9b49ba498..443ccd6411 100644 --- a/src/backend/access/transam/xlogfuncs.c +++ b/src/backend/access/transam/xlogfuncs.c @@ -489,8 +489,8 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS) /* * xlogfilename */ - XLByteToPrevSeg(locationpoint, xlogsegno); - XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno); + XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size); + XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno, wal_segment_size); values[0] = CStringGetTextDatum(xlogfilename); isnull[0] = false; @@ -498,7 +498,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS) /* * offset */ - xrecoff = locationpoint % XLogSegSize; + xrecoff = XLogSegmentOffset(locationpoint, wal_segment_size); values[1] = UInt32GetDatum(xrecoff); isnull[1] = false; @@ -530,8 +530,8 @@ 
pg_walfile_name(PG_FUNCTION_ARGS) errmsg("recovery is in progress"), errhint("pg_walfile_name() cannot be executed during recovery."))); - XLByteToPrevSeg(locationpoint, xlogsegno); - XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno); + XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size); + XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno, wal_segment_size); PG_RETURN_TEXT_P(cstring_to_text(xlogfilename)); } diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 0781a7b9de..b1f9b90c50 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -64,7 +64,8 @@ report_invalid_record(XLogReaderState *state, const char *fmt,...) * Returns NULL if the xlogreader couldn't be allocated. */ XLogReaderState * -XLogReaderAllocate(XLogPageReadCB pagereadfunc, void *private_data) +XLogReaderAllocate(int wal_segment_size, XLogPageReadCB pagereadfunc, + void *private_data) { XLogReaderState *state; @@ -91,6 +92,7 @@ XLogReaderAllocate(XLogPageReadCB pagereadfunc, void *private_data) return NULL; } + state->wal_segment_size = wal_segment_size; state->read_page = pagereadfunc; /* system_identifier initialized to zeroes above */ state->private_data = private_data; @@ -466,8 +468,8 @@ XLogReadRecord(XLogReaderState *state, XLogRecPtr RecPtr, char **errormsg) (record->xl_info & ~XLR_INFO_MASK) == XLOG_SWITCH) { /* Pretend it extends to end of segment */ - state->EndRecPtr += XLogSegSize - 1; - state->EndRecPtr -= state->EndRecPtr % XLogSegSize; + state->EndRecPtr += state->wal_segment_size - 1; + state->EndRecPtr -= XLogSegmentOffset(state->EndRecPtr, state->wal_segment_size); } if (DecodeXLogRecord(state, record, errormsg)) @@ -509,8 +511,8 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen) Assert((pageptr % XLOG_BLCKSZ) == 0); - XLByteToSeg(pageptr, targetSegNo); - targetPageOff = (pageptr % XLogSegSize); + XLByteToSeg(pageptr, targetSegNo, 
state->wal_segment_size); + targetPageOff = XLogSegmentOffset(pageptr, state->wal_segment_size); /* check whether we have all the requested data already */ if (targetSegNo == state->readSegNo && targetPageOff == state->readOff && @@ -719,16 +721,16 @@ ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, Assert((recptr % XLOG_BLCKSZ) == 0); - XLByteToSeg(recptr, segno); - offset = recptr % XLogSegSize; + XLByteToSeg(recptr, segno, state->wal_segment_size); + offset = XLogSegmentOffset(recptr, state->wal_segment_size); - XLogSegNoOffsetToRecPtr(segno, offset, recaddr); + XLogSegNoOffsetToRecPtr(segno, offset, recaddr, state->wal_segment_size); if (hdr->xlp_magic != XLOG_PAGE_MAGIC) { char fname[MAXFNAMELEN]; - XLogFileName(fname, state->readPageTLI, segno); + XLogFileName(fname, state->readPageTLI, segno, state->wal_segment_size); report_invalid_record(state, "invalid magic number %04X in log segment %s, offset %u", @@ -742,7 +744,7 @@ ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, { char fname[MAXFNAMELEN]; - XLogFileName(fname, state->readPageTLI, segno); + XLogFileName(fname, state->readPageTLI, segno, state->wal_segment_size); report_invalid_record(state, "invalid info bits %04X in log segment %s, offset %u", @@ -775,10 +777,10 @@ ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, fhdrident_str, sysident_str); return false; } - else if (longhdr->xlp_seg_size != XLogSegSize) + else if (longhdr->xlp_seg_size != state->wal_segment_size) { report_invalid_record(state, - "WAL file is from different database system: incorrect XLOG_SEG_SIZE in page header"); + "WAL file is from different database system: incorrect segment size in page header"); return false; } else if (longhdr->xlp_xlog_blcksz != XLOG_BLCKSZ) @@ -792,7 +794,7 @@ ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, { char fname[MAXFNAMELEN]; - XLogFileName(fname, state->readPageTLI, segno); + XLogFileName(fname, state->readPageTLI, segno, 
state->wal_segment_size); /* hmm, first page of file doesn't have a long header? */ report_invalid_record(state, @@ -807,7 +809,7 @@ ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, { char fname[MAXFNAMELEN]; - XLogFileName(fname, state->readPageTLI, segno); + XLogFileName(fname, state->readPageTLI, segno, state->wal_segment_size); report_invalid_record(state, "unexpected pageaddr %X/%X in log segment %s, offset %u", @@ -832,7 +834,7 @@ ValidXLogPageHeader(XLogReaderState *state, XLogRecPtr recptr, { char fname[MAXFNAMELEN]; - XLogFileName(fname, state->readPageTLI, segno); + XLogFileName(fname, state->readPageTLI, segno, state->wal_segment_size); report_invalid_record(state, "out-of-sequence timeline ID %u (after %u) in log segment %s, offset %u", diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index bbae733d65..68b0aad44c 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -654,7 +654,7 @@ XLogTruncateRelation(RelFileNode rnode, ForkNumber forkNum, * frontend). Probably these should be merged at some point. */ static void -XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) +XLogRead(char *buf, int segsize, TimeLineID tli, XLogRecPtr startptr, Size count) { char *p; XLogRecPtr recptr; @@ -666,6 +666,8 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) static TimeLineID sendTLI = 0; static uint32 sendOff = 0; + Assert(segsize == wal_segment_size); + p = buf; recptr = startptr; nbytes = count; @@ -676,10 +678,10 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) int segbytes; int readbytes; - startoff = recptr % XLogSegSize; + startoff = XLogSegmentOffset(recptr, segsize); /* Do we need to switch to a different xlog segment? 
*/ - if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo) || + if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo, segsize) || sendTLI != tli) { char path[MAXPGPATH]; @@ -687,9 +689,9 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) if (sendFile >= 0) close(sendFile); - XLByteToSeg(recptr, sendSegNo); + XLByteToSeg(recptr, sendSegNo, segsize); - XLogFilePath(path, tli, sendSegNo); + XLogFilePath(path, tli, sendSegNo, segsize); sendFile = BasicOpenFile(path, O_RDONLY | PG_BINARY, 0); @@ -717,7 +719,7 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) { char path[MAXPGPATH]; - XLogFilePath(path, tli, sendSegNo); + XLogFilePath(path, tli, sendSegNo, segsize); ereport(ERROR, (errcode_for_file_access(), @@ -728,8 +730,8 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) } /* How many bytes are within this segment? */ - if (nbytes > (XLogSegSize - startoff)) - segbytes = XLogSegSize - startoff; + if (nbytes > (segsize - startoff)) + segbytes = segsize - startoff; else segbytes = nbytes; @@ -740,7 +742,7 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) { char path[MAXPGPATH]; - XLogFilePath(path, tli, sendSegNo); + XLogFilePath(path, tli, sendSegNo, segsize); ereport(ERROR, (errcode_for_file_access(), @@ -798,7 +800,8 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) void XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wantLength) { - const XLogRecPtr lastReadPage = state->readSegNo * XLogSegSize + state->readOff; + const XLogRecPtr lastReadPage = state->readSegNo * state->wal_segment_size + + state->readOff; Assert(wantPage != InvalidXLogRecPtr && wantPage % XLOG_BLCKSZ == 0); Assert(wantLength <= XLOG_BLCKSZ); @@ -842,7 +845,8 @@ XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wa if (state->currTLIValidUntil != InvalidXLogRecPtr && state->currTLI != ThisTimeLineID && state->currTLI != 0 && - (wantPage + 
wantLength) / XLogSegSize < state->currTLIValidUntil / XLogSegSize) + ((wantPage + wantLength) / state->wal_segment_size) < + (state->currTLIValidUntil / state->wal_segment_size)) return; /* @@ -864,9 +868,11 @@ XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wa */ List *timelineHistory = readTimeLineHistory(ThisTimeLineID); - XLogRecPtr endOfSegment = (((wantPage / XLogSegSize) + 1) * XLogSegSize) - 1; + XLogRecPtr endOfSegment = (((wantPage / state->wal_segment_size) + 1) + * state->wal_segment_size) - 1; - Assert(wantPage / XLogSegSize == endOfSegment / XLogSegSize); + Assert(wantPage / state->wal_segment_size == + endOfSegment / state->wal_segment_size); /* * Find the timeline of the last LSN on the segment containing @@ -1014,7 +1020,8 @@ read_local_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr, * as 'count', read the whole page anyway. It's guaranteed to be * zero-padded up to the page boundary if it's incomplete. */ - XLogRead(cur_page, *pageTLI, targetPagePtr, XLOG_BLCKSZ); + XLogRead(cur_page, state->wal_segment_size, *pageTLI, targetPagePtr, + XLOG_BLCKSZ); /* number of valid bytes in the buffer */ return count; diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c index 0453fd4ac1..5686ab5529 100644 --- a/src/backend/bootstrap/bootstrap.c +++ b/src/backend/bootstrap/bootstrap.c @@ -19,6 +19,7 @@ #include "access/htup_details.h" #include "access/xact.h" +#include "access/xlog_internal.h" #include "bootstrap/bootstrap.h" #include "catalog/index.h" #include "catalog/pg_collation.h" @@ -222,7 +223,7 @@ AuxiliaryProcessMain(int argc, char *argv[]) /* If no -x argument, we are a CheckerProcess */ MyAuxProcType = CheckerProcess; - while ((flag = getopt(argc, argv, "B:c:d:D:Fkr:x:-:")) != -1) + while ((flag = getopt(argc, argv, "B:c:d:D:Fkr:x:X:-:")) != -1) { switch (flag) { @@ -257,6 +258,13 @@ AuxiliaryProcessMain(int argc, char *argv[]) case 'x': MyAuxProcType = atoi(optarg); break; + 
case 'X': + wal_segment_size = strtoul(optarg, NULL, 0); + if (!IsValidWalSegSize(wal_segment_size)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("-X requires a power of 2 value between 1MB and 1GB"))); + break; case 'c': case '-': { diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c index e48ebd557f..7e0af10c4d 100644 --- a/src/backend/postmaster/checkpointer.c +++ b/src/backend/postmaster/checkpointer.c @@ -624,7 +624,7 @@ CheckArchiveTimeout(void) * If the returned pointer points exactly to a segment boundary, * assume nothing happened. */ - if ((switchpoint % XLogSegSize) != 0) + if (XLogSegmentOffset(switchpoint, wal_segment_size) != 0) elog(DEBUG1, "write-ahead log switch forced (archive_timeout=%d)", XLogArchiveTimeout); } @@ -782,7 +782,8 @@ IsCheckpointOnSchedule(double progress) recptr = GetXLogReplayRecPtr(NULL); else recptr = GetInsertRecPtr(); - elapsed_xlogs = (((double) (recptr - ckpt_start_recptr)) / XLogSegSize) / CheckPointSegments; + elapsed_xlogs = (((double) (recptr - ckpt_start_recptr)) / + wal_segment_size) / CheckPointSegments; if (progress < elapsed_xlogs) { diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 95180b2ef5..4c61d71f8e 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -521,6 +521,7 @@ typedef struct HANDLE PostmasterHandle; HANDLE initial_signal_pipe; HANDLE syslogPipe[2]; + int wal_segment_size; #else int postmaster_alive_fds[2]; int syslogPipe[2]; @@ -6008,6 +6009,7 @@ save_backend_variables(BackendParameters *param, Port *port, pgwin32_create_signal_listener(childPid), childProcess)) return false; + param->wal_segment_size = wal_segment_size; #else memcpy(&param->postmaster_alive_fds, &postmaster_alive_fds, sizeof(postmaster_alive_fds)); @@ -6237,6 +6239,8 @@ restore_backend_variables(BackendParameters *param, Port *port) #ifdef WIN32 PostmasterHandle = param->PostmasterHandle; 
pgwin32_initial_signal_pipe = param->initial_signal_pipe; + wal_segment_size = param->wal_segment_size; + CalculateUsableBytesInSegment(); #else memcpy(&postmaster_alive_fds, &param->postmaster_alive_fds, sizeof(postmaster_alive_fds)); diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index 12a16bd773..c2c2bc7ebf 100644 --- a/src/backend/replication/basebackup.c +++ b/src/backend/replication/basebackup.c @@ -357,10 +357,10 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) * shouldn't be such files, but if there are, there's little harm in * including them. */ - XLByteToSeg(startptr, startsegno); - XLogFileName(firstoff, ThisTimeLineID, startsegno); - XLByteToPrevSeg(endptr, endsegno); - XLogFileName(lastoff, ThisTimeLineID, endsegno); + XLByteToSeg(startptr, startsegno, wal_segment_size); + XLogFileName(firstoff, ThisTimeLineID, startsegno, wal_segment_size); + XLByteToPrevSeg(endptr, endsegno, wal_segment_size); + XLogFileName(lastoff, ThisTimeLineID, endsegno, wal_segment_size); dir = AllocateDir("pg_wal"); if (!dir) @@ -415,12 +415,13 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) * Sanity check: the first and last segment should cover startptr and * endptr, with no gaps in between.
*/ - XLogFromFileName(walFiles[0], &tli, &segno); + XLogFromFileName(walFiles[0], &tli, &segno, wal_segment_size); if (segno != startsegno) { char startfname[MAXFNAMELEN]; - XLogFileName(startfname, ThisTimeLineID, startsegno); + XLogFileName(startfname, ThisTimeLineID, startsegno, + wal_segment_size); ereport(ERROR, (errmsg("could not find WAL file \"%s\"", startfname))); } @@ -429,12 +430,13 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) XLogSegNo currsegno = segno; XLogSegNo nextsegno = segno + 1; - XLogFromFileName(walFiles[i], &tli, &segno); + XLogFromFileName(walFiles[i], &tli, &segno, wal_segment_size); if (!(nextsegno == segno || currsegno == segno)) { char nextfname[MAXFNAMELEN]; - XLogFileName(nextfname, ThisTimeLineID, nextsegno); + XLogFileName(nextfname, ThisTimeLineID, nextsegno, + wal_segment_size); ereport(ERROR, (errmsg("could not find WAL file \"%s\"", nextfname))); } @@ -443,7 +445,7 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) { char endfname[MAXFNAMELEN]; - XLogFileName(endfname, ThisTimeLineID, endsegno); + XLogFileName(endfname, ThisTimeLineID, endsegno, wal_segment_size); ereport(ERROR, (errmsg("could not find WAL file \"%s\"", endfname))); } @@ -457,7 +459,7 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) pgoff_t len = 0; snprintf(pathbuf, MAXPGPATH, XLOGDIR "/%s", walFiles[i]); - XLogFromFileName(walFiles[i], &tli, &segno); + XLogFromFileName(walFiles[i], &tli, &segno, wal_segment_size); fp = AllocateFile(pathbuf, "rb"); if (fp == NULL) @@ -479,7 +481,7 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) (errcode_for_file_access(), errmsg("could not stat file \"%s\": %m", pathbuf))); - if (statbuf.st_size != XLogSegSize) + if (statbuf.st_size != wal_segment_size) { CheckXLogRemoved(segno, tli); ereport(ERROR, @@ -490,7 +492,7 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) /* send the WAL file itself */ _tarWriteHeader(pathbuf, NULL, &statbuf, false); - while 
((cnt = fread(buf, 1, Min(sizeof(buf), XLogSegSize - len), fp)) > 0) + while ((cnt = fread(buf, 1, Min(sizeof(buf), wal_segment_size - len), fp)) > 0) { CheckXLogRemoved(segno, tli); /* Send the chunk as a CopyData message */ @@ -501,11 +503,11 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) len += cnt; throttle(cnt); - if (len == XLogSegSize) + if (len == wal_segment_size) break; } - if (len != XLogSegSize) + if (len != wal_segment_size) { CheckXLogRemoved(segno, tli); ereport(ERROR, @@ -513,7 +515,7 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir) errmsg("unexpected WAL file size \"%s\"", walFiles[i]))); } - /* XLogSegSize is a multiple of 512, so no need for padding */ + /* wal_segment_size is a multiple of 512, so no need for padding */ FreeFile(fp); diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c index efb9785f25..bca585fc27 100644 --- a/src/backend/replication/logical/logical.c +++ b/src/backend/replication/logical/logical.c @@ -163,7 +163,7 @@ StartupDecodingContext(List *output_plugin_options, ctx->slot = slot; - ctx->reader = XLogReaderAllocate(read_page, ctx); + ctx->reader = XLogReaderAllocate(wal_segment_size, read_page, ctx); if (!ctx->reader) ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 657bafae57..68766d522d 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -2083,15 +2083,16 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn) * store in segment in which it belongs by start lsn, don't split over * multiple segments tho */ - if (fd == -1 || !XLByteInSeg(change->lsn, curOpenSegNo)) + if (fd == -1 || + !XLByteInSeg(change->lsn, curOpenSegNo, wal_segment_size)) { XLogRecPtr recptr; if (fd != -1) CloseTransientFile(fd); - XLByteToSeg(change->lsn, curOpenSegNo); - 
XLogSegNoOffsetToRecPtr(curOpenSegNo, 0, recptr); + XLByteToSeg(change->lsn, curOpenSegNo, wal_segment_size); + XLogSegNoOffsetToRecPtr(curOpenSegNo, 0, recptr, wal_segment_size); /* * No need to care about TLIs here, only used during a single run, @@ -2319,7 +2320,7 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn, txn->nentries_mem = 0; Assert(dlist_is_empty(&txn->changes)); - XLByteToSeg(txn->final_lsn, last_segno); + XLByteToSeg(txn->final_lsn, last_segno, wal_segment_size); while (restored < max_changes_in_memory && *segno <= last_segno) { @@ -2334,11 +2335,11 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn, /* first time in */ if (*segno == 0) { - XLByteToSeg(txn->first_lsn, *segno); + XLByteToSeg(txn->first_lsn, *segno, wal_segment_size); } Assert(*segno != 0 || dlist_is_empty(&txn->changes)); - XLogSegNoOffsetToRecPtr(*segno, 0, recptr); + XLogSegNoOffsetToRecPtr(*segno, 0, recptr, wal_segment_size); /* * No need to care about TLIs here, only used during a single run, @@ -2575,8 +2576,8 @@ ReorderBufferRestoreCleanup(ReorderBuffer *rb, ReorderBufferTXN *txn) Assert(txn->first_lsn != InvalidXLogRecPtr); Assert(txn->final_lsn != InvalidXLogRecPtr); - XLByteToSeg(txn->first_lsn, first); - XLByteToSeg(txn->final_lsn, last); + XLByteToSeg(txn->first_lsn, first, wal_segment_size); + XLByteToSeg(txn->final_lsn, last, wal_segment_size); /* iterate over all possible filenames, and delete them */ for (cur = first; cur <= last; cur++) @@ -2584,7 +2585,7 @@ ReorderBufferRestoreCleanup(ReorderBuffer *rb, ReorderBufferTXN *txn) char path[MAXPGPATH]; XLogRecPtr recptr; - XLogSegNoOffsetToRecPtr(cur, 0, recptr); + XLogSegNoOffsetToRecPtr(cur, 0, recptr, wal_segment_size); sprintf(path, "pg_replslot/%s/xid-%u-lsn-%X-%X.snap", NameStr(MyReplicationSlot->data.name), txn->xid, diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index a8a16f55e9..23de2577ef 100644 --- a/src/backend/replication/slot.c +++ 
b/src/backend/replication/slot.c @@ -1039,7 +1039,7 @@ ReplicationSlotReserveWal(void) * the new restart_lsn above, so normally we should never need to loop * more than twice. */ - XLByteToSeg(slot->data.restart_lsn, segno); + XLByteToSeg(slot->data.restart_lsn, segno, wal_segment_size); if (XLogGetLastRemovedSegno() < segno) break; } diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index ea9d21a46b..3474514adc 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -613,7 +613,7 @@ WalReceiverMain(void) * Create .done file forcibly to prevent the streamed segment from * being archived later. */ - XLogFileName(xlogfname, recvFileTLI, recvSegNo); + XLogFileName(xlogfname, recvFileTLI, recvSegNo, wal_segment_size); if (XLogArchiveMode != ARCHIVE_MODE_ALWAYS) XLogArchiveForceDone(xlogfname); else @@ -943,7 +943,7 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr) { int segbytes; - if (recvFile < 0 || !XLByteInSeg(recptr, recvSegNo)) + if (recvFile < 0 || !XLByteInSeg(recptr, recvSegNo, wal_segment_size)) { bool use_existent; @@ -972,7 +972,7 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr) * Create .done file forcibly to prevent the streamed segment * from being archived later. 
*/ - XLogFileName(xlogfname, recvFileTLI, recvSegNo); + XLogFileName(xlogfname, recvFileTLI, recvSegNo, wal_segment_size); if (XLogArchiveMode != ARCHIVE_MODE_ALWAYS) XLogArchiveForceDone(xlogfname); else @@ -981,7 +981,7 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr) recvFile = -1; /* Create/use new log file */ - XLByteToSeg(recptr, recvSegNo); + XLByteToSeg(recptr, recvSegNo, wal_segment_size); use_existent = true; recvFile = XLogFileInit(recvSegNo, &use_existent, true); recvFileTLI = ThisTimeLineID; @@ -989,10 +989,10 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr) } /* Calculate the start offset of the received logs */ - startoff = recptr % XLogSegSize; + startoff = XLogSegmentOffset(recptr, wal_segment_size); - if (startoff + nbytes > XLogSegSize) - segbytes = XLogSegSize - startoff; + if (startoff + nbytes > wal_segment_size) + segbytes = wal_segment_size - startoff; else segbytes = nbytes; diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c index 8ed7254b5c..78f8693ece 100644 --- a/src/backend/replication/walreceiverfuncs.c +++ b/src/backend/replication/walreceiverfuncs.c @@ -233,8 +233,8 @@ RequestXLogStreaming(TimeLineID tli, XLogRecPtr recptr, const char *conninfo, * being created by XLOG streaming, which might cause trouble later on if * the segment is e.g archived. 
*/ - if (recptr % XLogSegSize != 0) - recptr -= recptr % XLogSegSize; + if (XLogSegmentOffset(recptr, wal_segment_size) != 0) + recptr -= XLogSegmentOffset(recptr, wal_segment_size); SpinLockAcquire(&walrcv->mutex); diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index db346e6edb..daf9e2bb0a 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -2316,9 +2316,9 @@ retry: int segbytes; int readbytes; - startoff = recptr % XLogSegSize; + startoff = XLogSegmentOffset(recptr, wal_segment_size); - if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo)) + if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo, wal_segment_size)) { char path[MAXPGPATH]; @@ -2326,7 +2326,7 @@ retry: if (sendFile >= 0) close(sendFile); - XLByteToSeg(recptr, sendSegNo); + XLByteToSeg(recptr, sendSegNo, wal_segment_size); /*------- * When reading from a historic timeline, and there is a timeline @@ -2359,12 +2359,12 @@ retry: { XLogSegNo endSegNo; - XLByteToSeg(sendTimeLineValidUpto, endSegNo); + XLByteToSeg(sendTimeLineValidUpto, endSegNo, wal_segment_size); if (sendSegNo == endSegNo) curFileTimeLine = sendTimeLineNextTLI; } - XLogFilePath(path, curFileTimeLine, sendSegNo); + XLogFilePath(path, curFileTimeLine, sendSegNo, wal_segment_size); sendFile = BasicOpenFile(path, O_RDONLY | PG_BINARY, 0); if (sendFile < 0) @@ -2401,8 +2401,8 @@ retry: } /* How many bytes are within this segment? */ - if (nbytes > (XLogSegSize - startoff)) - segbytes = XLogSegSize - startoff; + if (nbytes > (wal_segment_size - startoff)) + segbytes = wal_segment_size - startoff; else segbytes = nbytes; @@ -2433,7 +2433,7 @@ retry: * read() succeeds in that case, but the data we tried to read might * already have been overwritten with new WAL records. 
*/ - XLByteToSeg(startptr, segno); + XLByteToSeg(startptr, segno, wal_segment_size); CheckXLogRemoved(segno, ThisTimeLineID); /* diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 25da06fffc..7f52e452f3 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -514,7 +514,6 @@ static int block_size; static int segment_size; static int wal_block_size; static bool data_checksums; -static int wal_segment_size; static bool integer_datetimes; static bool assert_enabled; @@ -714,9 +713,6 @@ typedef struct #if XLOG_BLCKSZ < 1024 || XLOG_BLCKSZ > (1024*1024) #error XLOG_BLCKSZ must be between 1KB and 1MB #endif -#if XLOG_SEG_SIZE < (1024*1024) || XLOG_SEG_SIZE > (1024*1024*1024) -#error XLOG_SEG_SIZE must be between 1MB and 1GB -#endif static const char *memory_units_hint = gettext_noop("Valid units for this parameter are \"kB\", \"MB\", \"GB\", and \"TB\"."); @@ -2264,7 +2260,8 @@ static struct config_int ConfigureNamesInt[] = GUC_UNIT_MB }, &min_wal_size_mb, - 5 * (XLOG_SEG_SIZE / (1024 * 1024)), 2, MAX_KILOBYTES, + DEFAULT_MIN_WAL_SEGS * (DEFAULT_XLOG_SEG_SIZE / (1024 * 1024)), + 2, MAX_KILOBYTES, NULL, NULL, NULL }, @@ -2275,7 +2272,8 @@ static struct config_int ConfigureNamesInt[] = GUC_UNIT_MB }, &max_wal_size_mb, - 64 * (XLOG_SEG_SIZE / (1024 * 1024)), 2, MAX_KILOBYTES, + DEFAULT_MAX_WAL_SEGS * (DEFAULT_XLOG_SEG_SIZE / (1024 * 1024)), + 2, MAX_KILOBYTES, NULL, assign_max_wal_size, NULL }, @@ -2637,14 +2635,14 @@ static struct config_int ConfigureNamesInt[] = { {"wal_segment_size", PGC_INTERNAL, PRESET_OPTIONS, - gettext_noop("Shows the number of pages per write ahead log segment."), + gettext_noop("Shows the size of write ahead log segments."), NULL, - GUC_UNIT_XBLOCKS | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE + GUC_UNIT_BYTE | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE }, &wal_segment_size, - (XLOG_SEG_SIZE / XLOG_BLCKSZ), - (XLOG_SEG_SIZE / XLOG_BLCKSZ), - (XLOG_SEG_SIZE / XLOG_BLCKSZ), + DEFAULT_XLOG_SEG_SIZE, + 
WalSegMinSize, + WalSegMaxSize, NULL, NULL, NULL }, diff --git a/src/backend/utils/misc/pg_controldata.c b/src/backend/utils/misc/pg_controldata.c index 0dbfe7f952..bc2ca8731d 100644 --- a/src/backend/utils/misc/pg_controldata.c +++ b/src/backend/utils/misc/pg_controldata.c @@ -141,8 +141,9 @@ pg_control_checkpoint(PG_FUNCTION_ARGS) * Calculate name of the WAL file containing the latest checkpoint's REDO * start point. */ - XLByteToSeg(ControlFile->checkPointCopy.redo, segno); - XLogFileName(xlogfilename, ControlFile->checkPointCopy.ThisTimeLineID, segno); + XLByteToSeg(ControlFile->checkPointCopy.redo, segno, wal_segment_size); + XLogFileName(xlogfilename, ControlFile->checkPointCopy.ThisTimeLineID, + segno, wal_segment_size); /* Populate the values and null arrays */ values[0] = LSNGetDatum(ControlFile->checkPoint); diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index df5d2f3f22..1a79258fde 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -234,7 +234,7 @@ #max_wal_senders = 10 # max number of walsender processes # (change requires restart) -#wal_keep_segments = 0 # in logfile segments, 16MB each; 0 disables +#wal_keep_segments = 0 # in logfile segments; 0 disables #wal_sender_timeout = 60s # in milliseconds; 0 disables #max_replication_slots = 10 # max number of replication slots diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 7303bbe892..07b27968d3 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -52,6 +52,7 @@ #include <fcntl.h> #include <sys/stat.h> #include <unistd.h> +#include <math.h> #include <signal.h> #include <time.h> @@ -59,6 +60,7 @@ #include "sys/mman.h" #endif +#include "access/xlog_internal.h" #include "catalog/catalog.h" #include "catalog/pg_authid.h" #include "catalog/pg_class.h" @@ -141,6 +143,8 @@ static bool sync_only = false; static bool show_setting = false; static bool 
data_checksums = false; static char *xlog_dir = ""; +static char *str_wal_segment_size_mb = ""; +static int wal_segment_size_mb; /* internal vars */ @@ -999,6 +1003,25 @@ test_config_settings(void) printf("%s\n", dynamic_shared_memory_type); } +/* + * Calculate the default wal_size in proper unit. + */ +static char * +pretty_wal_size(int segment_count) +{ + double val = wal_segment_size_mb * segment_count; + double temp_val; + char *result = malloc(10); + + temp_val = val / 1024; + if (ceilf(temp_val) == temp_val) + snprintf(result, 10, "%dGB", (int) temp_val); + else + snprintf(result, 10, "%dMB", (int) val); + + return result; +} + /* * set up all the config files */ @@ -1043,6 +1066,15 @@ setup_config(void) conflines = replace_token(conflines, "#port = 5432", repltok); #endif + /* set default max_wal_size and min_wal_size */ + snprintf(repltok, sizeof(repltok), "min_wal_size = %s", + pretty_wal_size(DEFAULT_MIN_WAL_SEGS)); + conflines = replace_token(conflines, "#min_wal_size = 80MB", repltok); + + snprintf(repltok, sizeof(repltok), "max_wal_size = %s", + pretty_wal_size(DEFAULT_MAX_WAL_SEGS)); + conflines = replace_token(conflines, "#max_wal_size = 1GB", repltok); + snprintf(repltok, sizeof(repltok), "lc_messages = '%s'", escape_quotes(lc_messages)); conflines = replace_token(conflines, "#lc_messages = 'C'", repltok); @@ -1356,8 +1388,9 @@ bootstrap_template1(void) unsetenv("PGCLIENTENCODING"); snprintf(cmd, sizeof(cmd), - "\"%s\" --boot -x1 %s %s %s", + "\"%s\" --boot -x1 -X %u %s %s %s", backend_exec, + wal_segment_size_mb * (1024 * 1024), data_checksums ? 
"-k" : "", boot_options, talkargs); @@ -2291,6 +2324,7 @@ usage(const char *progname) printf(_(" -U, --username=NAME database superuser name\n")); printf(_(" -W, --pwprompt prompt for a password for the new superuser\n")); printf(_(" -X, --waldir=WALDIR location for the write-ahead log directory\n")); + printf(_(" --wal-segsize=SIZE size of wal segment size\n")); printf(_("\nLess commonly used options:\n")); printf(_(" -d, --debug generate lots of debugging output\n")); printf(_(" -k, --data-checksums use data page checksums\n")); @@ -2985,6 +3019,7 @@ main(int argc, char *argv[]) {"no-sync", no_argument, NULL, 'N'}, {"sync-only", no_argument, NULL, 'S'}, {"waldir", required_argument, NULL, 'X'}, + {"wal-segsize", required_argument, NULL, 12}, {"data-checksums", no_argument, NULL, 'k'}, {NULL, 0, NULL, 0} }; @@ -3118,6 +3153,9 @@ main(int argc, char *argv[]) case 'X': xlog_dir = pg_strdup(optarg); break; + case 12: + str_wal_segment_size_mb = pg_strdup(optarg); + break; default: /* getopt_long already emitted a complaint */ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), @@ -3180,6 +3218,27 @@ main(int argc, char *argv[]) check_need_password(authmethodlocal, authmethodhost); + /* set wal segment size */ + if (str_wal_segment_size_mb == NULL || !strlen(str_wal_segment_size_mb)) + wal_segment_size_mb = (DEFAULT_XLOG_SEG_SIZE) / (1024 * 1024); + else + { + char *endptr; + + /* check that the argument is a number */ + wal_segment_size_mb = strtol(str_wal_segment_size_mb, &endptr, 10); + + /* verify that wal segment size is valid */ + if (*endptr != '\0' || + !IsValidWalSegSize(wal_segment_size_mb * 1024 * 1024)) + { + fprintf(stderr, + _("%s: --wal-segsize must be a power of two between 1 and 1024\n"), + progname); + exit(1); + } + } + get_restricted_token(progname); setup_pgdata(); diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index dfb9b5ddcb..7575acd959 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c 
+++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -26,6 +26,7 @@ #include <zlib.h> #endif +#include "access/xlog_internal.h" #include "common/file_utils.h" #include "common/string.h" #include "fe_utils/string_utils.h" @@ -555,7 +556,7 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier) } param->startptr = ((uint64) hi) << 32 | lo; /* Round off to even segment position */ - param->startptr -= param->startptr % XLOG_SEG_SIZE; + param->startptr -= XLogSegmentOffset(param->startptr, WalSegsz); #ifndef WIN32 /* Create our background pipe */ @@ -2397,6 +2398,10 @@ main(int argc, char **argv) exit(1); } + /* determine remote server's xlog segment size */ + if (!RetrieveWalSegSize(conn)) + disconnect_and_exit(1); + /* Create pg_wal symlink, if required */ if (strcmp(xlog_dir, "") != 0) { diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c index 4a1a5658fb..002f4ec87c 100644 --- a/src/bin/pg_basebackup/pg_receivewal.c +++ b/src/bin/pg_basebackup/pg_receivewal.c @@ -179,7 +179,7 @@ close_destination_dir(DIR *dest_dir, char *dest_folder) /* * Determine starting location for streaming, based on any existing xlog * segments in the directory. We start at the end of the last one that is - * complete (size matches XLogSegSize), on the timeline with highest ID. + * complete (size matches wal segment size), on the timeline with highest ID. * * If there are no WAL files in the directory, returns InvalidXLogRecPtr. */ @@ -230,7 +230,7 @@ FindStreamingStart(uint32 *tli) /* * Looks like an xlog file. Parse its position. 
*/ - XLogFromFileName(dirent->d_name, &tli, &segno); + XLogFromFileName(dirent->d_name, &tli, &segno, WalSegsz); /* * Check that the segment has the right size, if it's supposed to be @@ -255,7 +255,7 @@ FindStreamingStart(uint32 *tli) disconnect_and_exit(1); } - if (statbuf.st_size != XLOG_SEG_SIZE) + if (statbuf.st_size != WalSegsz) { fprintf(stderr, _("%s: segment file \"%s\" has incorrect size %d, skipping\n"), @@ -296,7 +296,7 @@ FindStreamingStart(uint32 *tli) bytes_out = (buf[3] << 24) | (buf[2] << 16) | (buf[1] << 8) | buf[0]; - if (bytes_out != XLOG_SEG_SIZE) + if (bytes_out != WalSegsz) { fprintf(stderr, _("%s: compressed segment file \"%s\" has incorrect uncompressed size %d, skipping\n"), @@ -337,7 +337,7 @@ FindStreamingStart(uint32 *tli) if (!high_ispartial) high_segno++; - XLogSegNoOffsetToRecPtr(high_segno, 0, high_ptr); + XLogSegNoOffsetToRecPtr(high_segno, 0, high_ptr, WalSegsz); *tli = high_tli; return high_ptr; @@ -398,7 +398,7 @@ StreamLog(void) /* * Always start streaming at the beginning of a segment */ - stream.startpos -= stream.startpos % XLOG_SEG_SIZE; + stream.startpos -= XLogSegmentOffset(stream.startpos, WalSegsz); /* * Start the replication @@ -665,6 +665,10 @@ main(int argc, char **argv) if (!RunIdentifySystem(conn, NULL, NULL, NULL, &db_name)) disconnect_and_exit(1); + /* determine remote server's xlog segment size */ + if (!RetrieveWalSegSize(conn)) + disconnect_and_exit(1); + /* * Check that there is a database associated with connection, none should * be defined in this context. 
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c index 888458f4a9..7d76b0d4d4 100644 --- a/src/bin/pg_basebackup/receivelog.c +++ b/src/bin/pg_basebackup/receivelog.c @@ -95,17 +95,17 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint) ssize_t size; XLogSegNo segno; - XLByteToSeg(startpoint, segno); - XLogFileName(current_walfile_name, stream->timeline, segno); + XLByteToSeg(startpoint, segno, WalSegsz); + XLogFileName(current_walfile_name, stream->timeline, segno, WalSegsz); snprintf(fn, sizeof(fn), "%s%s", current_walfile_name, stream->partial_suffix ? stream->partial_suffix : ""); /* * When streaming to files, if an existing file exists we verify that it's - * either empty (just created), or a complete XLogSegSize segment (in - * which case it has been created and padded). Anything else indicates a - * corrupt file. + * either empty (just created), or a complete WalSegsz segment (in which + * case it has been created and padded). Anything else indicates a corrupt + * file. * * When streaming to tar, no file with this name will exist before, so we * never have to verify a size. @@ -120,7 +120,7 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint) progname, fn, stream->walmethod->getlasterror()); return false; } - if (size == XLogSegSize) + if (size == WalSegsz) { /* Already padded file. 
Open it for use */ f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, 0); @@ -154,7 +154,7 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint) ngettext("%s: write-ahead log file \"%s\" has %d byte, should be 0 or %d\n", "%s: write-ahead log file \"%s\" has %d bytes, should be 0 or %d\n", size), - progname, fn, (int) size, XLogSegSize); + progname, fn, (int) size, WalSegsz); return false; } /* File existed and was empty, so fall through and open */ @@ -162,7 +162,7 @@ open_walfile(StreamCtl *stream, XLogRecPtr startpoint) /* No file existed, so create one */ - f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, XLogSegSize); + f = stream->walmethod->open_for_write(current_walfile_name, stream->partial_suffix, WalSegsz); if (f == NULL) { fprintf(stderr, @@ -203,7 +203,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos) if (stream->partial_suffix) { - if (currpos == XLOG_SEG_SIZE) + if (currpos == WalSegsz) r = stream->walmethod->close(walfile, CLOSE_NORMAL); else { @@ -231,7 +231,7 @@ close_walfile(StreamCtl *stream, XLogRecPtr pos) * new node. This is in line with walreceiver.c always doing a * XLogArchiveForceDone() after a complete segment. */ - if (currpos == XLOG_SEG_SIZE && stream->mark_done) + if (currpos == WalSegsz && stream->mark_done) { /* writes error message if failed */ if (!mark_file_as_archived(stream, current_walfile_name)) @@ -676,7 +676,8 @@ ReceiveXlogStream(PGconn *conn, StreamCtl *stream) * start streaming at the beginning of a segment. 
*/ stream->timeline = newtimeline; - stream->startpos = stream->startpos - (stream->startpos % XLOG_SEG_SIZE); + stream->startpos = stream->startpos - + XLogSegmentOffset(stream->startpos, WalSegsz); continue; } else if (PQresultStatus(res) == PGRES_COMMAND_OK) @@ -1111,7 +1112,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len, *blockpos = fe_recvint64(&copybuf[1]); /* Extract WAL location for this block */ - xlogoff = *blockpos % XLOG_SEG_SIZE; + xlogoff = XLogSegmentOffset(*blockpos, WalSegsz); /* * Verify that the initial location in the stream matches where we think @@ -1148,11 +1149,11 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len, int bytes_to_write; /* - * If crossing a WAL boundary, only write up until we reach - * XLOG_SEG_SIZE. + * If crossing a WAL boundary, only write up until we reach wal + * segment size. */ - if (xlogoff + bytes_left > XLOG_SEG_SIZE) - bytes_to_write = XLOG_SEG_SIZE - xlogoff; + if (xlogoff + bytes_left > WalSegsz) + bytes_to_write = WalSegsz - xlogoff; else bytes_to_write = bytes_left; @@ -1182,7 +1183,7 @@ ProcessXLogDataMsg(PGconn *conn, StreamCtl *stream, char *copybuf, int len, xlogoff += bytes_to_write; /* Did we reach the end of a WAL segment?
*/ - if (*blockpos % XLOG_SEG_SIZE == 0) + if (XLogSegmentOffset(*blockpos, WalSegsz) == 0) { if (!close_walfile(stream, *blockpos)) /* Error message written in close_walfile() */ diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c index 9d40744a34..935ed27803 100644 --- a/src/bin/pg_basebackup/streamutil.c +++ b/src/bin/pg_basebackup/streamutil.c @@ -25,12 +25,18 @@ #include "receivelog.h" #include "streamutil.h" +#include "access/xlog_internal.h" #include "pqexpbuffer.h" #include "common/fe_memutils.h" #include "datatype/timestamp.h" #define ERRCODE_DUPLICATE_OBJECT "42710" +uint32 WalSegsz; + +/* SHOW command for replication connection was introduced in version 10 */ +#define MINIMUM_VERSION_FOR_SHOW_CMD 100000 + const char *progname; char *connection_string = NULL; char *dbhost = NULL; @@ -231,6 +237,76 @@ GetConnection(void) return tmpconn; } +/* + * From version 10, explicitly set wal segment size using SHOW wal_segment_size + * since ControlFile is not accessible here. 
+ */ +bool +RetrieveWalSegSize(PGconn *conn) +{ + PGresult *res; + char xlog_unit[3]; + int xlog_val, + multiplier = 1; + + /* check connection existence */ + Assert(conn != NULL); + + /* for previous versions set the default xlog seg size */ + if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_SHOW_CMD) + { + WalSegsz = DEFAULT_XLOG_SEG_SIZE; + return true; + } + + res = PQexec(conn, "SHOW wal_segment_size"); + if (PQresultStatus(res) != PGRES_TUPLES_OK) + { + fprintf(stderr, _("%s: could not send replication command \"%s\": %s\n"), + progname, "SHOW wal_segment_size", PQerrorMessage(conn)); + + PQclear(res); + return false; + } + if (PQntuples(res) != 1 || PQnfields(res) < 1) + { + fprintf(stderr, + _("%s: could not fetch WAL segment size: got %d rows and %d fields, expected %d rows and %d or more fields\n"), + progname, PQntuples(res), PQnfields(res), 1, 1); + + PQclear(res); + return false; + } + + /* fetch xlog value and unit from the result */ + if (sscanf(PQgetvalue(res, 0, 0), "%d%s", &xlog_val, xlog_unit) != 2) + { + fprintf(stderr, _("%s: WAL segment size could not be parsed\n"), + progname); + return false; + } + + /* set the multiplier based on unit to convert xlog_val to bytes */ + if (strcmp(xlog_unit, "MB") == 0) + multiplier = 1024 * 1024; + else if (strcmp(xlog_unit, "GB") == 0) + multiplier = 1024 * 1024 * 1024; + + /* convert and set WalSegsz */ + WalSegsz = xlog_val * multiplier; + + if (!IsValidWalSegSize(WalSegsz)) + { + fprintf(stderr, + _("%s: WAL segment size must be a power of two between 1MB and 1GB, but the remote server reported a value of %d bytes\n"), + progname, WalSegsz); + return false; + } + + PQclear(res); + return true; +} + /* * Run IDENTIFY_SYSTEM through a given connection and give back to caller * some result information if requested: diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h index 6f6878679f..e6f2690558 100644 --- a/src/bin/pg_basebackup/streamutil.h +++ 
b/src/bin/pg_basebackup/streamutil.h @@ -24,6 +24,7 @@ extern char *dbuser; extern char *dbport; extern char *dbname; extern int dbgetpassword; +extern uint32 WalSegsz; /* Connection kept global so we can disconnect easily */ extern PGconn *conn; @@ -39,6 +40,7 @@ extern bool RunIdentifySystem(PGconn *conn, char **sysid, TimeLineID *starttli, XLogRecPtr *startpos, char **db_name); +extern bool RetrieveWalSegSize(PGconn *conn); extern TimestampTz feGetCurrentTimestamp(void); extern void feTimestampDifference(TimestampTz start_time, TimestampTz stop_time, long *secs, int *microsecs); diff --git a/src/bin/pg_controldata/pg_controldata.c b/src/bin/pg_controldata/pg_controldata.c index 2ea893179a..75f3e521c4 100644 --- a/src/bin/pg_controldata/pg_controldata.c +++ b/src/bin/pg_controldata/pg_controldata.c @@ -99,6 +99,7 @@ main(int argc, char *argv[]) char xlogfilename[MAXFNAMELEN]; int c; int i; + int WalSegsz; set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_controldata")); @@ -164,6 +165,15 @@ main(int argc, char *argv[]) "Either the file is corrupt, or it has a different layout than this program\n" "is expecting. The results below are untrustworthy.\n\n")); + /* set wal segment size */ + WalSegsz = ControlFile->xlog_seg_size; + + if (!IsValidWalSegSize(WalSegsz)) + fprintf(stderr, + _("WARNING: WAL segment size specified, %d bytes, is not a power of two between 1MB and 1GB.\n" + "The file is corrupt and the results below are untrustworthy.\n"), + WalSegsz); + /* * This slightly-chintzy coding will work as long as the control file * timestamps are within the range of time_t; that should be the case in @@ -184,8 +194,9 @@ main(int argc, char *argv[]) * Calculate name of the WAL file containing the latest checkpoint's REDO * start point. 
*/ - XLByteToSeg(ControlFile->checkPointCopy.redo, segno); - XLogFileName(xlogfilename, ControlFile->checkPointCopy.ThisTimeLineID, segno); + XLByteToSeg(ControlFile->checkPointCopy.redo, segno, WalSegsz); + XLogFileName(xlogfilename, ControlFile->checkPointCopy.ThisTimeLineID, + segno, WalSegsz); /* * Format system_identifier and mock_authentication_nonce separately to diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c index ac67831779..0608ea37f5 100644 --- a/src/bin/pg_resetwal/pg_resetwal.c +++ b/src/bin/pg_resetwal/pg_resetwal.c @@ -70,6 +70,7 @@ static MultiXactId set_mxid = 0; static MultiXactOffset set_mxoff = (MultiXactOffset) -1; static uint32 minXlogTli = 0; static XLogSegNo minXlogSegNo = 0; +static int WalSegsz; static void CheckDataVersion(void); static bool ReadControlFile(void); @@ -94,6 +95,7 @@ main(int argc, char *argv[]) char *endptr; char *endptr2; char *DataDir = NULL; + char *log_fname = NULL; int fd; set_pglocale_pgservice(argv[0], PG_TEXTDOMAIN("pg_resetwal")); @@ -265,7 +267,12 @@ main(int argc, char *argv[]) fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); exit(1); } - XLogFromFileName(optarg, &minXlogTli, &minXlogSegNo); + + /* + * XLogFromFileName requires wal segment size which is not yet + * set. Hence wal details are set later on. + */ + log_fname = pg_strdup(optarg); break; default: @@ -350,6 +357,9 @@ main(int argc, char *argv[]) if (!ReadControlFile()) GuessControlValues(); + if (log_fname != NULL) + XLogFromFileName(log_fname, &minXlogTli, &minXlogSegNo, WalSegsz); + /* * Also look at existing segment files to set up newXlogSegNo */ @@ -573,18 +583,26 @@ ReadControlFile(void) offsetof(ControlFileData, crc)); FIN_CRC32C(crc); - if (EQ_CRC32C(crc, ((ControlFileData *) buffer)->crc)) + if (!EQ_CRC32C(crc, ((ControlFileData *) buffer)->crc)) { - /* Valid data... 
*/ - memcpy(&ControlFile, buffer, sizeof(ControlFile)); - return true; + /* We will use the data but treat it as guessed. */ + fprintf(stderr, _("%s: pg_control exists but has invalid CRC; proceed with caution\n"), + progname); + guessed = true; } - fprintf(stderr, _("%s: pg_control exists but has invalid CRC; proceed with caution\n"), - progname); - /* We will use the data anyway, but treat it as guessed. */ memcpy(&ControlFile, buffer, sizeof(ControlFile)); - guessed = true; + WalSegsz = ControlFile.xlog_seg_size; + + /* return false if WalSegsz is not valid */ + if (!IsValidWalSegSize(WalSegsz)) + { + fprintf(stderr, + _("%s: WAL segment size must be a power of two between 1MB and 1GB; ignoring control file specifying %d bytes\n"), + progname, WalSegsz); + return false; + } + return true; } @@ -660,7 +678,7 @@ GuessControlValues(void) ControlFile.blcksz = BLCKSZ; ControlFile.relseg_size = RELSEG_SIZE; ControlFile.xlog_blcksz = XLOG_BLCKSZ; - ControlFile.xlog_seg_size = XLOG_SEG_SIZE; + ControlFile.xlog_seg_size = DEFAULT_XLOG_SEG_SIZE; ControlFile.nameDataLen = NAMEDATALEN; ControlFile.indexMaxKeys = INDEX_MAX_KEYS; ControlFile.toast_max_chunk_size = TOAST_MAX_CHUNK_SIZE; @@ -773,7 +791,8 @@ PrintNewControlValues(void) /* This will be always printed in order to keep format same. */ printf(_("\n\nValues to be changed:\n\n")); - XLogFileName(fname, ControlFile.checkPointCopy.ThisTimeLineID, newXlogSegNo); + XLogFileName(fname, ControlFile.checkPointCopy.ThisTimeLineID, + newXlogSegNo, WalSegsz); printf(_("First log segment after reset: %s\n"), fname); if (set_mxid != 0) @@ -850,7 +869,7 @@ RewriteControlFile(void) * newXlogSegNo. 
*/ XLogSegNoOffsetToRecPtr(newXlogSegNo, SizeOfXLogLongPHD, - ControlFile.checkPointCopy.redo); + ControlFile.checkPointCopy.redo, WalSegsz); ControlFile.checkPointCopy.time = (pg_time_t) time(NULL); ControlFile.state = DB_SHUTDOWNED; @@ -877,7 +896,7 @@ RewriteControlFile(void) ControlFile.max_locks_per_xact = 64; /* Now we can force the recorded xlog seg size to the right thing. */ - ControlFile.xlog_seg_size = XLogSegSize; + ControlFile.xlog_seg_size = WalSegsz; /* Contents are protected with a CRC */ INIT_CRC32C(ControlFile.crc); @@ -1014,7 +1033,7 @@ FindEndOfXLOG(void) * are in virgin territory. */ xlogbytepos = newXlogSegNo * ControlFile.xlog_seg_size; - newXlogSegNo = (xlogbytepos + XLogSegSize - 1) / XLogSegSize; + newXlogSegNo = (xlogbytepos + WalSegsz - 1) / WalSegsz; newXlogSegNo++; } @@ -1151,7 +1170,7 @@ WriteEmptyXLOG(void) page->xlp_pageaddr = ControlFile.checkPointCopy.redo - SizeOfXLogLongPHD; longpage = (XLogLongPageHeader) page; longpage->xlp_sysid = ControlFile.system_identifier; - longpage->xlp_seg_size = XLogSegSize; + longpage->xlp_seg_size = WalSegsz; longpage->xlp_xlog_blcksz = XLOG_BLCKSZ; /* Insert the initial checkpoint record */ @@ -1176,7 +1195,8 @@ WriteEmptyXLOG(void) record->xl_crc = crc; /* Write the first page */ - XLogFilePath(path, ControlFile.checkPointCopy.ThisTimeLineID, newXlogSegNo); + XLogFilePath(path, ControlFile.checkPointCopy.ThisTimeLineID, + newXlogSegNo, WalSegsz); unlink(path); @@ -1202,7 +1222,7 @@ WriteEmptyXLOG(void) /* Fill the rest of the file with zeroes */ memset(buffer, 0, XLOG_BLCKSZ); - for (nbytes = XLOG_BLCKSZ; nbytes < XLogSegSize; nbytes += XLOG_BLCKSZ) + for (nbytes = XLOG_BLCKSZ; nbytes < WalSegsz; nbytes += XLOG_BLCKSZ) { errno = 0; if (write(fd, buffer, XLOG_BLCKSZ) != XLOG_BLCKSZ) diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c index 1befdbdeea..af5b891f01 100644 --- a/src/bin/pg_rewind/parsexlog.c +++ b/src/bin/pg_rewind/parsexlog.c @@ -69,7 +69,7 @@ 
extractPageMap(const char *datadir, XLogRecPtr startpoint, int tliIndex, private.datadir = datadir; private.tliIndex = tliIndex; - xlogreader = XLogReaderAllocate(&SimpleXLogPageRead, &private); + xlogreader = XLogReaderAllocate(rwnd_segsize, &SimpleXLogPageRead, &private); if (xlogreader == NULL) pg_fatal("out of memory\n"); @@ -122,7 +122,8 @@ readOneRecord(const char *datadir, XLogRecPtr ptr, int tliIndex) private.datadir = datadir; private.tliIndex = tliIndex; - xlogreader = XLogReaderAllocate(&SimpleXLogPageRead, &private); + xlogreader = XLogReaderAllocate(rwnd_segsize, &SimpleXLogPageRead, + &private); if (xlogreader == NULL) pg_fatal("out of memory\n"); @@ -170,11 +171,13 @@ findLastCheckpoint(const char *datadir, XLogRecPtr forkptr, int tliIndex, * header in that case to find the next record. */ if (forkptr % XLOG_BLCKSZ == 0) - forkptr += (forkptr % XLogSegSize == 0) ? SizeOfXLogLongPHD : SizeOfXLogShortPHD; + forkptr += (XLogSegmentOffset(forkptr, rwnd_segsize) == 0) ? + SizeOfXLogLongPHD : SizeOfXLogShortPHD; private.datadir = datadir; private.tliIndex = tliIndex; - xlogreader = XLogReaderAllocate(&SimpleXLogPageRead, &private); + xlogreader = XLogReaderAllocate(rwnd_segsize, &SimpleXLogPageRead, + &private); if (xlogreader == NULL) pg_fatal("out of memory\n"); @@ -239,21 +242,21 @@ SimpleXLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, XLogRecPtr targetSegEnd; XLogSegNo targetSegNo; - XLByteToSeg(targetPagePtr, targetSegNo); - XLogSegNoOffsetToRecPtr(targetSegNo + 1, 0, targetSegEnd); - targetPageOff = targetPagePtr % XLogSegSize; + XLByteToSeg(targetPagePtr, targetSegNo, rwnd_segsize); + XLogSegNoOffsetToRecPtr(targetSegNo + 1, 0, targetSegEnd, rwnd_segsize); + targetPageOff = XLogSegmentOffset(targetPagePtr, rwnd_segsize); /* * See if we need to switch to a new segment because the requested record * is not in the currently open one. 
*/ - if (xlogreadfd >= 0 && !XLByteInSeg(targetPagePtr, xlogreadsegno)) + if (xlogreadfd >= 0 && !XLByteInSeg(targetPagePtr, xlogreadsegno, rwnd_segsize)) { close(xlogreadfd); xlogreadfd = -1; } - XLByteToSeg(targetPagePtr, xlogreadsegno); + XLByteToSeg(targetPagePtr, xlogreadsegno, rwnd_segsize); if (xlogreadfd < 0) { @@ -272,7 +275,8 @@ SimpleXLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, targetHistory[private->tliIndex].begin >= targetSegEnd) private->tliIndex--; - XLogFileName(xlogfname, targetHistory[private->tliIndex].tli, xlogreadsegno); + XLogFileName(xlogfname, targetHistory[private->tliIndex].tli, + xlogreadsegno, rwnd_segsize); snprintf(xlogfpath, MAXPGPATH, "%s/" XLOGDIR "/%s", private->datadir, xlogfname); diff --git a/src/bin/pg_rewind/pg_rewind.c b/src/bin/pg_rewind/pg_rewind.c index 4bd1a75973..757e41129f 100644 --- a/src/bin/pg_rewind/pg_rewind.c +++ b/src/bin/pg_rewind/pg_rewind.c @@ -44,6 +44,7 @@ static ControlFileData ControlFile_target; static ControlFileData ControlFile_source; const char *progname; +int rwnd_segsize; /* Configuration options */ char *datadir_target = NULL; @@ -572,8 +573,8 @@ createBackupLabel(XLogRecPtr startpoint, TimeLineID starttli, XLogRecPtr checkpo char buf[1000]; int len; - XLByteToSeg(startpoint, startsegno); - XLogFileName(xlogfilename, starttli, startsegno); + XLByteToSeg(startpoint, startsegno, rwnd_segsize); + XLogFileName(xlogfilename, starttli, startsegno, rwnd_segsize); /* * Construct backup label file @@ -631,6 +632,13 @@ digestControlFile(ControlFileData *ControlFile, char *src, size_t size) memcpy(ControlFile, src, sizeof(ControlFileData)); + /* set and validate rwnd_segsize */ + rwnd_segsize = ControlFile->xlog_seg_size; + + if (!IsValidWalSegSize(rwnd_segsize)) + pg_fatal("WAL segment size must be a power of two between 1MB and 1GB, but the control file specifies %d bytes\n", + rwnd_segsize); + /* Additional checks on control file */ checkControlFile(ControlFile); } diff --git 
a/src/bin/pg_rewind/pg_rewind.h b/src/bin/pg_rewind/pg_rewind.h index 31353dd354..77749fb37e 100644 --- a/src/bin/pg_rewind/pg_rewind.h +++ b/src/bin/pg_rewind/pg_rewind.h @@ -24,6 +24,7 @@ extern char *connstr_source; extern bool debug; extern bool showprogress; extern bool dry_run; +extern int rwnd_segsize; /* Target history */ extern TimeLineHistoryEntry *targetHistory; diff --git a/src/bin/pg_test_fsync/pg_test_fsync.c b/src/bin/pg_test_fsync/pg_test_fsync.c index c607b5371c..08548eaf1c 100644 --- a/src/bin/pg_test_fsync/pg_test_fsync.c +++ b/src/bin/pg_test_fsync/pg_test_fsync.c @@ -64,7 +64,7 @@ static const char *progname; static int secs_per_test = 5; static int needs_unlink = 0; -static char full_buf[XLOG_SEG_SIZE], +static char full_buf[DEFAULT_XLOG_SEG_SIZE], *buf, *filename = FSYNC_FILENAME; static struct timeval start_t, @@ -209,7 +209,7 @@ prepare_buf(void) int ops; /* write random data into buffer */ - for (ops = 0; ops < XLOG_SEG_SIZE; ops++) + for (ops = 0; ops < DEFAULT_XLOG_SEG_SIZE; ops++) full_buf[ops] = random(); buf = (char *) TYPEALIGN(XLOG_BLCKSZ, full_buf); @@ -226,7 +226,7 @@ test_open(void) if ((tmpfile = open(filename, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) == -1) die("could not open output file"); needs_unlink = 1; - if (write(tmpfile, full_buf, XLOG_SEG_SIZE) != XLOG_SEG_SIZE) + if (write(tmpfile, full_buf, DEFAULT_XLOG_SEG_SIZE) != DEFAULT_XLOG_SEG_SIZE) die("write failed"); /* fsync now so that dirty buffers don't skew later tests */ diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index 5aa3233bd3..63b0743046 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -13,6 +13,7 @@ #include "postgres.h" #include <dirent.h> +#include <sys/stat.h> #include <unistd.h> #include "access/xlogreader.h" @@ -26,6 +27,8 @@ static const char *progname; +static int WalSegsz; + typedef struct XLogDumpPrivate { TimeLineID timeline; @@ -144,77 +147,168 @@ split_path(const char *path, char 
**dir, char **fname) } /* - * Try to find the file in several places: - * if directory == NULL: - * fname - * XLOGDIR / fname - * $PGDATA / XLOGDIR / fname - * else - * directory / fname - * directory / XLOGDIR / fname + * Open the file in the valid target directory. * * return a read only fd */ static int -fuzzy_open_file(const char *directory, const char *fname) +open_file_in_directory(const char *directory, const char *fname) { int fd = -1; char fpath[MAXPGPATH]; - if (directory == NULL) + Assert(directory != NULL); + + snprintf(fpath, MAXPGPATH, "%s/%s", directory, fname); + fd = open(fpath, O_RDONLY | PG_BINARY, 0); + + if (fd < 0 && errno != ENOENT) + fatal_error("could not open file \"%s\": %s", + fname, strerror(errno)); + return fd; +} + +/* + * Try to find fname in the given directory. Returns true if it is found, + * false otherwise. If fname is NULL, search the complete directory for any + * file with a valid WAL file name. If file is successfully opened, set the + * wal segment size. + */ +static bool +search_directory(char *directory, char *fname) +{ + int fd = -1; + DIR *xldir; + + /* open file if valid filename is provided */ + if (fname != NULL) + fd = open_file_in_directory(directory, fname); + + /* + * A valid file name is not passed so search the complete directory. If + * we find any file whose name is like a valid WAL file name then try to + * open it. If we can not open it then bail out. 
+ */ + else if ((xldir = opendir(directory)) != NULL) { - const char *datadir; + struct dirent *xlde; - /* fname */ - fd = open(fname, O_RDONLY | PG_BINARY, 0); - if (fd < 0 && errno != ENOENT) - return -1; - else if (fd >= 0) - return fd; - - /* XLOGDIR / fname */ - snprintf(fpath, MAXPGPATH, "%s/%s", - XLOGDIR, fname); - fd = open(fpath, O_RDONLY | PG_BINARY, 0); - if (fd < 0 && errno != ENOENT) - return -1; - else if (fd >= 0) - return fd; - - datadir = getenv("PGDATA"); - /* $PGDATA / XLOGDIR / fname */ - if (datadir != NULL) + while ((xlde = readdir(xldir)) != NULL) { - snprintf(fpath, MAXPGPATH, "%s/%s/%s", - datadir, XLOGDIR, fname); - fd = open(fpath, O_RDONLY | PG_BINARY, 0); - if (fd < 0 && errno != ENOENT) - return -1; - else if (fd >= 0) - return fd; + if (IsXLogFileName(xlde->d_name)) + { + fd = open_file_in_directory(directory, xlde->d_name); + fname = xlde->d_name; + break; + } + } + + closedir(xldir); + } + + /* set WalSegsz if file is successfully opened */ + if (fd >= 0) + { + char *buf = (char *) malloc(XLOG_BLCKSZ); + + if (read(fd, buf, XLOG_BLCKSZ) == XLOG_BLCKSZ) + { + XLogPageHeader hdr = (XLogPageHeader) buf; + XLogLongPageHeader longhdr = (XLogLongPageHeader) hdr; + + WalSegsz = longhdr->xlp_seg_size; + + if (!IsValidWalSegSize(WalSegsz)) + fatal_error("WAL segment size must be a power of two between 1MB and 1GB, but the WAL file \"%s\" header specifies %d bytes", + fname, WalSegsz); + } + else + { + if (errno != 0) + fatal_error("could not read file \"%s\": %s", + fname, strerror(errno)); + else + fatal_error("not enough data in file \"%s\"", fname); + } + free(buf); + close(fd); + return true; + } + + return false; +} + +/* + * Identify the target directory and set WalSegsz. + * + * Try to find the file in several places: + * if directory != NULL: + * directory / + * directory / XLOGDIR / + * else + * . + * XLOGDIR / + * $PGDATA / XLOGDIR / + * + * Set the valid target directory in private->inpath. 
+ */ +static void +identify_target_directory(XLogDumpPrivate *private, char *directory, + char *fname) +{ + char fpath[MAXPGPATH]; + + if (directory != NULL) + { + if (search_directory(directory, fname)) + { + private->inpath = strdup(directory); + return; + } + + /* directory / XLOGDIR */ + snprintf(fpath, MAXPGPATH, "%s/%s", directory, XLOGDIR); + if (search_directory(fpath, fname)) + { + private->inpath = strdup(fpath); + return; } } else { - /* directory / fname */ - snprintf(fpath, MAXPGPATH, "%s/%s", - directory, fname); - fd = open(fpath, O_RDONLY | PG_BINARY, 0); - if (fd < 0 && errno != ENOENT) - return -1; - else if (fd >= 0) - return fd; + const char *datadir; - /* directory / XLOGDIR / fname */ - snprintf(fpath, MAXPGPATH, "%s/%s/%s", - directory, XLOGDIR, fname); - fd = open(fpath, O_RDONLY | PG_BINARY, 0); - if (fd < 0 && errno != ENOENT) - return -1; - else if (fd >= 0) - return fd; + /* current directory */ + if (search_directory(".", fname)) + { + private->inpath = strdup("."); + return; + } + /* XLOGDIR */ + if (search_directory(XLOGDIR, fname)) + { + private->inpath = strdup(XLOGDIR); + return; + } + + datadir = getenv("PGDATA"); + /* $PGDATA / XLOGDIR */ + if (datadir != NULL) + { + snprintf(fpath, MAXPGPATH, "%s/%s", datadir, XLOGDIR); + if (search_directory(fpath, fname)) + { + private->inpath = strdup(fpath); + return; + } + } } - return -1; + + /* could not locate WAL file */ + if (fname) + fatal_error("could not locate WAL file \"%s\"", fname); + else + fatal_error("could not find any WAL file"); } /* @@ -244,9 +338,9 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, int segbytes; int readbytes; - startoff = recptr % XLogSegSize; + startoff = XLogSegmentOffset(recptr, WalSegsz); - if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo)) + if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo, WalSegsz)) { char fname[MAXFNAMELEN]; int tries; @@ -255,9 +349,9 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, if 
(sendFile >= 0) close(sendFile); - XLByteToSeg(recptr, sendSegNo); + XLByteToSeg(recptr, sendSegNo, WalSegsz); - XLogFileName(fname, timeline_id, sendSegNo); + XLogFileName(fname, timeline_id, sendSegNo, WalSegsz); /* * In follow mode there is a short period of time after the server @@ -267,7 +361,7 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, */ for (tries = 0; tries < 10; tries++) { - sendFile = fuzzy_open_file(directory, fname); + sendFile = open_file_in_directory(directory, fname); if (sendFile >= 0) break; if (errno == ENOENT) @@ -298,7 +392,7 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, int err = errno; char fname[MAXPGPATH]; - XLogFileName(fname, timeline_id, sendSegNo); + XLogFileName(fname, timeline_id, sendSegNo, WalSegsz); fatal_error("could not seek in log file %s to offset %u: %s", fname, startoff, strerror(err)); @@ -307,8 +401,8 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, } /* How many bytes are within this segment? 
*/ - if (nbytes > (XLogSegSize - startoff)) - segbytes = XLogSegSize - startoff; + if (nbytes > (WalSegsz - startoff)) + segbytes = WalSegsz - startoff; else segbytes = nbytes; @@ -318,7 +412,7 @@ XLogDumpXLogRead(const char *directory, TimeLineID timeline_id, int err = errno; char fname[MAXPGPATH]; - XLogFileName(fname, timeline_id, sendSegNo); + XLogFileName(fname, timeline_id, sendSegNo, WalSegsz); fatal_error("could not read from log file %s, offset %u, length %d: %s", fname, sendOff, segbytes, strerror(err)); @@ -935,17 +1029,18 @@ main(int argc, char **argv) private.inpath, strerror(errno)); } - fd = fuzzy_open_file(private.inpath, fname); + identify_target_directory(&private, private.inpath, fname); + fd = open_file_in_directory(private.inpath, fname); if (fd < 0) fatal_error("could not open file \"%s\"", fname); close(fd); /* parse position from file */ - XLogFromFileName(fname, &private.timeline, &segno); + XLogFromFileName(fname, &private.timeline, &segno, WalSegsz); if (XLogRecPtrIsInvalid(private.startptr)) - XLogSegNoOffsetToRecPtr(segno, 0, private.startptr); - else if (!XLByteInSeg(private.startptr, segno)) + XLogSegNoOffsetToRecPtr(segno, 0, private.startptr, WalSegsz); + else if (!XLByteInSeg(private.startptr, segno, WalSegsz)) { fprintf(stderr, _("%s: start WAL location %X/%X is not inside file \"%s\"\n"), @@ -958,7 +1053,7 @@ main(int argc, char **argv) /* no second file specified, set end position */ if (!(optind + 1 < argc) && XLogRecPtrIsInvalid(private.endptr)) - XLogSegNoOffsetToRecPtr(segno + 1, 0, private.endptr); + XLogSegNoOffsetToRecPtr(segno + 1, 0, private.endptr, WalSegsz); /* parse ENDSEG if passed */ if (optind + 1 < argc) @@ -968,28 +1063,29 @@ main(int argc, char **argv) /* ignore directory, already have that */ split_path(argv[optind + 1], &directory, &fname); - fd = fuzzy_open_file(private.inpath, fname); + fd = open_file_in_directory(private.inpath, fname); if (fd < 0) fatal_error("could not open file \"%s\"", fname); 
close(fd); /* parse position from file */ - XLogFromFileName(fname, &private.timeline, &endsegno); + XLogFromFileName(fname, &private.timeline, &endsegno, WalSegsz); if (endsegno < segno) fatal_error("ENDSEG %s is before STARTSEG %s", argv[optind + 1], argv[optind]); if (XLogRecPtrIsInvalid(private.endptr)) - XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.endptr); + XLogSegNoOffsetToRecPtr(endsegno + 1, 0, private.endptr, + WalSegsz); /* set segno to endsegno for check of --end */ segno = endsegno; } - if (!XLByteInSeg(private.endptr, segno) && - private.endptr != (segno + 1) * XLogSegSize) + if (!XLByteInSeg(private.endptr, segno, WalSegsz) && + private.endptr != (segno + 1) * WalSegsz) { fprintf(stderr, _("%s: end WAL location %X/%X is not inside file \"%s\"\n"), @@ -1000,6 +1096,8 @@ main(int argc, char **argv) goto bad_argument; } } + else + identify_target_directory(&private, private.inpath, NULL); /* we don't know what to print */ if (XLogRecPtrIsInvalid(private.startptr)) @@ -1011,7 +1109,8 @@ main(int argc, char **argv) /* done with argument parsing, do the actual work */ /* we have everything we need, start reading */ - xlogreader_state = XLogReaderAllocate(XLogDumpReadPage, &private); + xlogreader_state = XLogReaderAllocate(WalSegsz, XLogDumpReadPage, + &private); if (!xlogreader_state) fatal_error("out of memory"); @@ -1028,7 +1127,8 @@ main(int argc, char **argv) * to the start of a record and also wasn't a pointer to the beginning of * a segment (e.g. we were used in file mode). 
*/ - if (first_record != private.startptr && (private.startptr % XLogSegSize) != 0) + if (first_record != private.startptr && + XLogSegmentOffset(private.startptr, WalSegsz) != 0) printf(ngettext("first record is after %X/%X, at %X/%X, skipping over %u byte\n", "first record is after %X/%X, at %X/%X, skipping over %u bytes\n", (first_record - private.startptr)), diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h index 66bfb77295..bc49918d17 100644 --- a/src/include/access/xlog.h +++ b/src/include/access/xlog.h @@ -94,6 +94,7 @@ extern PGDLLIMPORT XLogRecPtr XactLastCommitEnd; extern bool reachedConsistency; /* these variables are GUC parameters related to XLOG */ +extern int wal_segment_size; extern int min_wal_size_mb; extern int max_wal_size_mb; extern int wal_keep_segments; @@ -161,6 +162,7 @@ extern PGDLLIMPORT int wal_level; /* Do we need to WAL-log information required only for logical replication? */ #define XLogLogicalInfoActive() (wal_level >= WAL_LEVEL_LOGICAL) + #ifdef WAL_DEBUG extern bool XLOG_DEBUG; #endif @@ -262,6 +264,7 @@ extern Size XLOGShmemSize(void); extern void XLOGShmemInit(void); extern void BootStrapXLOG(void); extern void StartupXLOG(void); +extern void CalculateUsableBytesInSegment(void); extern void ShutdownXLOG(int code, Datum arg); extern void InitXLOGAccess(void); extern void CreateCheckPoint(int flags); diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h index 7453dcbd0e..d37ea7eddb 100644 --- a/src/include/access/xlog_internal.h +++ b/src/include/access/xlog_internal.h @@ -85,15 +85,27 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; #define XLogPageHeaderSize(hdr) \ (((hdr)->xlp_info & XLP_LONG_HEADER) ? SizeOfXLogLongPHD : SizeOfXLogShortPHD) -/* - * The XLOG is split into WAL segments (physical files) of the size indicated - * by XLOG_SEG_SIZE. 
- */ -#define XLogSegSize ((uint32) XLOG_SEG_SIZE) -#define XLogSegmentsPerXLogId (UINT64CONST(0x100000000) / XLOG_SEG_SIZE) +/* wal_segment_size can range from 1MB to 1GB */ +#define WalSegMinSize (1024 * 1024) +#define WalSegMaxSize (1024 * 1024 * 1024) +/* default number of min and max wal segments */ +#define DEFAULT_MIN_WAL_SEGS 5 +#define DEFAULT_MAX_WAL_SEGS 64 -#define XLogSegNoOffsetToRecPtr(segno, offset, dest) \ - (dest) = (segno) * XLOG_SEG_SIZE + (offset) +/* check that the given size is a valid wal_segment_size */ +#define IsPowerOf2(x) ((x) > 0 && ((x) & ((x)-1)) == 0) +#define IsValidWalSegSize(size) \ + (IsPowerOf2(size) && \ + ((size) >= WalSegMinSize && (size) <= WalSegMaxSize)) + +#define XLogSegmentsPerXLogId(wal_segsz_bytes) \ + (UINT64CONST(0x100000000) / (wal_segsz_bytes)) + +#define XLogSegNoOffsetToRecPtr(segno, offset, dest, wal_segsz_bytes) \ + (dest) = (segno) * (wal_segsz_bytes) + (offset) + +#define XLogSegmentOffset(xlogptr, wal_segsz_bytes) \ + ((xlogptr) & ((wal_segsz_bytes) - 1)) /* * Compute a segment number from an XLogRecPtr. @@ -103,11 +115,11 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; * for deciding which segment to write given a pointer to a record end, * for example. */ -#define XLByteToSeg(xlrp, logSegNo) \ - logSegNo = (xlrp) / XLogSegSize +#define XLByteToSeg(xlrp, logSegNo, wal_segsz_bytes) \ + logSegNo = (xlrp) / (wal_segsz_bytes) -#define XLByteToPrevSeg(xlrp, logSegNo) \ - logSegNo = ((xlrp) - 1) / XLogSegSize +#define XLByteToPrevSeg(xlrp, logSegNo, wal_segsz_bytes) \ + logSegNo = ((xlrp) - 1) / (wal_segsz_bytes) /* * Is an XLogRecPtr within a particular XLOG segment? @@ -115,11 +127,11 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; * For XLByteInSeg, do the computation at face value. For XLByteInPrevSeg, * a boundary byte is taken to be in the previous segment.
*/ -#define XLByteInSeg(xlrp, logSegNo) \ - (((xlrp) / XLogSegSize) == (logSegNo)) +#define XLByteInSeg(xlrp, logSegNo, wal_segsz_bytes) \ + (((xlrp) / (wal_segsz_bytes)) == (logSegNo)) -#define XLByteInPrevSeg(xlrp, logSegNo) \ - ((((xlrp) - 1) / XLogSegSize) == (logSegNo)) +#define XLByteInPrevSeg(xlrp, logSegNo, wal_segsz_bytes) \ + ((((xlrp) - 1) / (wal_segsz_bytes)) == (logSegNo)) /* Check if an XLogRecPtr value is in a plausible range */ #define XRecOffIsValid(xlrp) \ @@ -140,10 +152,10 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; /* Length of XLog file name */ #define XLOG_FNAME_LEN 24 -#define XLogFileName(fname, tli, logSegNo) \ +#define XLogFileName(fname, tli, logSegNo, wal_segsz_bytes) \ snprintf(fname, MAXFNAMELEN, "%08X%08X%08X", tli, \ - (uint32) ((logSegNo) / XLogSegmentsPerXLogId), \ - (uint32) ((logSegNo) % XLogSegmentsPerXLogId)) + (uint32) ((logSegNo) / XLogSegmentsPerXLogId(wal_segsz_bytes)), \ + (uint32) ((logSegNo) % XLogSegmentsPerXLogId(wal_segsz_bytes))) #define XLogFileNameById(fname, tli, log, seg) \ snprintf(fname, MAXFNAMELEN, "%08X%08X%08X", tli, log, seg) @@ -162,18 +174,18 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; strspn(fname, "0123456789ABCDEF") == XLOG_FNAME_LEN && \ strcmp((fname) + XLOG_FNAME_LEN, ".partial") == 0) -#define XLogFromFileName(fname, tli, logSegNo) \ +#define XLogFromFileName(fname, tli, logSegNo, wal_segsz_bytes) \ do { \ uint32 log; \ uint32 seg; \ sscanf(fname, "%08X%08X%08X", tli, &log, &seg); \ - *logSegNo = (uint64) log * XLogSegmentsPerXLogId + seg; \ + *logSegNo = (uint64) log * XLogSegmentsPerXLogId(wal_segsz_bytes) + seg; \ } while (0) -#define XLogFilePath(path, tli, logSegNo) \ - snprintf(path, MAXPGPATH, XLOGDIR "/%08X%08X%08X", tli, \ - (uint32) ((logSegNo) / XLogSegmentsPerXLogId), \ - (uint32) ((logSegNo) % XLogSegmentsPerXLogId)) +#define XLogFilePath(path, tli, logSegNo, wal_segsz_bytes) \ + snprintf(path, MAXPGPATH, XLOGDIR "/%08X%08X%08X", tli, \ + (uint32) ((logSegNo) / 
XLogSegmentsPerXLogId(wal_segsz_bytes)), \ + (uint32) ((logSegNo) % XLogSegmentsPerXLogId(wal_segsz_bytes))) #define TLHistoryFileName(fname, tli) \ snprintf(fname, MAXFNAMELEN, "%08X.history", tli) @@ -189,20 +201,22 @@ typedef XLogLongPageHeaderData *XLogLongPageHeader; #define StatusFilePath(path, xlog, suffix) \ snprintf(path, MAXPGPATH, XLOGDIR "/archive_status/%s%s", xlog, suffix) -#define BackupHistoryFileName(fname, tli, logSegNo, offset) \ +#define BackupHistoryFileName(fname, tli, logSegNo, startpoint, wal_segsz_bytes) \ snprintf(fname, MAXFNAMELEN, "%08X%08X%08X.%08X.backup", tli, \ - (uint32) ((logSegNo) / XLogSegmentsPerXLogId), \ - (uint32) ((logSegNo) % XLogSegmentsPerXLogId), offset) + (uint32) ((logSegNo) / XLogSegmentsPerXLogId(wal_segsz_bytes)), \ + (uint32) ((logSegNo) % XLogSegmentsPerXLogId(wal_segsz_bytes)), \ + (uint32) (XLogSegmentOffset(startpoint, wal_segsz_bytes))) #define IsBackupHistoryFileName(fname) \ (strlen(fname) > XLOG_FNAME_LEN && \ strspn(fname, "0123456789ABCDEF") == XLOG_FNAME_LEN && \ strcmp((fname) + strlen(fname) - strlen(".backup"), ".backup") == 0) -#define BackupHistoryFilePath(path, tli, logSegNo, offset) \ +#define BackupHistoryFilePath(path, tli, logSegNo, startpoint, wal_segsz_bytes) \ snprintf(path, MAXPGPATH, XLOGDIR "/%08X%08X%08X.%08X.backup", tli, \ - (uint32) ((logSegNo) / XLogSegmentsPerXLogId), \ - (uint32) ((logSegNo) % XLogSegmentsPerXLogId), offset) + (uint32) ((logSegNo) / XLogSegmentsPerXLogId(wal_segsz_bytes)), \ + (uint32) ((logSegNo) % XLogSegmentsPerXLogId(wal_segsz_bytes)), \ + (uint32) (XLogSegmentOffset((startpoint), wal_segsz_bytes))) /* * Information logged when we detect a change in one of the parameters diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h index 7671598334..c8decc2a39 100644 --- a/src/include/access/xlogreader.h +++ b/src/include/access/xlogreader.h @@ -73,6 +73,11 @@ struct XLogReaderState * ---------------------------------------- */ + /* + * 
Segment size of the to-be-parsed data (mandatory). + */ + int wal_segment_size; + /* * Data input callback (mandatory). * @@ -189,7 +194,8 @@ struct XLogReaderState }; /* Get a new XLogReader */ -extern XLogReaderState *XLogReaderAllocate(XLogPageReadCB pagereadfunc, +extern XLogReaderState *XLogReaderAllocate(int wal_segment_size, + XLogPageReadCB pagereadfunc, void *private_data); /* Free an XLogReader */ diff --git a/src/include/catalog/pg_control.h b/src/include/catalog/pg_control.h index 1ec03caf5f..3fed3b6431 100644 --- a/src/include/catalog/pg_control.h +++ b/src/include/catalog/pg_control.h @@ -21,7 +21,7 @@ /* Version identifier for this pg_control format */ -#define PG_CONTROL_VERSION 1002 +#define PG_CONTROL_VERSION 1003 /* Nonce key length, see below */ #define MOCK_AUTH_NONCE_LEN 32 diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index 579d195663..85deb29d83 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -895,11 +895,6 @@ */ #undef XLOG_BLCKSZ -/* XLOG_SEG_SIZE is the size of a single WAL file. This must be a power of 2 - and larger than XLOG_BLCKSZ (preferably, a great deal larger than - XLOG_BLCKSZ). Changing XLOG_SEG_SIZE requires an initdb. */ -#undef XLOG_SEG_SIZE - /* Number of bits in a file offset, on hosts where this is settable. */ diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h index f3b35297d1..9615a389af 100644 --- a/src/include/pg_config_manual.h +++ b/src/include/pg_config_manual.h @@ -13,6 +13,12 @@ *------------------------------------------------------------------------ */ +/* + * This is the default value for wal_segment_size to be used at initdb when run + * without the --walsegsize option. Must be a valid segment size. + */ +#define DEFAULT_XLOG_SEG_SIZE (16*1024*1024) + /* * Maximum length for identifiers (e.g. table names, column names, * function names).
Names actually are limited to one less byte than this, diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm index 19a95ddc0e..5d5f716b6f 100644 --- a/src/tools/msvc/Solution.pm +++ b/src/tools/msvc/Solution.pm @@ -179,8 +179,6 @@ s{PG_VERSION_STR "[^"]+"}{PG_VERSION_STR "PostgreSQL $self->{strver}$extraver, c 1024, "\n"; print $o "#define XLOG_BLCKSZ ", 1024 * $self->{options}->{wal_blocksize}, "\n"; - print $o "#define XLOG_SEG_SIZE (", $self->{options}->{wal_segsize}, - " * 1024 * 1024)\n"; if ($self->{options}->{float4byval}) { -- 2.14.1.2.g4274c698f4.dirty
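For anyone reviewing the xlog_internal.h part of the patch, here is a small standalone sketch of the arithmetic that the parametrized macros perform: the validity check (power of two between 1MB and 1GB, with 0 rejected), the mask-based segment offset, and the segment file name. This is not code from the patch. The function names (is_valid_wal_seg_size, wal_segment_offset, wal_byte_to_seg, wal_file_name) are made up for illustration and use plain functions instead of macros, so treat it as an approximation of the intended behaviour rather than the implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Limits taken from the patch: segment size ranges from 1MB to 1GB. */
#define WAL_SEG_MIN_SIZE (1024 * 1024)
#define WAL_SEG_MAX_SIZE (1024 * 1024 * 1024)

/* Hypothetical helper: true only for positive powers of two (0 is rejected). */
static bool
is_power_of_2(int64_t x)
{
	return x > 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper mirroring the intent of IsValidWalSegSize(). */
static bool
is_valid_wal_seg_size(int64_t size)
{
	return is_power_of_2(size) &&
		size >= WAL_SEG_MIN_SIZE && size <= WAL_SEG_MAX_SIZE;
}

/*
 * Offset of a WAL position within its segment.  The mask is only
 * equivalent to "ptr % segsz" because segsz is a power of two.
 */
static uint64_t
wal_segment_offset(uint64_t ptr, uint32_t segsz)
{
	assert(is_valid_wal_seg_size(segsz));
	return ptr & ((uint64_t) segsz - 1);
}

/* Segment number containing a WAL position (cf. XLByteToSeg). */
static uint64_t
wal_byte_to_seg(uint64_t ptr, uint32_t segsz)
{
	return ptr / segsz;
}

/* Number of segments per 4GB "xlog id" (cf. XLogSegmentsPerXLogId). */
static uint64_t
wal_segments_per_xlog_id(uint32_t segsz)
{
	return UINT64_C(0x100000000) / segsz;
}

/* Build the 24-character segment file name (cf. XLogFileName). */
static void
wal_file_name(char *buf, size_t len, uint32_t tli, uint64_t segno,
			  uint32_t segsz)
{
	uint64_t	per_id = wal_segments_per_xlog_id(segsz);

	snprintf(buf, len, "%08X%08X%08X",
			 (unsigned int) tli,
			 (unsigned int) (segno / per_id),
			 (unsigned int) (segno % per_id));
}
```

The same WAL position maps to different segment numbers, and therefore different file names, depending on the segment size. That is why the frontend tools in the patch have to learn the size (from pg_control or from a long page header) before they can call XLogFromFileName() or XLogFileName().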