On 20/06/2025 18:45, Maxim Orlov wrote:
Hi!

I've noticed two bugs reported [0] and [1] which are both related to the wraparound of mxid and mxoff. Problems for mxid and mxoff are minor, as they require hitting the exact overflow limit to
occur. But it's better to correct it.

I included a test to reproduce the problem (see 0001). It is not intended to be committed, I
guess. I then added a commit with a fix.

There two fixes.
1) pg_upgrade does not consider the mxid to be in a wraparound state. In this case, I adjust     the mxid value to the FirstMultiXactId. Alternatively, it might be conceivable to include the     ability to set InvalidMultiXactId in pg_resetwal, but this seems odd to me.

I think we should allow InvalidMultiXactId in pg_resetwal. It can appear in the control file of existing clusters, so it seems weird if you cannot use pg_resetwal to reset to that value.

I also find it weird that we ever store InvalidMultiXactId in the control file, and have code in all the places that read it to deal with the possibility. It would seem much more straightforward to skip over InvalidMultiXactId when we set 'nextMXact'. I think we should change that in v19, but that's a separate patch.

2) pg_resetwall forbids to set mxoff to UINT_MAX32. I'm not sure if this was done on
     purpose or not, but perhaps this check can be removed.

With your patch, pg_resetwal will *always* reset nextMultiOffset, even if you didn't specify --multixact-offset.

I propose the attached fix. It adds separate booleans to pg_resetwal to track whether -m or -O option was given, and allows 0 for the next multixid and UINT32_MAX for next multixact offset.

Can oldestMultiXactId ever be 0? I think not, but then again ReadMultiXactIdRange() seems to be prepared to deal with that.

- Heikki
From 924c63daf1ceb9d796e18567afa6904cbe672a00 Mon Sep 17 00:00:00 2001
From: Maxim Orlov <[email protected]>
Date: Fri, 20 Jun 2025 17:21:08 +0300
Subject: [PATCH v2 1/1] Fix the mxid and mxoff wraparound issues in
 pg_upgrade.

Per bags:
- BUG #18863: Multixact wraparound and pg_resetwal error "multitransaction ID (-m) must not be 0"
- BUG #18865: pg_resetwal error: multitransaction offset (-O) must not be -1
---
 src/bin/pg_resetwal/pg_resetwal.c | 18 +++++++++---------
 src/bin/pg_upgrade/pg_upgrade.c   | 11 ++++++++++-
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index a89d72fc5cf..3d61f025c5e 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -70,8 +70,10 @@ static TransactionId set_xid = 0;
 static TransactionId set_oldest_commit_ts_xid = 0;
 static TransactionId set_newest_commit_ts_xid = 0;
 static Oid	set_oid = 0;
+static bool mxid_given = false;
 static MultiXactId set_mxid = 0;
-static MultiXactOffset set_mxoff = (MultiXactOffset) -1;
+static bool mxoff_given = false;
+static MultiXactOffset set_mxoff = 0;
 static TimeLineID minXlogTli = 0;
 static XLogSegNo minXlogSegNo = 0;
 static int	WalSegSz;
@@ -246,6 +248,7 @@ main(int argc, char *argv[])
 					pg_log_error_hint("Try \"%s --help\" for more information.", progname);
 					exit(1);
 				}
+				mxid_given = true;
 
 				set_oldestmxid = strtoul(endptr + 1, &endptr2, 0);
 				if (endptr2 == endptr + 1 || *endptr2 != '\0' || errno != 0)
@@ -254,8 +257,6 @@ main(int argc, char *argv[])
 					pg_log_error_hint("Try \"%s --help\" for more information.", progname);
 					exit(1);
 				}
-				if (set_mxid == 0)
-					pg_fatal("multitransaction ID (-m) must not be 0");
 
 				/*
 				 * XXX It'd be nice to have more sanity checks here, e.g. so
@@ -274,8 +275,7 @@ main(int argc, char *argv[])
 					pg_log_error_hint("Try \"%s --help\" for more information.", progname);
 					exit(1);
 				}
-				if (set_mxoff == -1)
-					pg_fatal("multitransaction offset (-O) must not be -1");
+				mxoff_given = true;
 				break;
 
 			case 'l':
@@ -454,7 +454,7 @@ main(int argc, char *argv[])
 	if (set_oid != 0)
 		ControlFile.checkPointCopy.nextOid = set_oid;
 
-	if (set_mxid != 0)
+	if (mxid_given)
 	{
 		ControlFile.checkPointCopy.nextMulti = set_mxid;
 
@@ -464,7 +464,7 @@ main(int argc, char *argv[])
 		ControlFile.checkPointCopy.oldestMultiDB = InvalidOid;
 	}
 
-	if (set_mxoff != -1)
+	if (mxoff_given)
 		ControlFile.checkPointCopy.nextMultiOffset = set_mxoff;
 
 	if (minXlogTli > ControlFile.checkPointCopy.ThisTimeLineID)
@@ -805,7 +805,7 @@ PrintNewControlValues(void)
 				 newXlogSegNo, WalSegSz);
 	printf(_("First log segment after reset:        %s\n"), fname);
 
-	if (set_mxid != 0)
+	if (mxid_given)
 	{
 		printf(_("NextMultiXactId:                      %u\n"),
 			   ControlFile.checkPointCopy.nextMulti);
@@ -815,7 +815,7 @@ PrintNewControlValues(void)
 			   ControlFile.checkPointCopy.oldestMultiDB);
 	}
 
-	if (set_mxoff != -1)
+	if (mxoff_given)
 	{
 		printf(_("NextMultiOffset:                      %u\n"),
 			   ControlFile.checkPointCopy.nextMultiOffset);
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 490e98fa26f..2a317e59108 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -816,6 +816,15 @@ copy_xact_xlog_xid(void)
 	if (old_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
 		new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
 	{
+		uint32	old_chkpnt_nxtmulti = old_cluster.controldata.chkpnt_nxtmulti;
+
+		/*
+		 * Beware of the possibility that chkpnt_nxtmulti is in the
+		 * wrapped-around state in old cluster.
+		 */
+		if (old_chkpnt_nxtmulti == 0)
+			old_chkpnt_nxtmulti = 1;	/* FirstMultiXactId */
+
 		copy_subdir_files("pg_multixact/offsets", "pg_multixact/offsets");
 		copy_subdir_files("pg_multixact/members", "pg_multixact/members");
 
@@ -829,7 +838,7 @@ copy_xact_xlog_xid(void)
 				  "\"%s/pg_resetwal\" -O %u -m %u,%u \"%s\"",
 				  new_cluster.bindir,
 				  old_cluster.controldata.chkpnt_nxtmxoff,
-				  old_cluster.controldata.chkpnt_nxtmulti,
+				  old_chkpnt_nxtmulti,
 				  old_cluster.controldata.chkpnt_oldstMulti,
 				  new_cluster.pgdata);
 		check_ok();
-- 
2.47.3

Reply via email to