On Thu, Oct 29, 2015 at 08:46:52AM +0100, Andres Freund wrote: > On October 29, 2015 7:59:03 AM GMT+01:00, Noah Misch <n...@leadboat.com> > wrote: > >That helps; thanks. Your design seems good. I've located only insipid > >defects. > > Great! > > > I propose to save some time by writing a patch series > >eliminating > >them, which you could hopefully review. Does that sound good? > > Yes, it does.
I have pushed a stack of branches to https://github.com/nmisch/postgresql.git: mxt0-revert - reverts commits 4f627f8 and aa29c1c mxt1-disk-independent - see below mxt2-cosmetic - update already-wrong comments and formatting mxt3-main - replaces commit 4f627f8 mxt4-rm-legacy - replaces commit aa29c1c The plan is to squash each branch into one PostgreSQL commit. In addition to examining overall "git diff mxt2-cosmetic mxt3-main", I recommend reviewing itemized changes and commit log entries in "git log -p --reverse --no-merges mxt2-cosmetic..mxt3-main". In particular, when a change involves something you discussed upthread or was otherwise not obvious, I put a statement of rationale in the commit log. > + * Make sure to only attempt truncation if there's values to truncate > + * away. In normal processing values shouldn't go backwards, but there's > + * some corner cases (due to bugs) where that's possible. Which bugs are those? I would like to include more detail if available. If anything here requires careful study, it's the small mxt1-disk-independent change, which I have also attached in patch form. That patch removes the SlruScanDirCbFindEarliest() test from TruncateMultiXact(), which in turn makes multixact control decisions independent of whether TruncateMultiXact() is successful at unlinking segments. Today, one undeletable segment file can cause TruncateMultiXact() to give up on truncation completely for a span of hundreds of millions of MultiXactId. Patched multixact.c will, like CLOG, make its decisions strictly based on the content of files it expects to exist. It will no longer depend on the absence of files it hopes will not exist. To aid in explaining the change's effects, I will define some terms. A "logical wrap" occurs when no range of 2^31 integers covers the set of MultiXactId stored in xmax fields. A "file-level wrap" occurs when there exists a pair of pg_multixact/offsets segment files such that: | segno_a * SLRU_PAGES_PER_SEGMENT * MULTIXACT_OFFSETS_PER_PAGE - segno_b * SLRU_PAGES_PER_SEGMENT * MULTIXACT_OFFSETS_PER_PAGE | > 2^31 A logical wrap implies either a file-level wrap or missing visibility metadata, but a file-level wrap does not imply other consequences. The SlruScanDirCbFindEarliest() test is usually redundant with find_multixact_start(), because MultiXactIdPrecedes(oldestMXact, earliest) almost implies that find_multixact_start() will fail. The exception arises when pg_multixact/offsets files compose a file-level wrap, which can happen when TruncateMultiXact() fails to unlink segments as planned. When it does happen, the result of SlruScanDirCbFindEarliest(), and therefore the computed "earliest" value, is undefined. (This outcome is connected to our requirement to use only half the pg_clog or pg_multixact/offsets address space at any one time. The PagePrecedes callbacks for these SLRUs cease to be transitive if more than half their address space is in use.) The SlruScanDirCbFindEarliest() test can be helpful when a file-level wrap coexists with incorrect oldestMultiXactId (extant xmax values define "correct oldestMultiXactId"). If we're lucky with readdir() order, the test will block truncation so we don't delete still-needed segments. I am content to lose that, because (a) the code isn't reliable for (or even directed toward) that purpose and (b) sites running on today's latest point releases no longer have incorrect oldestMultiXactId.
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c index cb12fc3..cb4a0cd 100644 --- a/src/backend/access/transam/multixact.c +++ b/src/backend/access/transam/multixact.c @@ -2945,29 +2945,6 @@ SlruScanDirCbRemoveMembers(SlruCtl ctl, char *filename, int segpage, return false; /* keep going */ } -typedef struct mxtruncinfo -{ - int earliestExistingPage; -} mxtruncinfo; - -/* - * SlruScanDirectory callback - * This callback determines the earliest existing page number. - */ -static bool -SlruScanDirCbFindEarliest(SlruCtl ctl, char *filename, int segpage, void *data) -{ - mxtruncinfo *trunc = (mxtruncinfo *) data; - - if (trunc->earliestExistingPage == -1 || - ctl->PagePrecedes(segpage, trunc->earliestExistingPage)) - { - trunc->earliestExistingPage = segpage; - } - - return false; /* keep going */ -} - /* * Remove all MultiXactOffset and MultiXactMember segments before the oldest * ones still of interest. @@ -2986,8 +2963,6 @@ TruncateMultiXact(void) MultiXactOffset oldestOffset; MultiXactId nextMXact; MultiXactOffset nextOffset; - mxtruncinfo trunc; - MultiXactId earliest; MembersLiveRange range; Assert(AmCheckpointerProcess() || AmStartupProcess() || @@ -3001,49 +2976,20 @@ TruncateMultiXact(void) Assert(MultiXactIdIsValid(oldestMXact)); /* - * Note we can't just plow ahead with the truncation; it's possible that - * there are no segments to truncate, which is a problem because we are - * going to attempt to read the offsets page to determine where to - * truncate the members SLRU. So we first scan the directory to determine - * the earliest offsets page number that we can read without error. - */ - trunc.earliestExistingPage = -1; - SlruScanDirectory(MultiXactOffsetCtl, SlruScanDirCbFindEarliest, &trunc); - earliest = trunc.earliestExistingPage * MULTIXACT_OFFSETS_PER_PAGE; - if (earliest < FirstMultiXactId) - earliest = FirstMultiXactId; - - /* - * If there's nothing to remove, we can bail out early. - * - * Due to bugs in early releases of PostgreSQL 9.3.X and 9.4.X, - * oldestMXact might point to a multixact that does not exist. - * Autovacuum will eventually advance it to a value that does exist, - * and we want to set a proper offsetStopLimit when that happens, - * so call DetermineSafeOldestOffset here even if we're not actually - * truncating. - */ - if (MultiXactIdPrecedes(oldestMXact, earliest)) - { - DetermineSafeOldestOffset(oldestMXact); - return; - } - - /* * First, compute the safe truncation point for MultiXactMember. This is * the starting offset of the oldest multixact. * - * Hopefully, find_multixact_start will always work here, because we've - * already checked that it doesn't precede the earliest MultiXact on - * disk. But if it fails, don't truncate anything, and log a message. + * Due to bugs in early releases of PostgreSQL 9.3.X and 9.4.X, + * oldestMXact might point to a multixact that does not exist. Call + * DetermineSafeOldestOffset() to emit the message about disabled member + * wraparound protection. Autovacuum will eventually advance oldestMXact + * to a value that does exist. */ if (oldestMXact == nextMXact) oldestOffset = nextOffset; /* there are NO MultiXacts */ else if (!find_multixact_start(oldestMXact, &oldestOffset)) { - ereport(LOG, - (errmsg("oldest MultiXact %u not found, earliest MultiXact %u, skipping truncation", - oldestMXact, earliest))); + DetermineSafeOldestOffset(oldestMXact); return; }
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers