Re: Temporary tables prevent autovacuum, leading to XID wraparound

Masahiko Sawada Thu, 08 Feb 2018 23:36:07 -0800

On Sat, Feb 3, 2018 at 1:48 AM, Robert Haas <[email protected]> wrote:
> On Fri, Feb 2, 2018 at 1:27 AM, Masahiko Sawada <[email protected]> wrote:
>> Thank you for suggestion. It sounds more smarter. So it would be more
>> better if we vacuums database for anti-wraparound in ascending order
>> of relfrozenxid?
>
> Currently, we're doing it based on datfrozenxid.  I was looking for a
> small, relatively safe change to improve things, which led to my
> proposal: continue using datfrozenxid when the database isn't being
> vacuumed, but when it is, substitute the datfrozenxid that we expect
> the database *will have* after the tables currently being vacuumed are
> finished for the actual datfrozenxid.
>
> On further reflection, though, I see problems.  Suppose db1 has
> age(datfrozenxid) = 600m and two tables with age(relfrozenxid) of 600m
> and 400m.  db2 has age(datfrozenxid) = 500m.  Then suppose #1 starts
> vacuuming the table with age 600m.  The AV launcher sees that, with
> that table out of the way, we'll be able to bring datfrozenxid forward
> to 400m and so sends worker #2 to db2.  But actually, when the worker
> in db1 finishes vacuuming the table with age 600m, it's going to have
> to vacuum the table with age 400m before performing any relfrozenxid
> update.  So actually, the table with age 400m is holding the
> relfrozenxid at 600m just as much as if it too had a relfrozenxid of
> 600m.  In fact, any tables in the same database that need routine
> vacuuming are a problem, too: we're going to vacuum all of those
> before updating relfrozenxid, too.
>
> Maybe we should start by making the scheduling algorithm used by the
> individual workers smarter:
>
> 1. Let's teach do_autovacuum() that, when there are any tables needing
> vacuum for wraparound, either for relfrozenxid or relminmxid, it
> should vacuum only those and forget about the rest.  This seems like
> an obvious optimization to prevent us from delaying
> datfrozenxid/datminmxid updates for the sake of vacuuming tables that
> are "merely" bloated.
>
> 2. Let's have do_autovacuum() sort the tables to be vacuumed in
> ascending order by relfrozenxid, so older tables are vacuumed first.
> A zero-order idea is to put tables needing relfrozenxid vacuuming
> before tables needing relminmxid vacuuming, and sort the latter by
> ascending relminmxid, but maybe there's something smarter possible
> there.  The idea here is to vacuum tables in order of priority rather
> than in whatever order they happen to be physically mentioned in
> pg_class.
>
> 3. Let's have do_autovacuum() make an extra call to
> vac_update_datfrozenxid() whenever the next table to be vacuumed is at
> least 10 million XIDs (or MXIDs) newer than the first one it vacuumed
> either since the last call to vac_update_datfrozenxid() or, if none,
> since it started.  That way, we don't have to wait for every single
> table to get vacuumed before we can consider advancing
> relfrozenxid/relminmxid.
>
> 4. When it's performing vacuuming for wraparound, let's have AV
> workers advertise in shared memory the oldest relfrozenxid and
> relminmxid that it might exist in the database.  Given #1 and #2, this
> is pretty easy, since we start by moving through tables in increasing
> relfrozenxid order and then shift to moving through them in increasing
> relminmxid order.  When we're working through the relfrozenxid tables,
> the oldest relminmxid doesn't move, and the oldest relfrozenxid is
> that of the next table in the list.  When we're working through the
> relminmxid tables, it's the reverse.  We need a little cleverness to
> figure out what value to advertise when we're on the last table in
> each list -- it should be the next-higher value, even though that will
> be above the relevant threshold, not a sentinel value.


So I think we should includes tables as well that are not at risk of
wraparound in order to get the next-higher value (that is, the oldest
table of the non-risked tables) instead of forgetting them. And then
we skip vacuuming them.

>
> 5. With those steps in place, I think we can now adopt my previous
> idea to have the AV launcher use any advertised relfrozenxid (and, as
> I now realize, relminmxid) instead of the ones found in pg_database,
> because now we know that individual workers are definitely focused on
> getting relfrozenxid (and/or relminmxid) as soon as possible, and
> vacuuming unrelated tables won't help them do it any faster.
>
> This gets us fairly close to vacuuming tables in decreasing order of
> wraparound danger across the entire cluster.  It's not perfect.  It
> prefers to keep vacuuming tables in the same database rather than
> having a worker exit and maybe launching a new one in a different
> database -- but the alternative is not very appealing.  If we didn't
> do it that way, and if we had a million tables with XIDs that were
> closely spaced spread across different databases, we'd have to
> terminate and relaunching workers at a very high rate to get
> everything sorted out, which would be inefficient and annoying to
> program.  Also, it keeps the existing hard prioritization of
> relfrozenxid over relminmxid, which could theoretically be wrong for
> some installation.  But I think that might not be a big problem in
> practice, and it seems like that could be separately improved at
> another time.
>

I think this algorithm works fine and improves the current behavior.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: Temporary tables prevent autovacuum, leading to XID wraparound

Reply via email to