On Tue, Oct 24, 2023 at 12:08 PM Robert Haas <robertmh...@gmail.com> wrote:
> Note that whether to remove summaries is a separate question from
> whether to generate them in the first place. Right now, I have
> wal_summarize_mb controlling whether they get generated in the first
> place, but as I noted in another recent email, that isn't an entirely
> satisfying solution.
I did some more research on this. My conclusion is that I should remove wal_summarize_mb and just have a GUC summarize_wal = on|off that controls whether the summarizer runs at all. There will be one summary file per checkpoint, no matter how far apart checkpoints are or how large the summary gets. Below I'll explain the reasoning; let me know if you disagree.

What I describe above would be a bad plan if it were realistically possible for a summary file to get so large that it might run the machine out of memory, either when producing it or when trying to make use of it for an incremental backup. That turns out to be a difficult scenario to create. So far, I haven't been able to generate WAL summary files more than a few tens of megabytes in size, even when summarizing 50+ GB of WAL per summary file.

One reason why it's hard to produce large summary files is that, for a single relation fork, the WAL summary size converges to 1 bit per modified block when the number of modified blocks is large. This means that even if you have a terabyte-sized relation, you're looking at no more than perhaps 20MB of summary data, no matter how much of it gets modified. Now, somebody could have a 30TB relation, and if they modify the whole thing they could end up with the better part of a gigabyte of summary data for that relation, but if you've got a 30TB table you probably have enough memory that that's no big deal. (Rough arithmetic for this case, and for the pgbench case below, is sketched at the end of this email.)

But what if you have multiple relations? I initialized pgbench with a scale factor of 30000 and also with 30000 partitions and did a 1-hour run. I got 4 checkpoints during that time, and each one produced an approximately 16MB summary file. The efficiency here drops considerably. For example, one of the files is 16495398 bytes and records information on 7498403 modified blocks, which works out to about 2.2 bytes per modified block. That's more than an order of magnitude worse than what I got in the single-relation case, where the summary file didn't even use two *bits* per modified block. But here again, the file just isn't that big in absolute terms. To get a 1GB+ WAL summary file, you'd need to modify millions of relation forks, maybe tens of millions, and most installations aren't even going to have that many relation forks, let alone be modifying them all frequently.

My conclusion here is that it's pretty hard to have a database where WAL summarization is going to use too much memory. I wouldn't be terribly surprised if there are some extreme cases where it happens, but those databases probably aren't great candidates for incremental backup anyway. They're probably databases with millions of relations and frequent, widely-scattered modifications to those relations. And if you have that kind of high turnover rate, then incremental backup isn't going to be as helpful anyway, so there's probably no reason to enable WAL summarization in the first place. Maybe if you have that plus, in the same database cluster, 100TB of completely static data that is never modified, and you also do all of this on a pretty small machine, then you can find a case where incremental backup would have worked well but for the memory consumed by WAL summarization. But I think that's sufficiently niche that the current patch shouldn't concern itself with such cases. If we find that they're common enough to worry about, we might eventually want to do something to mitigate them, but whether that thing would look anything like wal_summarize_mb seems pretty unclear.
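To make the single-relation numbers above easier to check, here's a minimal back-of-the-envelope sketch (Python, just for the arithmetic). It assumes the default 8kB block size and takes the 1-bit-per-modified-block convergence as a given; real summary files will have some additional overhead, so treat the results as rough lower bounds.

    # Back-of-the-envelope estimate of WAL summary size for a single
    # relation fork, assuming the summary converges to about 1 bit per
    # modified block and the default 8kB block size.
    BLOCK_SIZE = 8192  # bytes per block (default BLCKSZ)

    def summary_size_mb(relation_size_bytes):
        modified_blocks = relation_size_bytes // BLOCK_SIZE
        summary_bytes = modified_blocks / 8  # 1 bit per modified block
        return summary_bytes / (1024 * 1024)

    TB = 1024 ** 4
    print(summary_size_mb(1 * TB))   # ~16 MB for a fully-modified 1TB relation
    print(summary_size_mb(30 * TB))  # ~480 MB for a fully-modified 30TB relation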
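And the same sort of arithmetic for the multi-relation pgbench case, using the byte and block counts quoted above. Nothing here beyond simple division, but it shows where the 2.2-bytes-per-block figure comes from.

    # Observed per-block overhead in the 30000-partition pgbench test,
    # using the byte and block counts quoted above.
    summary_bytes = 16495398    # size of one WAL summary file
    modified_blocks = 7498403   # modified blocks recorded in that file

    bytes_per_block = summary_bytes / modified_blocks
    print(round(bytes_per_block, 2))      # ~2.2 bytes per modified block
    print(round(bytes_per_block * 8, 1))  # ~17.6 bits, vs. not even 2 bits
                                          # in the single-relation case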
So I conclude that it's a mistake to include that GUC as currently designed and propose to replace it with a Boolean as described above. Comments?

-- 
Robert Haas
EDB: http://www.enterprisedb.com