On Tue, Apr 16, 2024 at 12:06 PM Stefan Fercot <stefan.fer...@protonmail.com> wrote:
> Sure, I can see your point here and how people could be tempted to throw
> away that backup_manifest if they don't know how important it is to keep it.
> Probably in this case we'd need the list to be inside the tar, just like
> backup_label and tablespace_map then.
Yeah, I think anywhere inside the tar is better than anywhere outside the tar, by a mile. I'm happy to leave the specific question of where inside the tar as something TBD at time of implementation by fiat of the person doing the work. But that said ...

> Do you mean 1 stub-list per pgdata + 1 per tablespaces?
>
> I don't really see how it would be faster to recursively go through each
> sub-directories of the pgdata and tablespaces to gather all the pieces
> together compared to reading 1 main file.
> But I guess, choosing one option or the other, we will only find out how well
> it works once people will use it on the field and possibly give some feedback.

The reason I was suggesting one stub-list per directory is that we recurse over the directory tree. We reach each directory in turn, process it, and then move on to the next one. What I imagine we want to do is first iterate over all of the files actually present in a directory, and then iterate over the list of stubs for that directory, doing whatever we would have done if a stub file had been present for each of them. So we don't really want a list of every stub in the whole backup, or even every stub in the whole tablespace. What we want is to be able to easily get a list of stubs for a single directory, which is very easily done if each directory contains its own stub-list file.

If we instead have a centralized stub-list for the whole tablespace, or the whole backup, it's still quite possible to make it work. We just read that centralized stub-list and build an in-memory data structure indexed by containing directory, like a hash table where the key is the directory name and the value is a list of filenames within that directory.
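To illustrate what I mean by that, here's a rough Python sketch (not the eventual C implementation, obviously, and the paths are invented) of turning a centralized stub-list into a per-directory index:

```python
import os
from collections import defaultdict

def index_stub_list(stub_paths):
    # Build a mapping from containing directory to the list of stub
    # filenames within that directory, so that the code processing a
    # single directory can look up just its own stubs.
    by_dir = defaultdict(list)
    for path in stub_paths:
        dirname, filename = os.path.split(path)
        by_dir[dirname].append(filename)
    return by_dir

# Hypothetical centralized stub-list, paths relative to the backup root.
stubs = ["base/1/1259", "base/1/2619", "base/16384/16385"]
index = index_stub_list(stubs)
print(index["base/1"])      # stubs for one directory, looked up directly
```

The point being that this whole index has to be built up front and carried around, whereas with one stub-list per directory you'd just read the relevant list when you reach that directory.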
But a slight disadvantage of this model is that you have to keep that whole data structure in memory for the whole time you're reconstructing, and you have to pass around a pointer to it everywhere so that the code that handles individual directories can access it. I'm sure this isn't the end of the world. It's probably unlikely that someone has so many stub files that the memory used for such a data structure is painfully high, and even if they did, it's unlikely that they are spread out across multiple databases and/or tablespaces in such a way that only needing the data for one directory at a time would save you. But it's not impossible that such a scenario could exist.

Somebody might say: well, don't go directory by directory, just handle all of the stubs at the end. But I don't think that really fixes anything. I want to be able to verify that none of the stubs listed in the stub-list are also present in the backup as real files, for sanity-checking purposes. It's quite easy to see how to do that in the design I proposed above: keep a list of the files for each directory as you read it, and then, when you read the stub-list for that directory, check those lists against each other for duplicates. Doing this on the level of a whole tablespace or the whole backup is clearly also possible, but once again it potentially uses more memory, and there's no functional gain.

Plus, this kind of approach would make the reconstruction process "jump around" more. It might pull a bunch of mostly-unchanged files from the full backup while handling the non-stub files, and then come back to that directory a second time, much later, when it's processing the stub-list. Perhaps that would lead to a less optimal I/O pattern, or perhaps it would make it harder for the user to understand how much progress reconstruction had made. Or perhaps it would make no difference at all; I don't know. Maybe there's even some advantage in a two-pass approach like this. I don't see one.
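That per-directory sanity check I mentioned is cheap and simple; something like this (again just a Python sketch with made-up file names, not real code):

```python
def check_directory(real_files, stub_files):
    # A name must never appear both as a real file in the directory and
    # in that directory's stub-list; if it does, the backup is corrupt.
    duplicates = set(real_files) & set(stub_files)
    if duplicates:
        raise ValueError(
            "present both as file and stub: %s" % sorted(duplicates))

# Fine: the stub names and the real file names don't overlap.
check_directory(["16385", "16388"], ["16390"])

# Error: "16385" is both a real file and listed as a stub.
try:
    check_directory(["16385", "16388"], ["16385"])
except ValueError as e:
    print(e)
```

With one stub-list per directory, both inputs are naturally in hand at the moment you process that directory, which is the whole appeal.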
But it might prove otherwise on closer examination.

--
Robert Haas
EDB: http://www.enterprisedb.com