On Mon, Jul 22, 2024 at 7:53 AM Amul Sul <sula...@gmail.com> wrote:
> Fix in the attached version.
First of all, in the interest of full disclosure, I suggested this project to Amul, so I'm +1 on the concept. I think making more of our backup-related tools work with tar and compressed tar formats -- and perhaps eventually with data not stored locally -- will make them a lot more usable. If, for example, you take a full backup and an incremental backup, each in tar format, store them in the cloud someplace, and then want to verify and afterwards restore the incremental backup, you would need to download the tar files from the cloud, extract all of them, and then run pg_verifybackup and pg_combinebackup over the results. With this patch set, and similar work for pg_combinebackup, you could skip the step where you extract the tar files, saving significant amounts of time and disk space. If the tools also had the ability to access data remotely, you could save even more, but that's a much harder project, so it makes sense to me to start with this.

Second, I think this patch set is quite well-organized and easy to read. That's not to say there is nothing in these patches to which someone might object, but it seems to me that it should at least be simple for anyone who wants to review to find the things to which they object in the patch set without having to spend too much time on it, which is fantastic.

Third, I think the general approach that these patches take to the problem -- namely, renaming bbstreamer to astreamer and moving it somewhere that permits it to be reused -- makes a lot of sense. To be honest, I don't think I had it in mind that bbstreamer would be a reusable component when I wrote it, or if I did have it in mind, it was off in some dusty corner of my mind that doesn't get visited very often. I was imagining that you would need to build new infrastructure to deal with reading the tar file, but I think what you've done here is better.
Reusing the bbstreamer stuff gives you tar file parsing, and decompression if necessary, basically for free, and IMHO the result looks rather elegant.

However, I'm not very convinced by 0003. The handling between the meson- and make-based build systems doesn't seem consistent. On the meson side, you just add the objects to the same list that contains all of the other files (though not in alphabetical order, which should be fixed). But on the make side, you for some reason invent a separate AOBJS list instead of just adding the files to OBJS. I don't think it should be necessary to treat these objects any differently from any other objects, so they should be able to just go in OBJS; but if it were necessary, then I feel like the meson side would need something similar.

Also, I'm not so sure about this change to src/fe_utils/meson.build:

-  dependencies: frontend_common_code,
+  dependencies: [frontend_common_code, lz4, zlib, zstd],

frontend_common_code already includes dependencies on zlib and zstd, so we probably don't need to add those again here. I checked the result of "otool -L src/bin/pg_controldata/pg_controldata" from the meson build directory, and I find that it currently links against libz and libzstd but not liblz4. However, if I either make this line say "dependencies: [frontend_common_code, lz4]" or just update frontend_common_code to include lz4, then it starts linking against liblz4 as well. I'm not entirely sure whether there's any reason to prefer one of those over the other, but I'd be inclined to make frontend_common_code just include lz4, since it already includes zlib and zstd anyway; then you don't need this change at all.

Alternatively, we could take the position that random front-end tools that don't do anything with compression, like pg_controldata, shouldn't link against compression libraries at all. But then we'd need to rethink a bunch of things that don't have much to do with this patch.
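If we went the frontend_common_code route, the change might look roughly like this (a sketch only -- the actual declare_dependency() arguments in the tree are assumed here, not copied):

```meson
# Hypothetical sketch: add lz4 alongside the zlib and zstd dependencies
# that frontend_common_code already carries, so fe_utils (and everything
# else built on frontend_common_code) picks it up with no local change.
frontend_common_code = declare_dependency(
  compile_args: ['-DFRONTEND'],
  include_directories: include_directories('../include'),
  dependencies: [zlib, zstd, lz4],
)
```

With that in place, src/fe_utils/meson.build could keep "dependencies: frontend_common_code" unchanged.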
Regarding 0004, I would rather not move show_progress and skip_checksums to the new header file. I suppose I was a bit lazy in making these file-level global variables instead of passing them down using function arguments and/or a context object, but at least right now they're only global within a single file. Can we insert a preparatory patch that moves these into verifier_context instead?

Regarding 0005, the comment /* Check whether there's an entry in the manifest hash. */ should move inside verify_manifest_entry(), where manifest_files_lookup() is called. The call to the new function verify_manifest_entry() needs its own, proper comment. Also, I think there's a null-pointer dereference hazard here, because verify_manifest_entry() can return NULL but the "Validate the manifest system identifier" chunk assumes it isn't. I think you could hit this -- and presumably seg fault -- if pg_control is on disk but not in the manifest. Seems like just adding an m != NULL test is easiest, but see also the comments about 0006 below.

Regarding 0006, suppose that the member file within the tar archive is longer than expected. With the unpatched code, we'll feed all of the data to the checksum context, but then, after the read loop terminates, we'll complain about the file being the wrong length. With the patched code, we'll complain about the checksum mismatch before returning from verify_content_checksum(). I think that's an unintended behavior change, and the new behavior is worse than the old. But also, I think that in the case of a tar file, the desired behavior is quite different. There, we know the length of the file from the member header, so we can check whether the length is as expected before we read any of the data bytes. If we discover that the size is wrong, we can complain about that and need not feed the bytes to the checksum context at all -- we can just skip them, which will be faster.
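Something like the following sketch, say -- the structs here are hypothetical stand-ins for the real tar-member header and manifest entry, not the patch's actual types:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the tar member header and manifest entry. */
typedef struct
{
    uint64_t    size;           /* member length from the tar header */
} tar_member;

typedef struct
{
    const char *pathname;
    uint64_t    size;           /* expected length from the manifest */
} manifest_file;

/*
 * With a tar archive, the member length is known from the header before
 * any content bytes arrive, so a size mismatch can be reported up front
 * and the content bytes skipped instead of being fed to the checksum
 * context.
 */
static bool
member_size_matches(const tar_member *member, const manifest_file *m)
{
    if (member->size != m->size)
    {
        fprintf(stderr,
                "\"%s\" has size %llu in the archive but size %llu in the manifest\n",
                m->pathname,
                (unsigned long long) member->size,
                (unsigned long long) m->size);
        return false;           /* caller skips checksumming entirely */
    }
    return true;
}
```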
That up-front check doesn't really make sense for a plain file -- even if we were to stat() the file before reading it, the length could theoretically change under us if someone were concurrently modifying it -- but for a tar file I think it does.

I would suggest abandoning this refactoring. There's very little logic in verify_file_checksum() that you can actually reuse. I think you should just duplicate the code. If you really want, you could arrange to reuse the error-reporting code that checks for checksumlen != m->checksum_length and memcmp(checksumbuf, m->checksum_payload, checksumlen) != 0, but even that is little enough that it's fine to just duplicate it. The rest is either (1) OS calls like open(), read(), etc., which won't be applicable to the read-from-archive case, or (2) calls to pg_checksum_WHATEVER, which are fine to just duplicate, IMHO.

My eyes are starting to glaze over a bit here, so expect comments below this point to be only a partial review of the corresponding patch.

Regarding 0007, I think the should_verify_sysid terminology is problematic. I made all the code and identifier names talk only about the control file, not the specific thing in the control file that we're verifying, in case we want to verify additional things in the future. This breaks that abstraction.

Regarding 0009, I feel like astreamer_verify_content() might want to grow some subroutines. One idea would be to move the ASTREAMER_MEMBER_HEADER and ASTREAMER_MEMBER_CONTENTS cases into a new function each; another would be to move smaller chunks of logic, e.g. under the ASTREAMER_MEMBER_CONTENTS case, the verify_checksums stuff could be one subroutine and the ill-named verify_sysid stuff could be another.
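For illustration, the per-state split might look something like this (a sketch with stubbed-out types and hypothetical names, not the patch's actual code):

```c
#include <stddef.h>

/* Hypothetical stubs standing in for the real astreamer machinery. */
typedef struct astreamer_member { int dummy; } astreamer_member;
typedef struct astreamer_verify
{
    int header_calls;           /* counters just to make the stub testable */
    int content_calls;
} astreamer_verify;
typedef enum
{
    ASTREAMER_MEMBER_HEADER,
    ASTREAMER_MEMBER_CONTENTS
} astreamer_archive_context;

/* One subroutine per member state keeps the dispatch switch shallow. */
static void
member_verify_header(astreamer_verify *v, astreamer_member *member)
{
    (void) member;
    v->header_calls++;          /* manifest lookup, decide what to verify */
}

static void
member_verify_contents(astreamer_verify *v, const char *data, int len)
{
    (void) data;
    (void) len;
    v->content_calls++;         /* checksum chunk, pg_control chunk */
}

static void
astreamer_verify_content(astreamer_verify *v, astreamer_member *member,
                         const char *data, int len,
                         astreamer_archive_context context)
{
    switch (context)
    {
        case ASTREAMER_MEMBER_HEADER:
            member_verify_header(v, member);
            break;
        case ASTREAMER_MEMBER_CONTENTS:
            member_verify_contents(v, data, len);
            break;
    }
}
```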
I'm not certain exactly what's best here, but some of this code is nested as deeply as six levels, which is not so terrible that nobody should ever do it, but it is bad enough that we should at least look around for a better way.

--
Robert Haas
EDB: http://www.enterprisedb.com