Hi Amul,
     thanks for working on this.

+ file_name_len = strlen(relpath);
> + if (file_name_len < file_extn_len ||
> + strcmp(relpath + file_name_len - file_extn_len, file_extn) != 0)
> + {
> + if (compress_algorithm == PG_COMPRESSION_NONE)
> + report_backup_error(context,
> + "\"%s\" is not a valid file, expecting tar file",
> + relpath);
> + else
> + report_backup_error(context,
> + "\"%s\" is not a valid file, expecting \"%s\" compressed tar file",
> + relpath,
> + get_compress_algorithm_name(compress_algorithm));
> + return;
> + }
>

I believe pg_verifybackup needs to exit after reporting a failure here
since it could not figure out a streamer to allocate.

Also, v1-0002 removes #include "pqexpbuffer.h" from astreamer.h and adds it
to the new .h file and in v1-0004 it
reverts the change. So this can be avoided altogether.

On Tue, Jul 9, 2024 at 3:24 PM Amul Sul <sula...@gmail.com> wrote:

> Hi,
>
> Currently, pg_verifybackup only works with plain (directory) format
> backups.
> This proposal aims to support tar-format backups too.  We will read the tar
> files from start to finish and verify each file inside against the
> backup_manifest information, similar to how it verifies plain files.
>
> We are introducing new options to pg_verifybackup: -F, --format=p|t and -Z,
> --compress=METHOD, which allow users to specify the backup format and
> compression type, similar to the options available in pg_basebackup. If
> these
> options are not provided, the backup format and compression type will be
> automatically detected.  To determine the format, we will search for
> PG_VERSION
> file in the backup directory — if found, it indicates a plain backup;
> otherwise, it
> is a tar-format backup.  For the compression type, we will check the
> extension
> of base.tar.xxx file of tar-format backup. Refer to patch 0008 for the
> details.
>
> The main challenge is to structure the code neatly.  For plain-format
> backups,
> we verify bytes directly from the files.  For tar-format backups, we read
> bytes
> from the tar file of the specific file we care about.  We need an
> abstraction
> to handle both formats smoothly, without using many if statements or
> special
> cases.
>
> To achieve this goal, we need to reuse existing infrastructure without
> duplicating code, and for that, the major work involved here is the code
> refactoring. Here is a breakdown of the work:
>
> 1. BBSTREAMER Rename and Relocate:
> BBSTREAMER, currently used in pg_basebackup for reading and decompressing
> TAR
> files; can also be used for pg_verifybackup. In the future, it could
> support
> other tools like pg_combinebackup for merging TAR backups without
> extraction,
> and pg_waldump for verifying WAL files from the tar backup.  For that
> accessibility,
> BBSTREAMER needs to be relocated to a shared directory.
>
> Moreover, renaming BBSTREAMER to ASTREAMER (short for Archive Streamer)
> would
> better indicate its general application across multiple tools. Moving it to
> src/fe_utils directory is appropriate, given its frontend infrastructure
> use.
>
> 2. pg_verifybackup Code Refactoring:
> The existing code for plain backup verification will be split into separate
> files or functions, so it can also be reused for tar backup verification.
>
> 3. Adding TAR Backup Verification:
> Finally, patches will be added to implement TAR backup verification, along
> with
> tests and documentation.
>
> Patches 0001-0003 focus on renaming and relocating BBSTREAMER, patches
> 0004-0007 on splitting the existing verification code, and patches
> 0008-0010 on
> adding TAR backup verification capabilities, tests, and documentation. The
> last
> set could be a single patch but is split to make the review easier.
>
> Please take a look at the attached patches and share your comments,
> suggestions, or any ways to enhance them. Your feedback is greatly
> appreciated.
>
> Thank you !
>
> --
> Regards,
> Amul Sul
> EDB: http://www.enterprisedb.com
>


-- 
Thanks & Regards,
Sravan Velagandula
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Reply via email to