Hi Jeevan, I have incorporated all the comments in the attached patch. Please review and let me know your thoughts.
On Thu, Nov 21, 2019 at 2:51 PM Jeevan Chalke < jeevan.cha...@enterprisedb.com> wrote: > > > On Wed, Nov 20, 2019 at 11:05 AM Suraj Kharage < > suraj.khar...@enterprisedb.com> wrote: > >> Hi, >> >> Since now we are generating the backup manifest file with each backup, it >> provides us an option to validate the given backup. >> Let's say, we have taken a backup and after a few days, we want to check >> whether that backup is validated or corruption-free without restarting the >> server. >> >> Please find attached POC patch for same which will be based on the latest >> backup manifest patch from Rushabh. With this functionality, we add new >> option to pg_basebackup, something like --verify-backup. >> So, the syntax would be: >> ./bin/pg_basebackup --verify-backup -D <backup_directory_path> >> >> Basically, we read the backup_manifest file line by line from the given >> directory path and build the hash table, then scan the directory and >> compare each file with the hash entry. >> >> Thoughts/suggestions? >> > > > I like the idea of verifying the backup once we have backup_manifest with > us. > Periodically verifying the already taken backup with this simple tool > becomes > easy now. > > I have reviewed this patch and here are my comments: > > 1. > @@ -30,7 +30,9 @@ > #include "common/file_perm.h" > #include "common/file_utils.h" > #include "common/logging.h" > +#include "common/sha2.h" > #include "common/string.h" > +#include "fe_utils/simple_list.h" > #include "fe_utils/recovery_gen.h" > #include "fe_utils/string_utils.h" > #include "getopt_long.h" > @@ -38,12 +40,19 @@ > #include "pgtar.h" > #include "pgtime.h" > #include "pqexpbuffer.h" > +#include "pgrhash.h" > #include "receivelog.h" > #include "replication/basebackup.h" > #include "streamutil.h" > > Please add new files in order. > > 2. > Can hash related file names be renamed to backuphash.c and backuphash.h? > > 3. > Need indentation adjustments at various places. > > 4. > + char buf[1000000]; // 1MB chunk > > It will be good if we have multiple of block /page size (or at-least power > of 2 > number). > > 5. > +typedef struct pgrhash_entry > +{ > + struct pgrhash_entry *next; /* link to next entry in same bucket */ > + DataDirectoryFileInfo *record; > +} pgrhash_entry; > + > +struct pgrhash > +{ > + unsigned nbuckets; /* number of buckets */ > + pgrhash_entry **bucket; /* pointer to hash entries */ > +}; > + > +typedef struct pgrhash pgrhash; > > These two can be moved to .h file instead of redefining over there. > > 6. > +/* > + * TODO: this function is not necessary, can be removed. > + * Test whether the given row number is match for the supplied keys. > + */ > +static bool > +pgrhash_compare(char *bt_filename, char *filename) > > Yeah, it can be removed by doing strcmp() at the required places rather > than > doing it in a separate function. > > 7. > mdate is not compared anywhere. I understand that it can't be compared with > the file in the backup directory and its entry in the manifest as manifest > entry gives mtime from server file whereas the same file in the backup will > have different mtime. But adding a few comments there will be good. > > 8. > + char mdate[24]; > > should be mtime instead? > > > Thanks > > -- > Jeevan Chalke > Associate Database Architect & Team Lead, Product Development > EnterpriseDB Corporation > The Enterprise PostgreSQL Company > > -- -- Thanks & Regards, Suraj kharage, EnterpriseDB Corporation, The Postgres Database Company.
backup_validator_POC.patch
Description: Binary data