Hi Asif, In below scenarios backup verification failed for tablespace, when backup taken with parallel option. without parallel for the same scenario pg_verifybackup is passed without any error.
[edb@localhost bin]$ mkdir /tmp/test_bkp/tblsp1 [edb@localhost bin]$ ./psql postgres -p 5432 -c "create tablespace tblsp1 location '/tmp/test_bkp/tblsp1';" CREATE TABLESPACE [edb@localhost bin]$ ./psql postgres -p 5432 -c "create table test (a text) tablespace tblsp1;" CREATE TABLE [edb@localhost bin]$ ./psql postgres -p 5432 -c "insert into test values ('parallel_backup with -T tablespace option');" INSERT 0 1 [edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/test_bkp/bkp -T /tmp/test_bkp/tblsp1=/tmp/test_bkp/tblsp2 -j 4 [edb@localhost bin]$ ./pg_verifybackup /tmp/test_bkp/bkp pg_verifybackup: error: "pg_tblspc/16384/PG_13_202004074/13530/16390" is present on disk but not in the manifest pg_verifybackup: error: "pg_tblspc/16384/PG_13_202004074/13530/16388" is present on disk but not in the manifest pg_verifybackup: error: "pg_tblspc/16384/PG_13_202004074/13530/16385" is present on disk but not in the manifest pg_verifybackup: error: "/PG_13_202004074/13530/16388" is present in the manifest but not on disk pg_verifybackup: error: "/PG_13_202004074/13530/16390" is present in the manifest but not on disk pg_verifybackup: error: "/PG_13_202004074/13530/16385" is present in the manifest but not on disk --without parallel backup [edb@localhost bin]$ ./pg_basebackup -p 5432 -D /tmp/test_bkp/bkp1 -T /tmp/test_bkp/tblsp1=/tmp/test_bkp/tblsp3 -j 1 [edb@localhost bin]$ ./pg_verifybackup /tmp/test_bkp/bkp1 backup successfully verified Thanks & Regards, Rajkumar Raghuwanshi On Wed, Apr 15, 2020 at 2:19 PM Ahsan Hadi <ahsan.h...@gmail.com> wrote: > > > On Wed, 15 Apr 2020 at 1:49 AM, Robert Haas <robertmh...@gmail.com> wrote: > >> On Tue, Apr 14, 2020 at 10:37 AM Asif Rehman <asifr.reh...@gmail.com> >> wrote: >> > I forgot to make a check for no-manifest. Fixed. Attached is the >> updated patch. >> >> +typedef struct >> +{ >> ... >> +} BackupFile; >> + >> +typedef struct >> +{ >> ... >> +} BackupState; >> >> These structures need comments. >> >> +list_wal_files_opt_list: >> + SCONST SCONST >> { >> - $$ = makeDefElem("manifest_checksums", >> - >> (Node *)makeString($2), -1); >> + $$ = list_make2( >> + makeDefElem("start_wal_location", >> + (Node *)makeString($2), >> -1), >> + makeDefElem("end_wal_location", >> + (Node *)makeString($2), >> -1)); >> + >> } >> >> This seems like an unnecessarily complicated parse representation. The >> DefElems seem to be completely unnecessary here. >> >> @@ -998,7 +1110,37 @@ SendBaseBackup(BaseBackupCmd *cmd) >> set_ps_display(activitymsg); >> } >> >> - perform_base_backup(&opt); >> + switch (cmd->cmdtag) >> >> So the design here is that SendBaseBackup() is now going to do a bunch >> of things that are NOT sending a base backup? With no updates to the >> comments of that function and no change to the process title it sets? >> >> - return (manifest->buffile != NULL); >> + return (manifest && manifest->buffile != NULL); >> >> Heck no. It appears that you didn't even bother reading the function >> header comment. >> >> + * Send a single resultset containing XLogRecPtr record (in text format) >> + * TimelineID and backup label. >> */ >> static void >> -SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli) >> +SendXlogRecPtrResult(XLogRecPtr ptr, TimeLineID tli, >> + StringInfo label, char *backupid) >> >> This just casually breaks wire protocol compatibility, which seems >> completely unacceptable. >> >> + if (strlen(opt->tablespace) > 0) >> + sendTablespace(opt->tablespace, NULL, true, NULL, &files); >> + else >> + sendDir(".", 1, true, NIL, true, NULL, NULL, &files); >> + >> + SendFilesHeader(files); >> >> So I guess the idea here is that we buffer the entire list of files in >> memory, regardless of size, and then we send it out afterwards. That >> doesn't seem like a good idea. The list of files might be very large. >> We probably need some code refactoring here rather than just piling >> more and more different responsibilities onto sendTablespace() and >> sendDir(). >> >> + if (state->parallel_mode) >> + SpinLockAcquire(&state->lock); >> + >> + state->throttling_counter += increment; >> + >> + if (state->parallel_mode) >> + SpinLockRelease(&state->lock); >> >> I don't like this much. It seems to me that we would do better to use >> atomics here all the time, instead of conditional spinlocks. >> >> +static void >> +send_file(basebackup_options *opt, char *file, bool missing_ok) >> ... >> + if (file == NULL) >> + return; >> >> That seems totally inappropriate. >> >> + sendFile(file, file + basepathlen, &statbuf, >> true, InvalidOid, NULL, NULL); >> >> Maybe I'm misunderstanding, but this looks like it's going to write a >> tar header, even though we're not writing a tarfile. >> >> + else >> + ereport(WARNING, >> + (errmsg("skipping special file >> or directory \"%s\"", file))); >> >> So, if the user asks for a directory or symlink, what's going to >> happen is that they're going to receive an empty file, and get a >> warning. That sounds like terrible behavior. >> >> + /* >> + * Check for checksum failures. If there are failures across >> multiple >> + * processes it may not report total checksum count, but it will >> error >> + * out,terminating the backup. >> + */ >> >> In other words, the patch breaks the feature. Not that the feature in >> question works particularly well as things stand, but this makes it >> worse. >> >> I think this patch (0003) is in really bad shape. I'm having second >> thoughts about the design, but it's kind of hard to even have a >> discussion about the design when the patch is riddled with minor >> problems like inadequate comments, failure to update existing >> comments, and breaking a bunch of things. I understand that sometimes >> things get missed, but this is version 14 of a patch that's been >> kicking around since last August. > > > Fair enough. Some of this is also due to backup related features i.e > backup manifest, progress reporting that got committed to master towards > the tail end of PG-13. Rushing to get parallel backup feature compatible > with these features also caused some of the oversights. > > >> >> -- >> Robert Haas >> EnterpriseDB: http://www.enterprisedb.com >> The Enterprise PostgreSQL Company >> >> >> -- > Highgo Software (Canada/China/Pakistan) > URL : http://www.highgo.ca > ADDR: 10318 WHALLEY BLVD, Surrey, BC > EMAIL: mailto: ahsan.h...@highgo.ca >