Re: Avoiding useless SHA256 initialization with backup manifests, breaking base backups with FIPS

2020-11-11 Thread Michael Paquier
ution in the first paragraph, you cannot make use of backup > manifests at all with OpenSSL+FIPS (one can still enforce the use of > the in-core SHA2 implementation even if building with OpenSSL), but at > least it gives an escape route with 13. Okay. Hearing nothing, I have app

Avoiding useless SHA256 initialization with backup manifests, breaking base backups with FIPS

2020-11-09 Thread Michael Paquier
/ Unfortunately, this enforces an ABI breakage so this is not backpatchable material. Now, if one attempts to use OpenSSL with FIPS, the initialization of backup manifests in InitializeBackupManifest() enforces a call to pg_sha256_init() for the manifest file itself even if pg_basebackup, or anything

Re: Documentation patch for backup manifests in protocol.sgml

2020-09-01 Thread Bernd Helmle
Am Montag, den 31.08.2020, 18:48 -0400 schrieb Bruce Momjian: > > So confirmed. > > > Patch applied through 13. Thanks!

Re: Documentation patch for backup manifests in protocol.sgml

2020-08-31 Thread Michael Paquier
On Mon, Aug 31, 2020 at 06:48:53PM -0400, Bruce Momjian wrote: > Patch applied through 13. Thanks. -- Michael signature.asc Description: PGP signature

Re: Documentation patch for backup manifests in protocol.sgml

2020-08-31 Thread Bruce Momjian
in this paragraph. > > > > If someone can confirm this, I will apply it? Magnus? > > The reason why backup manifests are sent at the end of a base backup > is that they include the start and stop positions of the backup (see > caller of AddWALInfoToBackupManifest() in perf

Re: Documentation patch for backup manifests in protocol.sgml

2020-08-24 Thread Michael Paquier
>> CopyResponse data message containing the contents of the backup >> manifest (if requested) after having received the tar files. So i >> propose the attached to give a little more detail in this paragraph. > > If someone can confirm this, I will apply it? Magnus? The reason w

Re: Documentation patch for backup manifests in protocol.sgml

2020-08-21 Thread Bruce Momjian
On Tue, Aug 18, 2020 at 02:41:09PM +0200, Bernd Helmle wrote: > Hi, > > protocol.sgml describes the protocol messages received by a BASE_BACKUP > streaming command, but doesn't tell anything about the additional > CopyResponse data message containing the contents of the backup > manifest (if reque

Documentation patch for backup manifests in protocol.sgml

2020-08-18 Thread Bernd Helmle
Hi, protocol.sgml describes the protocol messages received by a BASE_BACKUP streaming command, but doesn't tell anything about the additional CopyResponse data message containing the contents of the backup manifest (if requested) after having received the tar files. So i propose the attached to gi

Re: backup manifests

2020-04-24 Thread Robert Haas
On Thu, Apr 23, 2020 at 5:16 PM Andres Freund wrote: > Do you not see a warning when compiling with optimizations enabled? No, I don't. I tried it with -O{0,1,2,3} and I always use -Wall -Werror. No warnings. [rhaas pgsql]$ clang -v clang version 5.0.2 (tags/RELEASE_502/final) Target: x86_64-app

Re: backup manifests

2020-04-23 Thread Andres Freund
Hi, On 2020-04-23 08:57:39 -0400, Robert Haas wrote: > On Sun, Apr 5, 2020 at 3:31 PM Andres Freund wrote: > > The warnings don't seem too unreasonable. The compiler can't see that > > the error_cb inside json_manifest_parse_failure() is not expected to > > return. Probably worth adding a wrapper

Re: backup manifests

2020-04-23 Thread Robert Haas
On Sun, Apr 5, 2020 at 3:31 PM Andres Freund wrote: > The warnings don't seem too unreasonable. The compiler can't see that > the error_cb inside json_manifest_parse_failure() is not expected to > return. Probably worth adding a wrapper around the calls to > context->error_cb and mark that as nore

Re: backup manifests

2020-04-22 Thread Fujii Masao
On 2020/04/23 1:28, Robert Haas wrote: On Wed, Apr 22, 2020 at 12:21 PM Fujii Masao wrote: I found three minor issues in pg_verifybackup. + {"print-parse-wal", no_argument, NULL, 'p'}, This is unused option, so this line should be removed. + printf(_(" -m, --manifest=

Re: backup manifests

2020-04-22 Thread Robert Haas
On Wed, Apr 22, 2020 at 12:21 PM Fujii Masao wrote: > I found three minor issues in pg_verifybackup. > > + {"print-parse-wal", no_argument, NULL, 'p'}, > > This is unused option, so this line should be removed. > > + printf(_(" -m, --manifest=PATH use specified path fo

Re: backup manifests

2020-04-22 Thread Fujii Masao
On 2020/04/15 11:18, Fujii Masao wrote: On 2020/04/14 0:15, Robert Haas wrote: On Sun, Apr 12, 2020 at 10:09 PM Fujii Masao wrote: I found other minor issues. I think these are all correct fixes. Thanks for the post-commit review, and sorry for this mistakes. Thanks for the review, Mic

Re: backup manifests

2020-04-14 Thread Fujii Masao
On 2020/04/14 0:15, Robert Haas wrote: On Sun, Apr 12, 2020 at 10:09 PM Fujii Masao wrote: I found other minor issues. I think these are all correct fixes. Thanks for the post-commit review, and sorry for this mistakes. Thanks for the review, Michael and Robert. Pushed the patches! Rega

Re: backup manifests

2020-04-13 Thread Robert Haas
On Sun, Apr 12, 2020 at 10:09 PM Fujii Masao wrote: > I found other minor issues. I think these are all correct fixes. Thanks for the post-commit review, and sorry for this mistakes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: backup manifests

2020-04-12 Thread Michael Paquier
On Mon, Apr 13, 2020 at 11:09:34AM +0900, Fujii Masao wrote: > - while ((c = getopt_long(argc, argv, > "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP", > + while ((c = getopt_long(argc, argv, > "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvPm:", > > "m:" seems unnecessary, so should be removed? > Patch

Re: backup manifests

2020-04-12 Thread Fujii Masao
On 2020/04/09 23:06, Fujii Masao wrote: On 2020/04/09 2:35, Robert Haas wrote: On Wed, Apr 8, 2020 at 1:15 AM Fujii Masao wrote: When there is a backup_manifest in the database cluster, it's included in the backup even when --no-manifest is specified. ISTM that this is problematic because

Re: backup manifests

2020-04-09 Thread Fujii Masao
On 2020/04/09 23:10, Stephen Frost wrote: Greetings, * Fujii Masao (masao.fu...@oss.nttdata.com) wrote: On 2020/04/09 2:35, Robert Haas wrote: On Wed, Apr 8, 2020 at 1:15 AM Fujii Masao wrote: When there is a backup_manifest in the database cluster, it's included in the backup even when -

Re: backup manifests

2020-04-09 Thread Stephen Frost
Greetings, * Fujii Masao (masao.fu...@oss.nttdata.com) wrote: > On 2020/04/09 2:35, Robert Haas wrote: > >On Wed, Apr 8, 2020 at 1:15 AM Fujii Masao > >wrote: > >>When there is a backup_manifest in the database cluster, it's included in > >>the backup even when --no-manifest is specified. ISTM t

Re: backup manifests

2020-04-09 Thread Fujii Masao
On 2020/04/09 2:35, Robert Haas wrote: On Wed, Apr 8, 2020 at 1:15 AM Fujii Masao wrote: When there is a backup_manifest in the database cluster, it's included in the backup even when --no-manifest is specified. ISTM that this is problematic because the backup_manifest is obviously not valid

Re: backup manifests and contemporaneous buildfarm failures

2020-04-08 Thread Tom Lane
Andrew Dunstan writes: > On 4/8/20 3:41 PM, Robert Haas wrote: >> I don't understand what the local $ENV{MSYS2_ARG_CONV_EXCL} = >> $source_ts_prefix does, > You don't want to know > See for the > gory details. I don't want to know

Re: backup manifests and contemporaneous buildfarm failures

2020-04-08 Thread Andrew Dunstan
On 4/8/20 3:41 PM, Robert Haas wrote: > On Wed, Apr 8, 2020 at 1:59 PM Tom Lane wrote: >> I guess we could commit it and find out. I'm all for the simpler >> coding if it works. > I don't understand what the local $ENV{MSYS2_ARG_CONV_EXCL} = > $source_ts_prefix does, You don't want to know .

Re: backup manifests and contemporaneous buildfarm failures

2020-04-08 Thread Robert Haas
On Wed, Apr 8, 2020 at 1:59 PM Tom Lane wrote: > I guess we could commit it and find out. I'm all for the simpler > coding if it works. I don't understand what the local $ENV{MSYS2_ARG_CONV_EXCL} = $source_ts_prefix does, but the remove/unlink condition was suggested by Amit Kapila on the basis

Re: backup manifests and contemporaneous buildfarm failures

2020-04-08 Thread Tom Lane
Andrew Dunstan writes: > OK, tricky, but here's what I did to get this working on fairywren. > First, on Msys2 there is a problem with name mangling. We've had to fix > this before by telling it to ignore certain argument prefixes. > Second, once that was fixed rmdir was failing on the tablespace.

Re: backup manifests and contemporaneous buildfarm failures

2020-04-08 Thread Andrew Dunstan
On 4/7/20 9:42 AM, Andrew Dunstan wrote: > On Tue, Apr 7, 2020 at 12:37 AM Tom Lane wrote: >> Robert Haas writes: >>> Taking stock of the situation this morning, most of the buildfarm is >>> now green. There are three failures, on eelpout (6 hours ago), >>> fairywren (17 hours ago), and hyrax (3

Re: backup manifests

2020-04-08 Thread Robert Haas
On Wed, Apr 8, 2020 at 1:15 AM Fujii Masao wrote: > When there is a backup_manifest in the database cluster, it's included in > the backup even when --no-manifest is specified. ISTM that this is problematic > because the backup_manifest is obviously not valid for the backup. > So, isn't it better

Re: backup manifests

2020-04-07 Thread Fujii Masao
On 2020/04/04 4:22, Robert Haas wrote: On Thu, Apr 2, 2020 at 4:34 PM David Steele wrote: +1. These would be great tests to have and a win for pg_basebackup overall but I don't think they should be a prerequisite for this commit. Not to mention the server. I can't say that I have a lot of c

Re: pgsql: Generate backup manifests for base backups, and validate them.

2020-04-07 Thread David Steele
On 4/7/20 12:44 PM, Robert Haas wrote: On Tue, Apr 7, 2020 at 5:51 AM Peter Eisentraut wrote: On 2020-04-03 21:07, Robert Haas wrote: A new tool called pg_validatebackup can validate a backup against the manifest. In software engineering, "verify" and "validate" have standardized distinct me

Re: pgsql: Generate backup manifests for base backups, and validate them.

2020-04-07 Thread Robert Haas
On Tue, Apr 7, 2020 at 5:51 AM Peter Eisentraut wrote: > On 2020-04-03 21:07, Robert Haas wrote: > > A new tool called pg_validatebackup can validate a backup against the > > manifest. > > In software engineering, "verify" and "validate" have standardized > distinct meanings. I'm not going to try

Re: backup manifests and contemporaneous buildfarm failures

2020-04-07 Thread Andrew Dunstan
On Tue, Apr 7, 2020 at 12:37 AM Tom Lane wrote: > > Robert Haas writes: > > Taking stock of the situation this morning, most of the buildfarm is > > now green. There are three failures, on eelpout (6 hours ago), > > fairywren (17 hours ago), and hyrax (3 days, 7 hours ago). > > fairywren has now

Re: backup manifests and contemporaneous buildfarm failures

2020-04-07 Thread Andrew Dunstan
On Mon, Apr 6, 2020 at 1:18 AM Fabien COELHO wrote: > > > Hello, > > >> Do I need to precede those with some recursive chmod commands? Perhaps > >> the client should refuse to run if there is still something left after > >> these. > > > > I think the latter would be a very good idea, just so that

Re: pgsql: Generate backup manifests for base backups, and validate them.

2020-04-07 Thread Peter Eisentraut
On 2020-04-03 21:07, Robert Haas wrote: A new tool called pg_validatebackup can validate a backup against the manifest. In software engineering, "verify" and "validate" have standardized distinct meanings. I'm not going to try to explain them here, but you can easily find them online. I hav

Re: backup manifests and contemporaneous buildfarm failures

2020-04-06 Thread Tom Lane
Robert Haas writes: > Taking stock of the situation this morning, most of the buildfarm is > now green. There are three failures, on eelpout (6 hours ago), > fairywren (17 hours ago), and hyrax (3 days, 7 hours ago). fairywren has now done this twice in the pg_validatebackupCheck step: exec fail

Re: backup manifests and contemporaneous buildfarm failures

2020-04-06 Thread Andrew Dunstan
On 4/6/20 7:53 AM, Robert Haas wrote: > On Sun, Apr 5, 2020 at 4:07 PM Andrew Dunstan > wrote: >> Do I need to precede those with some recursive chmod commands? > +1. > >> Perhaps >> the client should refuse to run if there is still something left after >> these. > +1 to that, too. > See https

Re: backup manifests and contemporaneous buildfarm failures

2020-04-06 Thread Robert Haas
On Sun, Apr 5, 2020 at 4:07 PM Andrew Dunstan wrote: > Do I need to precede those with some recursive chmod commands? +1. > Perhaps > the client should refuse to run if there is still something left after > these. +1 to that, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The E

Re: backup manifests and contemporaneous buildfarm failures

2020-04-05 Thread Fabien COELHO
Hello, Do I need to precede those with some recursive chmod commands? Perhaps the client should refuse to run if there is still something left after these. I think the latter would be a very good idea, just so that this sort of failure is less obscure. Not sure about whether a recursive chm

Re: backup manifests and contemporaneous buildfarm failures

2020-04-05 Thread Tom Lane
Andrew Dunstan writes: > Hmm, the buildfarm client does this at the beginning of each run to > remove anything that might be left over from a previous run: > rmtree("inst"); > rmtree("$pgsql") unless ($from_source && !$use_vpath); Right, the point is precisely that some versions of rmtre

Re: backup manifests and contemporaneous buildfarm failures

2020-04-05 Thread Andrew Dunstan
On 4/5/20 9:10 AM, Mikael Kjellström wrote: > On 2020-04-04 04:43, Robert Haas wrote: > >> I think I've done about as much as I can do for tonight, though. Most >> things are green now, and the ones that aren't are failing because of >> stuff that is at least plausibly fixed. By morning it should

Re: backup manifests

2020-04-05 Thread Andres Freund
Hi, On 2020-04-03 15:22:23 -0400, Robert Haas wrote: > I've pushed all the patches. Seeing new warnings in an optimized build /home/andres/src/postgresql-master/src/bin/pg_validatebackup/parse_manifest.c: In function 'json_manifest_object_end': /home/andres/src/postgresql-master/src/bin/pg_vali

Re: backup manifests and contemporaneous buildfarm failures

2020-04-05 Thread Mikael Kjellström
On 2020-04-04 04:43, Robert Haas wrote: I think I've done about as much as I can do for tonight, though. Most things are green now, and the ones that aren't are failing because of stuff that is at least plausibly fixed. By morning it should be clearer how much broken stuff is left, although that

Re: backup manifests and contemporaneous buildfarm failures

2020-04-04 Thread Thomas Munro
On Sun, Apr 5, 2020 at 2:36 AM Robert Haas wrote: > eelpout is unhappy because: > > +WARNING: could not remove shared memory segment > "/PostgreSQL.248989127": No such file or directory > +WARNING: could not remove shared memory segment > "/PostgreSQL.1450751626": No such file or directory Seem

Re: backup manifests and contemporaneous buildfarm failures

2020-04-04 Thread Tom Lane
Robert Haas writes: > On Sat, Apr 4, 2020 at 10:57 AM Tom Lane wrote: >> What is odd is that >> (AFAIR) we've never seen this before. Maybe somebody recently added >> an error cursor callback in a place that didn't have it before, and >> is involved in SQL-function processing? None of the commi

Re: backup manifests and contemporaneous buildfarm failures

2020-04-04 Thread Robert Haas
On Sat, Apr 4, 2020 at 10:57 AM Tom Lane wrote: > It's not so surprising that we could get a different result that way > from a CLOBBER_CACHE_ALWAYS animal like hyrax, since CCA-forced > cache reloads would cause extra stack expenditure at a lot of places. > And it could vary depending on totally

Re: backup manifests and contemporaneous buildfarm failures

2020-04-04 Thread Tom Lane
Robert Haas writes: > hyrax's last run was before any of this happened, so it seems to have > an unrelated problem. The last two runs, three and six days ago, both > failed like this: > -ERROR: stack depth limit exceeded > +ERROR: stack depth limit exceeded at character 8 > Not sure what that'

Re: backup manifests

2020-04-04 Thread Tom Lane
Robert Haas writes: > On Fri, Apr 3, 2020 at 8:18 PM Tom Lane wrote: >> I suppose that judicious s/time_t/pg_time_t/ would fix this. > I think you sent this email just after I pushed > db1531cae00941bfe4f6321fdef1e1ef355b6bed, or maybe after I'd committed > it locally and just before I pushed it

Re: backup manifests and contemporaneous buildfarm failures

2020-04-04 Thread Robert Haas
On Fri, Apr 3, 2020 at 10:43 PM Robert Haas wrote: > I think I've done about as much as I can do for tonight, though. Most > things are green now, and the ones that aren't are failing because of > stuff that is at least plausibly fixed. By morning it should be > clearer how much broken stuff is le

Re: backup manifests

2020-04-04 Thread Robert Haas
On Fri, Apr 3, 2020 at 8:18 PM Tom Lane wrote: > BTW, some of the buildfarm is showing a simpler portability problem: > they think you were too cavalier about the difference between time_t > and pg_time_t. (On a platform with 32-bit time_t, that's an actual > bug, probably.) lapwing is actually

Re: backup manifests and contemporaneous buildfarm failures

2020-04-04 Thread Robert Haas
On Fri, Apr 3, 2020 at 11:06 PM Andres Freund wrote: > On 2020-04-03 20:48:09 -0400, Robert Haas wrote: > > 'serinus' is also failing. This is less obviously related: > > Hm. Tests passed once since then. Yeah, but conchuela also failed once in what I think was a similar way. I suspect the fix I

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Petr Jelinek
On 04/04/2020 05:06, Andres Freund wrote: Hi, Peter, Petr, CCed you because it's probably a bug somewhere around the initial copy code for logical replication. On 2020-04-03 20:48:09 -0400, Robert Haas wrote: 'serinus' is also failing. This is less obviously related: Hm. Tests passed once s

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Andres Freund
Hi, Peter, Petr, CCed you because it's probably a bug somewhere around the initial copy code for logical replication. On 2020-04-03 20:48:09 -0400, Robert Haas wrote: > 'serinus' is also failing. This is less obviously related: Hm. Tests passed once since then. > 2020-04-04 02:08:57.299 CEST

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 9:52 PM Tom Lane wrote: > Robert Haas writes: > > 'prairiedog' is also unhappy, and it looks related: > > Yeah, gaur also failed in the same place. Both of those are > alignment-picky 32-bit hardware, so I'm thinking the problem is > pg_gmtime() trying to fetch a 64-bit pg

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Tom Lane
Robert Haas writes: > Interestingly, on my machine, rmtree coped with a mode 0 directory > just fine, but mode 0400 was more than its tiny brain could handle, so > the originally committed fix had code to revert 0400 back to 0700, but > I didn't add similar code to revert from 0 back to 0700 becau

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Tom Lane
Robert Haas writes: > 'prairiedog' is also unhappy, and it looks related: Yeah, gaur also failed in the same place. Both of those are alignment-picky 32-bit hardware, so I'm thinking the problem is pg_gmtime() trying to fetch a 64-bit pg_time_t from an insufficiently aligned address. I'm trying

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 8:12 PM Tom Lane wrote: > Yeah, so it would seem. The buildfarm script uses rmtree to clean out > the old build tree. The man page for File::Path suggests (but can't > quite bring itself to say in so many words) that by default, rmtree > will adjust the permissions on targ

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 6:13 PM Tom Lane wrote: > Locally, I observe that "make clean" in src/bin/pg_validatebackup fails > to clean up the tmp_check directory left behind by "make check". Fixed. I also tried to fix 'lapwing', which was complaining about about a call to pg_gmtime, saying that it

Re: backup manifests

2020-04-03 Thread Tom Lane
BTW, some of the buildfarm is showing a simpler portability problem: they think you were too cavalier about the difference between time_t and pg_time_t. (On a platform with 32-bit time_t, that's an actual bug, probably.) lapwing is actually failing: https://buildfarm.postgresql.org/cgi-bin/show_

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Tom Lane
Robert Haas writes: > On Fri, Apr 3, 2020 at 6:48 PM Tom Lane wrote: >> I'm guessing that we're looking at a platform-specific difference in >> whether "rm -rf" fails outright on an unreadable subdirectory, or >> just tries to carry on by unlinking it anyway. > My intention was that it would be

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 5:58 PM Fabien COELHO wrote: > seawasp just failed the same way. Good news, I can see "configure" under > "HEAD/pgsql". Ah, good. > The only strange thing under buildroot I found is: > > HEAD/pgsql.build/src/bin/pg_validatebackup/tmp_check/t_003_corruption_master_data/back

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 6:48 PM Tom Lane wrote: > I'm guessing that we're looking at a platform-specific difference in > whether "rm -rf" fails outright on an unreadable subdirectory, or > just tries to carry on by unlinking it anyway. My intention was that it would be cleaned by the TAP framework

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Tom Lane
I wrote: > I'm guessing that we're looking at a platform-specific difference in > whether "rm -rf" fails outright on an unreadable subdirectory, or > just tries to carry on by unlinking it anyway. Yeah... on my RHEL6 box, "make check" cleans up the working directories under tmp_check, but on a Fre

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Tom Lane
Thomas Munro writes: > Same here, on elver. I see pg_subtrans has been chmod(0)'d, > presumably by the perl subroutine mutilate_open_directory_fails. I > see this in my inbox (the build farm wrote it to stderr or stdout > rather than the log file): > cannot chdir to child for > pgsql.build/src/

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Stephen Frost
Greetings, * Thomas Munro (thomas.mu...@gmail.com) wrote: > On Sat, Apr 4, 2020 at 11:13 AM Tom Lane wrote: > > Fabien COELHO writes: > > > The only strange thing under buildroot I found is: > > > > > HEAD/pgsql.build/src/bin/pg_validatebackup/tmp_check/t_003_corruption_master_data/backup/open_d

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Thomas Munro
On Sat, Apr 4, 2020 at 11:13 AM Tom Lane wrote: > Fabien COELHO writes: > > The only strange thing under buildroot I found is: > > > HEAD/pgsql.build/src/bin/pg_validatebackup/tmp_check/t_003_corruption_master_data/backup/open_directory_fails/pg_subtrans/ > > > this last directory perms are d

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Alvaro Herrera
On 2020-Apr-03, Tom Lane wrote: > I wonder if VPATH versus not-VPATH might be a relevant factor ... Oh, absolutely. The ones that failed show, in the last successful run, the configure line invoked as "./configure", while the animals that are still running are invoking configure from some other

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Tom Lane
Fabien COELHO writes: > The only strange thing under buildroot I found is: > HEAD/pgsql.build/src/bin/pg_validatebackup/tmp_check/t_003_corruption_master_data/backup/open_directory_fails/pg_subtrans/ > this last directory perms are d- which seems to break cleanup. Locally, I observe tha

Re: backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Fabien COELHO
Hello Robert, Done now. Meanwhile, two more machines have reported the mysterious message: sh: ./configure: not found ...that first appeared on spurfowl a few hours ago. The other two machines are eelpout and elver, both of which list Thomas Munro as a maintainer. spurfowl lists Stephen Fros

backup manifests and contemporaneous buildfarm failures

2020-04-03 Thread Robert Haas
[ splitting this off into a separate thread ] On Fri, Apr 3, 2020 at 5:07 PM Robert Haas wrote: > I'lll go see about adding that. Done now. Meanwhile, two more machines have reported the mysterious message: sh: ./configure: not found ...that first appeared on spurfowl a few hours ago. The othe

Re: backup manifests

2020-04-03 Thread Justin Pryzby
:00:00 2001 From: Justin Pryzby Date: Fri, 3 Apr 2020 16:17:28 -0500 Subject: [PATCH] docs: backup manifests commit 0d8c9c1210c44b36ec2efcb223a1dfbe897a3661 Author: Robert Haas --- doc/src/sgml/protocol.sgml | 4 ++-- doc/src/sgml/ref/pg_basebackup.sgml | 4 ++-- doc/src/sgml/

Re: backup manifests

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 4:54 PM Alvaro Herrera wrote: > Maybe it needs perl2host? *jaw drops* Wow, OK, yeah, that looks like the thing. Thanks for the suggestion; I didn't know that existed (and I kinda wish I still didn't). I'lll go see about adding that. -- Robert Haas EnterpriseDB: http://

Re: backup manifests

2020-04-03 Thread Alvaro Herrera
On 2020-Apr-03, Robert Haas wrote: > sub tempdir_short > { > > return File::Temp::tempdir(CLEANUP => 1); > } > > And File::Temp's documentation says that the temporary directory is > picked using File::Spec's tmpdir(), which says that it knows about > different operating systems and will

Re: backup manifests

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 3:53 PM Robert Haas wrote: > 2. Also, a bunch of machines were super-unhappy with > 003_corruption.pl, failing with this sort of thing: > > pg_basebackup: error: could not get COPY data stream: ERROR: symbolic > link target too long for tar format: file name "pg_tblspc/1638

Re: backup manifests

2020-04-03 Thread Robert Haas
On Fri, Apr 3, 2020 at 3:22 PM Robert Haas wrote: > It looks like the buildfarm is unhappy though, so I guess I'd better > go look at that. I fixed two things so far, and there seems to be at least one more possible issue that I don't understand. 1. Apparently, we have an automated perlcritic ru

Re: backup manifests

2020-04-03 Thread Robert Haas
On Thu, Apr 2, 2020 at 4:34 PM David Steele wrote: > +1. These would be great tests to have and a win for pg_basebackup > overall but I don't think they should be a prerequisite for this commit. Not to mention the server. I can't say that I have a lot of confidence that all of the server behavior

Re: backup manifests

2020-04-02 Thread David Steele
On 4/2/20 3:47 PM, Andres Freund wrote: On 2020-04-02 15:42:48 -0400, Robert Haas wrote: I suspect I'm not doing quite what you had in mind here... thoughts? I have some ideas, but I think it's complicated enough that I'd not put it in the "pre commit path" for now. +1. These would be great

Re: backup manifests

2020-04-02 Thread Andres Freund
On 2020-04-02 15:42:48 -0400, Robert Haas wrote: > I suspect I'm not doing quite what you had in mind here... thoughts? I have some ideas, but I think it's complicated enough that I'd not put it in the "pre commit path" for now.

Re: backup manifests

2020-04-02 Thread Robert Haas
On Thu, Apr 2, 2020 at 2:55 PM Robert Haas wrote: > On Thu, Apr 2, 2020 at 2:23 PM Andres Freund wrote: > > > That might make the window fairly wide on normal systems, but I'm not > > > sure about Raspberry Pi BF members or things running > > > CLOBBER_CACHE_ALWAYS/RECURSIVELY. I guess I could tr

Re: backup manifests

2020-04-02 Thread Robert Haas
On Thu, Apr 2, 2020 at 3:26 PM David Steele wrote: > So, with the addition of the 0004 patch down-thread this looks > committable to me. Glad to hear it. Thank you. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Re: backup manifests

2020-04-02 Thread David Steele
On 4/2/20 1:04 PM, Robert Haas wrote: > There are still some things that not everybody is happy about. In particular, Stephen and David are unhappy about using CRC-32C as the default algorithm, but Andres and Noah both think it's a reasonable choice, even if not as robust as everybody will want.

Re: backup manifests

2020-04-02 Thread Andres Freund
On 2020-04-02 14:55:19 -0400, Robert Haas wrote: > > Yes, I am asking for something to be changed: I'd like the code that > > read()s the file when computing the checksum to add up how many bytes > > were read, and compare that to the size in the manifest. And if there's > > a difference report an

Re: backup manifests

2020-04-02 Thread Andres Freund
Hi, On 2020-04-02 14:16:27 -0400, Robert Haas wrote: > On Thu, Apr 2, 2020 at 1:23 PM Andres Freund wrote: > > I suspect its possible to control the timing by preventing the > > checkpoint at the end of recovery from completing within a relevant > > timeframe. I think configuring a large checkpoi

Re: backup manifests

2020-04-02 Thread Robert Haas
On Thu, Apr 2, 2020 at 1:23 PM Andres Freund wrote: > I suspect its possible to control the timing by preventing the > checkpoint at the end of recovery from completing within a relevant > timeframe. I think configuring a large checkpoint_timeout and using a > non-fast base backup ought to do the

Re: backup manifests

2020-04-02 Thread Andres Freund
Hi, On 2020-04-02 13:04:45 -0400, Robert Haas wrote: > And here's another new patch set. After some experimentation, I was > able to manually test the timeline-switch-during-a-base-backup case > and found that it had bugs in both pg_validatebackup and the code I > added to the backend's basebackup

Re: backup manifests

2020-04-01 Thread David Steele
On 3/31/20 7:57 AM, Robert Haas wrote: On Mon, Mar 30, 2020 at 7:24 PM David Steele wrote: I'm confused as to why you're not seeing that. What's the exact sequence of steps? $ pg_basebackup -D test/backup5 --manifest-checksums=SHA256 $ vi test/backup5/backup_manifest * Add 'X' to the c

Re: backup manifests

2020-04-01 Thread Andres Freund
perform checksum validation etc. Potentially even in a single pass over the data directory. > This patch has two parts (a) Generate backup manifests for base > backups, and (b) Validate backup (manifest). It seems to me that > there are not many things pending for (a), can't we

Re: backup manifests

2020-04-01 Thread Andres Freund
Hi, On 2020-03-31 22:15:04 -0700, Noah Misch wrote: > On Tue, Mar 31, 2020 at 03:50:34PM -0700, Andres Freund wrote: > > On 2020-03-31 14:10:34 -0400, Robert Haas wrote: > > > +/* > > > + * Attempt to parse the WAL files required to restore from backup using > > > + * pg_waldump. > > > + */ > > >

Re: backup manifests

2020-03-31 Thread Noah Misch
On Tue, Mar 31, 2020 at 03:50:34PM -0700, Andres Freund wrote: > On 2020-03-31 14:10:34 -0400, Robert Haas wrote: > > +/* > > + * Attempt to parse the WAL files required to restore from backup using > > + * pg_waldump. > > + */ > > +static void > > +parse_required_wal(validator_context *context, ch

Re: backup manifests

2020-03-31 Thread Andres Freund
Hi, On 2020-03-31 14:10:34 -0400, Robert Haas wrote: > I made an attempt to implement this. Awesome! > In the attached patch set, 0001 I'm going to work on those things. I > would appreciate *very timely* feedback on anything people do or do > not like about this, because I want to commit this

Re: backup manifests

2020-03-31 Thread Stephen Frost
re, I think the command-line options > > > > will get confusing. > > > > > > I'm mildly inclined to name it pg_validate, pg_validate_dbdir or > > > such. And eventually (definitely not this release) subsume pg_checksums > > > in it. That way we

Re: backup manifests

2020-03-31 Thread Robert Haas
On Mon, Mar 30, 2020 at 7:24 PM David Steele wrote: > > I'm confused as to why you're not seeing that. What's the exact > > sequence of steps? > > $ pg_basebackup -D test/backup5 --manifest-checksums=SHA256 > > $ vi test/backup5/backup_manifest > * Add 'X' to the checksum of backup_label > >

Re: backup manifests

2020-03-31 Thread Amit Kapila
such. And eventually (definitely not this release) subsume pg_checksums > > in it. That way we can add other checkers too. > > Works for me; of those two, I prefer pg_validate. > pg_validate sounds like a tool with a much bigger purpose. I think even things like amcheck could also f

Re: backup manifests

2020-03-30 Thread Noah Misch
On Mon, Mar 30, 2020 at 12:16:31PM -0700, Andres Freund wrote: > On 2020-03-30 15:04:55 -0400, Robert Haas wrote: > > I guess I'd like to be clear here that I have no fundamental > > disagreement with taking this tool in any direction that people would > > like it to go. For me it's just a question

Re: backup manifests

2020-03-30 Thread David Steele
On 3/30/20 4:16 PM, Robert Haas wrote: On Fri, Mar 27, 2020 at 3:51 PM David Steele wrote: > { "Path": "backup_label", "Size": 224, "Last-Modified": "2020-03-27 18:33:18 GMT", "Checksum-Algorithm": "CRC32C", "Checksum": "b914bec9" }, Storing the checksum type with each file seems pretty red

Re: backup manifests

2020-03-30 Thread David Steele
On 3/30/20 5:08 PM, Andres Freund wrote: The data in the backup label isn't sufficient though. Without having parsed the timeline file there's no way to verify that the correct WAL is present. I guess we can also add client side tools to parse timelines, add command the fetch all of the required

Re: backup manifests

2020-03-30 Thread Andres Freund
Hi, On 2020-03-30 15:23:08 -0400, Robert Haas wrote: > On Mon, Mar 30, 2020 at 2:59 PM Andres Freund wrote: > > I wonder if it'd not be best, independent of whether we build in this > > verification, to include that metadata in the manifest file. That's for > > sure better than having to build a

Re: backup manifests

2020-03-30 Thread Robert Haas
On Sun, Mar 29, 2020 at 9:05 PM David Steele wrote: > Yeah, that seems reasonable. > > In our case backups are nearly always compressed and/or encrypted so > even checking the original size is a bit of work. Getting the checksum > at the same time seems like an obvious win. Makes sense. If this e

Re: backup manifests

2020-03-30 Thread Robert Haas
On Mon, Mar 30, 2020 at 2:59 PM Andres Freund wrote: > I wonder if it'd not be best, independent of whether we build in this > verification, to include that metadata in the manifest file. That's for > sure better than having to build a separate tool to parse timeline > history files. I don't thin

Re: backup manifests

2020-03-30 Thread Andres Freund
Hi, On 2020-03-30 15:04:55 -0400, Robert Haas wrote: > I guess I'd like to be clear here that I have no fundamental > disagreement with taking this tool in any direction that people would > like it to go. For me it's just a question of timing. Feature freeze > is now a week or so away, and nothing

Re: backup manifests

2020-03-30 Thread Robert Haas
On Mon, Mar 30, 2020 at 2:24 AM Amit Kapila wrote: > > Between those two, I would use "pg_validatebackup" if there's a fair chance > > it > > will end up doing the pg_waldump check. Otherwise, I would use > > "pg_validatemanifest". > > +1. I guess I'd like to be clear here that I have no fundam

Re: backup manifests

2020-03-30 Thread Andres Freund
Hi, On 2020-03-30 14:35:40 -0400, Robert Haas wrote: > On Sun, Mar 29, 2020 at 10:08 PM Andres Freund wrote: > > See the attached minimal prototype for what I am thinking of. > > > > This would not correctly handle the case where the timeline changes > > while taking a base backup. But I'm not su

  1   2   3   >