On Fri, Mar 27, 2020 at 2:29 AM Andres Freund <and...@anarazel.de> wrote: > s/ye'/yes/
Ugh, sorry. Fixed in the version posted earlier. > Are you planning to include a specification of the manifest file format > anywhere? I looked through the patches and didn't find anything. I thought about that. I think it would be good to have. I was sort of hoping to leave it for a follow-on patch, but maybe that's cheating too much. > I think it'd also be good to include more information about what the > point of manifest files actually is. What kind of information do you want to see included there? Basically, the way the documentation is written right now, it essentially says, well, we have this manifest thing so that you can later run pg_validatebackup, and pg_validatebackup says that it's there to check the integrity of backups using the manifest. This is all a bit circular, though, and maybe needs elaboration. What I've experienced is that: - Sometimes people take a backup and then wonder later whether the disk has flipped some bits. - Sometimes people restore a backup and forget some of the parts, like the user-defined tablespaces. - Sometimes anti-virus software, or poorly-run cron job run amok, wander around inflicting unpredictable damage. It would be nice to have a system that would notice these kinds of things on a running system, but here I've got the more modest goal of checking for in the context of a backup. If the data gets corrupted in transit, or if the disk mutilates it, or if the user mutilates it, you need something to check the backup against to find out that bad things have happend; the manifest is that thing. But I don't know exactly how much of all that should go in the docs, or in what way. > > + <para> > > + <application>pg_validatebackup</application> reads the manifest file of > > a > > + backup, verifies the manifest against its own internal checksum, and > > then > > + verifies that the same files are present in the target directory as in > > the > > + manifest itself. It then verifies that each file has the expected > > checksum, > > Depending on what you want to use the manifest for, we'd also need to > check that there are no additional files. That seems to actually be > implemented, which imo should be mentioned here. I intended the text to say that, because it says that it checks that the two things are "the same," which is symmetric. In the new version I posted a bit ago, I tried to make it more explicit, because apparently it was not sufficiently clear. > Hm. Is it a great choice to include the checksum for the manifest inside > the manifest itself? With a cryptographic checksum it seems like it > could make a ton of sense to store the checksum somewhere "safe", but > keep the manifest itself alongside the base backup itself. While not > huge, they won't be tiny either. Seems like the user could just copy the manifest checksum and store it somewhere, if they wish. Then they can check it against the manifest itself later, if they wish. Or they can take a SHA-512 of the whole file and store that securely. The problem is that we have no idea how to write that checksum to a more security storage. We could write backup_manifest and backup_manifest.checksum into separate files, but that seems like it's adding complexity without any real benefit. To me, the security-related uses of this patch seem to be fairly niche. I think it's nice that they exist, but I don't think that's the main selling point. For me, the main selling point is that you can check that your disk didn't eat your data and that nobody nuked any files that were supposed to be there. > Doesn't have to be in the first version, but could it be useful to move > this to common/ or such? Yeah. At one point, this code was written in a way that was totally specific to pg_validatebackup, but I then realized that it would be better to make it more general, so I refactored it into in the form you see now, where pg_validatebackup.c depends on parse_manifest.c but not the reverse. I suspect that if someone wants to use this for something else they might need to change a few more things - not sure exactly what - but I don't think it would be too hard. I thought it would be best to leave that task until someone has a concrete use case in mind, but I did want it to to be relatively easy to do that down the road, and I hope that the way I've organized the code achieves that. > > +static void > > +validate_backup_directory(validator_context *context, char *relpath, > > + char *fullpath) > > +{ > > Hm. Should this warn if the directory's permissions are set too openly > (world writable?)? I don't think so, but it's pretty clear that different people have different ideas about what the scope of this tool ought to be, even in this first version. > Hm. I think it'd be good to verify that the checksummed size is the same > as the size of the file in the manifest. That's checked in an earlier phase. Are you worried about the file being modified after the first pass checks the size and before we come through to do the checksumming? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company