Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2018-10-19 16:35:46 -0400, Stephen Frost wrote: > > The only justification for *not* doing this is that some extension > > author might have dumped files into our tablespace directory, something > > we've never claimed to support nor generally worried about in all the > > time that I can recall before this. > > No, it's really not the only reason. As I said, testing is much less > likely to find cases where we're checksumming a short-lived file even > though we shouldn't, than making sure that checksummed files are > actually checksummed.
While I agree that we might possibly end up trying to checksum a short-lived and poorly-named file, I'd rather that than risk missing the checking of a file which does have checksums that should have been checked and claiming that we checked all of the checksums. > > > > What about two different extensions wanting to create files with the > > > > same names in the tablespace directories..? > > > > > > > > Experimentation is fine, of course, this is open source code and people > > > > should feel free to play with it, but we are not obligated to avoid > > > > breaking things when an extension author, through their experimentation, > > > > does something which is clearly not supported, like dropping files into > > > > PG's tablespace directories. Further, when it comes to our user's data, > > > > we should be taking a strict approach and accounting for everything, > > > > something that this whitelist-of-patterns-based approach to finding > > > > files to verify the checksums on doesn't do. > > > > > > It's not economically feasible to work on extensions that will only be > > > usable a year down the road. > > > > I listed out multiple other solutions to this problem which you > > summarily ignored. > > Except that none of them really achieves the goals you can achieve by > having the files in the database (like DROP DATABASE working, for > starters). The only reason things like DROP DATABASE "work" in this example case is exactly that it assumes that all of the files which exist are ones which we put there. Or, to put it another way, if we're only going to checksum files based on some whitelist of files we expect to be there, shouldn't we go back and make things like DROP DATABASE only remove those files that we expect to be there and not random other files that we have no knowledge of..? > > It's unfortunate that those other solutions weren't used and that, > > instead, this extension decided to drop files into PG's tablespace > > directories, but that doesn't mean we should condone or implicitly > > support that action. > > This just seems pointlessly rigid. Our extension related APIs aren't > generally very good or well documented. Most non-trivial extensions that > add interesting things to the postgres ecosystem are going to need a few > not so pretty things to get going. PostGIS is a fantastic example of an extension that is far from trivial, extends PG in ways which are clearly supported, and has worked with PG to improve things in core to be able to then use in the extension. > > I stand by my position that this patch should be reverted (with no > > offense or ill-will towards Michael, of course, I certainly appreciate > > his efforts to address the issues with pg_verify_checksums) and that we > > should move more of this code to identify files to verify the checksums > > on into libpgcommon and then use it from both pg_basebackup and > > pg_verify_checksums, to the extent possible, but that we make sure to > > account for all of the files in our tablespace and database directories. > > What does that do, except break things that currently work? Sure, work > on a patch that fixes the architectural concerns, but what's the point > in reverting it until that's done? As you pointed out previously, the current code *doesn't* work, before or after this change, and we clearly need to rework this to move things into libpgcommon and also fix pg_basebackup. Reverting this would at least get us back to having similar code between this and pg_basebackup, and then it'll be cleaner and clearer to have one patch which moves that similar logic into libpgcommon and fixes the missing exceptions for the EXEC_BACKEND case. Keeping the patch doesn't do anything for the pg_basebackup case, and confuses the issue by having these filename-pattern-whitelists which weren't there before and that should be removed, imv. Thanks! Stephen
signature.asc
Description: PGP signature