Greetings, * Michael Paquier (mich...@paquier.xyz) wrote: > This is a follow-up of the following thread: > https://www.postgresql.org/message-id/20181012010411.re53cwcistcpi...@alap3.anarazel.de
Thanks for starting this thread Michael. > pg_verify_checksums used first a blacklist to decide if files have > checksums or not. So with this approach all files should have checksums > except files like pg_control, pg_filenode.map, PG_VERSION, etc. So, this is a great example- pg_control actually *does* have a CRC and we really should be checking it in tools like pg_verify_checksums and pg_basebackup. Similairly, we should be trying to get to a point where we have checksums for anything else that we actually care about. > d55241a has added a first fix to silence the buildfarm for EXEC_BACKEND > builds. After discussion, Andres has pointed out that some extensions > may want to drop files within global/ or base/ as part of their logic. > cstore_fdw was mentioned but if you look at their README that's not the > case. So the one example that's been put forward doesn't actually do this..? > However, I think that Andres argument is pretty good as with > pluggable storage we should allow extensions to put custom files for > different tablespaces. Andres' wasn't argueing, that I saw at least, that pluggable storage would result in random files showing up in tablespace directories that the core code has no knowledge of. In fact, he seemed to be saying in 20181019205724.ewnuhvrsanaci...@alap3.anarazel.de that pluggable storage results in the core code being aware of the files that those other storage mechanisms create, so this entire line of argument seems to be without merit. > So this commit has changed the logic of > pg_verify_checksums to use a whitelist, which assumes that only normal > relation files have checksums: > <digits> > <digits>.<segment> > <digits>_<forkname> > <digits>_<forkname>.<segment > After more discussion on the thread mentioned above, Stephen has pointed > out that base backups use the same blacklist logic when verifying > checksums. This has the same problem with EXEC_BACKEND-specific files, > resulting in spurious warnings (that's an example!) which are confusing > for the user: > $ pg_basebackup -D popo > WARNING: cannot verify checksum in file "./global/config_exec_params", > block 0: read buffer size 5 and page size 8192 differ Stephen also extensively argued that extensions shouldn't be dropping random files into PG's database and tablespace directories and that we shouldn't be writing code which attempts to work around that (and, indeed, ultimately can't since technically extension authors could drop files into those directories which match these "whitelist patterns"). Further, using a whitelist risks possibly missing files that should be included, unlike having an exclude list which ensures that we actually check everything and complain about anything found that's out of the ordinary. Additionally, having these checks would hopefully make it clear that people shouldn't be dropping random files into our database and tablespace directories- something we didn't try to do for the root of the data directory, resulting in, frankly, a big mess. Allowing random files to exist in the data directory has lead to quite a few issues including things like pg_basebackup breaking because of a root-owned file or similar that can't be read, and this extends that. I thought the point of this new thread was to encourage discussion of that point and the pros and cons seen for each side, not to only include one side of it, so I'm rather disappointed. > Folks on this thread agreed that both pg_verify_checksums and checksum > verification for base backups should use the same logic. It seems to me > that using a whitelist-based approach for both is easier to maintain as > the patterns of files supporting checksums is more limited than files > which do not support checksums. So this way we allow extensions > bundling custom files to still work with pg_verify_checksums and > checksum verification in base backups. This is not an accurate statement- those random files which some extension drops into these directories don't "work" with pg_verify_checksums, this just makes pg_verify_checksums ignore them. That is not the same thing at all. Worse, we end up with things like pg_basebackup copying/backing up those files, but skipping them for validation checking, when we have no reason to expect that those files will be at all valid when they're copied and no way to see if they are valid in the resulting restore. Other parts of the system will continue to similairly do things that people might not expect- DROP DATABASE will happily nuke any file it finds, no matter if it matches these patterns or not. Basically, the way I see it, at least, is that we should either maintain that PG's database and tablespace directories are under the purview of PG and other things shouldn't be putting files there without the core code's knowledge, or we decide that it's ok for random things to drop files into these directories and we'll just ignore them- across the board. I don't like this half-and-half approach where *some* things will operate on files we don't recognize, including removing them in some cases!, but other parts of the system will ignore them. > Something else which has been discussed on this thread is that we should > have a dedicated API to decide if a file has checksums or not, and if it > should be skipped in both cases. That's work only for HEAD though, so > we need to do something for HEAD and v11, which is simple. Yes, some kind of API, which is then in libpgcommon for other tools to use, would be good. I agree that should go into HEAD though. > Attached is a patch I have cooked for this purpose. I have studied the > file patterns opened by the backend, and we actually need to only skip > temporary files and folders as those can include legit relation file > names (like 1.1 for example). At the same time I have found about > parse_filename_for_nontemp_relation() which is a copycat of the > recently-added isRelFileName(), so we can easily shave some code by > reusing it in both cases. This also takes care of the issues for > temporary files, and it also fixes an extra bug coming from the original > implementation which checked the file patterns without looking at the > file type first, so the tool could miss some cascading paths. Using some existing code is certainly better than writing new code. This doesn't change my opinion of the bigger question though, which is to what extent we should be implicitly supporting extensions and whatever else putting files into the database and tablespace directories. Even if we go with this proposed change to look at the relation filename, I'd be happier with some kind of warning being thrown when we come across files we don't recognize in directories that aren't intended to have random files showing up. Thanks! Stephen
signature.asc
Description: PGP signature