On 2018-10-19 14:11:03 -0400, Stephen Frost wrote: > Greetings, > > * Andres Freund (and...@anarazel.de) wrote: > > On 2018-10-19 13:52:28 -0400, Stephen Frost wrote: > > > > > I also categorically disagree with the notion that it's ok for > > > > > extensions to dump files into our tablespace diretories or that we > > > > > should try to work around random code dumping extra files in the > > > > > directories which we maintain- it's not like this additional code will > > > > > actually protect us from that, after all, and it's foolish to think we > > > > > really could protect against that. > > > > > > > > I'm unconvinced. There already are extensions doing so, and it's not > > > > like we've given them any sort of reasonable alternative. You can't just > > > > create relations in a "registered" / "supported" kinda way right now, so > > > > uh, yea, extension gotta make do. And that has worked for many years. > > > > > > I don't agree that any of this is an argument for accepting random code > > > writing into arbitrary places in the data directory or tablespace > > > directories as being something which we support or which we write code > > > to work-around. > > > > > > If this is a capability that extension authors need then I'm all for > > > having a way for them to have that capability in a clearly defined and > > > documented way- and one which we could actually write code to handle and > > > work with. > > > > cstore e.g. does this, and it's already out there. So yes, we should > > provide better infrastructure. But for now, we gotta deal with reality, > > unless we just want to gratuituously break things. > > No, I don't agree that we are beholden to an external extension that > decided to start writing into directories that clearly belong to PG.
Did it have an alternative? > How do we even know that what cstore does in the tablespace directory > wouldn't get caught up in the checksum file pattern-matching that this > commit put in? You listen to people? > What if there was an extension which did create files that would > match, what would we do then? I'm happy to go create one, if that'd > help make the point that this "pattern whitelist" isn't actually a > solution but is really rather just a hack that'll break in the future. Oh, for crying out loud. Yes, obviously you can create conflicts, nobody ever doubted that. How on earth is that a useful point for anything? The point isn't that it's a good idea for extensions to do so, but that it has happened because we didn't provide better alternative. > > > > And I fail to see why a blacklist is architecturally better. There's > > > > plenty cases where we might want to create temporary files, > > > > non-checksummed data or such that we'd need a teach a blacklist about, > > > > but there's far fewer cases where add new checksummed files. Basically > > > > never since checksums have been introduced. And if checksums were > > > > introduced for something new, say slrus, we'd ceertainly use > > > > pg_verify_checksum during development. > > > > > > In cases where we need something temporary, we're going to need to be > > > prepared to clean those up and we had better make it very plain that > > > they're temporary and easy to write code for. Anything we aren't > > > prepared to blow away on a crash-restart should be checksum'd and in an > > > ideal world, we'd be looking to reduce the blacklist to only those > > > things which are temporary. > > > > There's pending patches that add support for pg_verify_checksums running > > against a running cluster. We'll not just need to recognize files that > > are there after a graceful shutdown, but with anything that a cluster > > can have there while running. > > Of course- the same is true with a crash/restart case, so I'm not sure > what you're getting at here. pg_verify_checksum doesn't support running on a crashed cluster, and I'm not sure it'd make sense to teach it to so (there's not really much it could do to discern whether a block is a torn page that replay will fix, or corrupted). > I wasn't ever thinking of only the graceful shutdown case- certainly > pg_basebackup operates when the cluster is running also and that's > more what I was thinking about from the start and which is part of > what started the discussion about this commit as that was entirely > ignored in this change. It was mainly ignored in the original pg_verify_checksums change, not in the commit discussed here. That was broken. That built separate logic. Both basebackup an verify checksums already only scan certain directories. They *already* are encoding directory structure information.