On Sat, Mar 1, 2025 at 9:48 PM Jeff Davis <pg...@j-davis.com> wrote:
> On Sat, 2025-03-01 at 13:52 -0500, Greg Sabino Mullane wrote:
> > > Can you expand on some of those cases?
> >
> > Certainly. I think one of the problems is that because this patch is
> > solving a pg_upgrade issue, the focus is on the "dump and restore"
> > scenarios. But pg_dump is used for much more than that, especially
> > "dump and examine".
>
> Thank you for going through these examples.
>
> > I just don't think it should be enabled by default for everything
> > using pg_dump. For the record, I would not strongly object to having
> > stats on by default for binary dumps, although I would prefer them
> > off.
>
> I am open to that idea, I just want to get it right, because probably
> whatever the default is in 18 will stay that way.
>
> Also, we will need to think through the set of pg_dump options again. A
> lot of our tools seem to assume that "if it's the default, we don't
> need a way to ask for it explicitly", which makes it a lot harder to
> ever change the default and keep a coherent set of options.

That's a good point in general, and definitely something we should
think through, independently of this patch.

> > So why not just expect people to modify their programs to use
> > --no-statistics for cases like this? That's certainly an option, but
> > it's going to break a lot of existing things, and create branching
> > code:
>
> I suggest that we wait a bit to see what additional feedback we get
> early in beta.

I definitely think it should be on by default.

FWIW, I've seen many cases of people using automated tools to verify
the *schema* between two databases. I'd say that's quite common. But
they use pg_dump -s, which I believe is not affected by this one.

I don't think I've ever come across an automated tool that verifies the
contents of an entire database this way. That doesn't mean it's not out
there, of course, just that it's not so common.

In the cases where I've seen pg_dump used to verify contents, it has
always been in combination with a myriad of other switches, such as
include/exclude of specific tables. Adding just one more switch to
those seems like a small price to pay for having the default behaviour
be a big improvement for the majority of use cases.

> > Also, anything trained to parse pg_dump output will have to learn
> > about the new SELECT pg_restore_ calls with their multi-line formats
> > (not 100% sure we don't have that anywhere, as things like "SELECT
> > setval" and "SELECT set_config" are single line, but there may be
> > existing things)

That's going to be true every time we add something to pg_dump. And for
that matter, anything new to *postgresql*, since surely we'd want
pg_dump to dump new objects by default. Any tool that parses the pg_dump
output directly will always have to carefully analyze each new version.
And it probably shouldn't be using the plaintext format in the first
place - and if using pg_restore, it comes out as its own type of object,
making it easy to exclude at that level.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/