On 2025/06/25 5:07, Robert Haas wrote:
> On Tue, Jun 24, 2025 at 12:48 PM Nathan Bossart
> <nathandboss...@gmail.com> wrote:
>> On Mon, Jun 23, 2025 at 01:38:10PM -0400, Robert Haas wrote:
>>> I had thought we had a consensus that pg_upgrade should preserve stats
>>> but regular pg_dump shouldn't include them; perhaps I misunderstood
>>> or that changed.
>> I think it's a bit of both. I skimmed through the past discussions and
>> found that not only was there a rough consensus in 2024 that stats _should_
>> be on by default [0], but also that an updated vote tally didn't show much
>> of a consensus at all [1]. Like you, I thought we had pretty much closed
>> that door, but the aforementioned analysis along with further discussion
>> has me convinced that we might want to reconsider [2].
> Well, I don't know: I still think that's the right answer, so I don't
> really want to reconsider, but I understand that I'm not in charge
> here.
> For the record, my vote is: default "off" for pg_dump and pg_dumpall,
> and "on" for pg_restore.
For pg_dump and pg_dumpall, I agree with Jeff's idea in [1],
but if statistics are skipped by default, I don't think
we need a --no-statistics option. So, here's how I think
the options should work:
* Keep: --schema-only, --data-only, --statistics-only, --no-schema,
--no-data, and --statistics
* Remove: --no-statistics, --with-schema, and --with-data
* Combinations:
    Schema + Data + Stats : --statistics
    Schema + Data         : (default)
    Schema + Stats        : --no-data --statistics
    Data + Stats          : --no-schema --statistics
    Schema only           : --schema-only (or --no-data)
    Data only             : --data-only (or --no-schema)
    Stats only            : --statistics-only (or --no-schema --no-data --statistics)
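To make the combinations above easier to check, here is a rough Python sketch of how the flags could resolve to what gets dumped. It is only a model of the proposal, not pg_dump code: resolve_pg_dump is a hypothetical name, and rejecting an -only option together with --statistics is my assumption about how this variant would behave.

```python
ONLY_FLAGS = ("--schema-only", "--data-only", "--statistics-only")


def resolve_pg_dump(*flags):
    """Return (schema, data, stats) booleans for the proposed pg_dump options.

    Hypothetical model of the proposal; not taken from pg_dump itself.
    """
    chosen = set(flags)
    only = [f for f in ONLY_FLAGS if f in chosen]
    if len(only) > 1:
        raise ValueError("at most one -only option may be given")
    if only:
        # Assumption: in this variant an -only option cannot be combined
        # with --statistics.
        if "--statistics" in chosen:
            raise ValueError(f"{only[0]} cannot be combined with --statistics")
        return tuple(f == only[0] for f in ONLY_FLAGS)
    # Defaults: schema and data are dumped, statistics are not.
    return (
        "--no-schema" not in chosen,
        "--no-data" not in chosen,
        "--statistics" in chosen,
    )


print(resolve_pg_dump("--no-data", "--statistics"))  # (True, False, True)
```

With this model, --statistics-only and --no-schema --no-data --statistics resolve to the same result, which matches the "Stats only" row.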
As I mentioned in [2], if we treat --statistics in a similar way to
--sequence-data, i.e., allow --statistics to be used with --schema-only
or --data-only, we could simplify further:
* Keep: --schema-only, --data-only, --statistics-only, and --statistics
* Remove: --no-schema, --no-data, --no-statistics, --with-schema, and
--with-data
* Combinations:
    Schema + Data + Stats : --statistics
    Schema + Data         : (default)
    Schema + Stats        : --schema-only --statistics
    Data + Stats          : --data-only --statistics
    Schema only           : --schema-only
    Data only             : --data-only
    Stats only            : --statistics-only
Some may find this confusing due to mixing --statistics with --schema-only
or --data-only, so I understand if there's hesitation.
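The simplified variant can be modeled the same way. The sketch below is hypothetical (resolve_simplified and TABLE are illustrative names, not pg_dump internals): it treats --statistics as an additive switch, the way the text describes --sequence-data, and checks the result against the seven combinations listed above.

```python
def resolve_simplified(*flags):
    """Return (schema, data, stats) for the simplified pg_dump proposal.

    Hypothetical model: --statistics simply adds statistics to whatever
    --schema-only or --data-only has selected.
    """
    chosen = set(flags)
    only = [f for f in ("--schema-only", "--data-only", "--statistics-only")
            if f in chosen]
    if len(only) > 1:
        raise ValueError("at most one -only option may be given")
    if "--statistics-only" in chosen:
        return (False, False, True)
    schema = "--data-only" not in chosen
    data = "--schema-only" not in chosen
    return (schema, data, "--statistics" in chosen)


# The combinations from the list above, as (flags) -> (schema, data, stats).
TABLE = {
    ("--statistics",): (True, True, True),
    (): (True, True, False),
    ("--schema-only", "--statistics"): (True, False, True),
    ("--data-only", "--statistics"): (False, True, True),
    ("--schema-only",): (True, False, False),
    ("--data-only",): (False, True, False),
    ("--statistics-only",): (False, False, True),
}

for flags, expected in TABLE.items():
    assert resolve_simplified(*flags) == expected
```

Every combination has exactly one spelling here, which is the main simplification compared with the first variant.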
For pg_restore, I believe there's agreement to restore statistics
by default if they exist in the archive. So:
* Keep: --schema-only, --data-only, --statistics-only, --no-schema,
--no-data, and --no-statistics
* Remove: --with-schema, --with-data, and --statistics
* Combinations:
    Schema + Data + Stats : (default)
    Schema + Data         : --no-statistics
    Schema + Stats        : --no-data
    Data + Stats          : --no-schema
    Schema only           : --schema-only (or --no-data --no-statistics)
    Data only             : --data-only (or --no-schema --no-statistics)
    Stats only            : --statistics-only (or --no-schema --no-data)
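For pg_restore the defaults flip, so the same kind of sketch starts with everything enabled. This is again a hypothetical model (resolve_pg_restore is an illustrative name), and it assumes the archive actually contains schema, data, and statistics. It checks that each -only option is equivalent to the pair of --no-* options given in parentheses above.

```python
PARTS = ("schema", "data", "statistics")


def resolve_pg_restore(*flags):
    """Return (schema, data, stats) for the proposed pg_restore options.

    Hypothetical model; assumes the archive contains all three parts.
    """
    chosen = set(flags)
    only = [p for p in PARTS if f"--{p}-only" in chosen]
    if len(only) > 1:
        raise ValueError("at most one -only option may be given")
    if only:
        return tuple(p == only[0] for p in PARTS)
    # Default: restore everything unless a --no-* option excludes it.
    return tuple(f"--no-{p}" not in chosen for p in PARTS)


# Each -only option and its proposed --no-* equivalent.
EQUIVALENT = [
    (("--schema-only",), ("--no-data", "--no-statistics")),
    (("--data-only",), ("--no-schema", "--no-statistics")),
    (("--statistics-only",), ("--no-schema", "--no-data")),
]

for short_form, long_form in EQUIVALENT:
    assert resolve_pg_restore(*short_form) == resolve_pg_restore(*long_form)
```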
Thoughts?
Regards,
[1] https://postgr.es/m/031558c60e84362898922caa6a90587e7fdf2a57.ca...@j-davis.com
[2] https://postgr.es/m/94f89b0a-5d83-4a67-9092-50ba39134...@oss.nttdata.com
--
Fujii Masao
NTT DATA Japan Corporation