On Tue, Apr 22, 2025 at 11:03:29PM +0200, Christoph Berg wrote:
> Re: Nathan Bossart
>> In any case, IMO it's unfortunate
>> that we might end up recommending roughly the same post-upgrade steps as
>> before even though the optimizer statistics are carried over.
>
> Maybe the docs (and the pg_upgrade scripts) should recommend the old
> procedure by default until this gap is closed? People could then still
> opt to use the new procedure in specific cases.

I think we'd still want to modify the --analyze-in-stages recommendation
(from what is currently recommended for supported versions). If we don't,
you'll wipe out the optimizer stats you brought over from the old version.

Here is a rough draft of what I am thinking.

-- 
nathan
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index df13365b287..648c6e2967c 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -833,17 +833,19 @@ psql --username=postgres --file=script.sql postgres
 
     <para>
      Because not all statistics are not transferred by
-     <command>pg_upgrade</command>, you will be instructed to run a command to
+     <command>pg_upgrade</command>, you will be instructed to run commands to
      regenerate that information at the end of the upgrade. You might need to
      set connection parameters to match your new cluster.
     </para>
 
     <para>
-     Using <command>vacuumdb --all --analyze-only --missing-stats-only</command>
-     can efficiently generate such statistics. Alternatively,
+     First, use
      <command>vacuumdb --all --analyze-in-stages --missing-stats-only</command>
-     can be used to generate minimal statistics quickly. For either command,
-     the use of <option>--jobs</option> can speed it up.
+     to quickly generate minimal optimizer statistics for relations without
+     any. Then, use <command>vacuumdb --all --analyze-only</command> to ensure
+     all relations have updated cumulative statistics for triggering vacuum and
+     analyze. For both commands, the use of <option>--jobs</option> can speed
+     it up.
      If <varname>vacuum_cost_delay</varname> is set to a non-zero
      value, this can be overridden to speed up statistics generation
      using <envar>PGOPTIONS</envar>, e.g., <literal>PGOPTIONS='-c
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index 18c2d652bb6..f1b90c5957e 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -814,9 +814,12 @@ output_completion_banner(char *deletion_script_file_name)
 	}
 
 	pg_log(PG_REPORT,
-		   "Some optimizer statistics may not have been transferred by pg_upgrade.\n"
+		   "Some statistics are not transferred by pg_upgrade.\n"
 		   "Once you start the new server, consider running:\n"
-		   " %s/vacuumdb %s--all --analyze-in-stages --missing-stats-only", new_cluster.bindir, user_specification.data);
+		   " %s/vacuumdb %s--all --analyze-in-stages --missing-stats-only\n"
+		   " %s/vacuumdb %s--all --analyze-only",
+		   new_cluster.bindir, user_specification.data,
+		   new_cluster.bindir, user_specification.data);
 
 	if (deletion_script_file_name)
 		pg_log(PG_REPORT,
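
For illustration only, the two-step sequence the draft recommends could be
scripted roughly as below. This is a minimal sketch, not part of the patch:
the function name post_upgrade_cmds, the BINDIR and JOBS variables, and the
default path are all hypothetical, and the script only prints the commands
rather than running them.

```shell
#!/bin/sh
# Sketch of the post-upgrade steps from the draft above.
# post_upgrade_cmds, BINDIR and JOBS are illustrative names (assumptions),
# not something pg_upgrade provides.

post_upgrade_cmds() {
    bindir=$1   # directory holding the new cluster's binaries
    jobs=$2     # number of parallel jobs to pass via --jobs

    # Step 1: build minimal optimizer statistics, but only for relations
    # that have none, so statistics carried over by pg_upgrade are kept.
    printf '%s\n' "$bindir/vacuumdb --all --analyze-in-stages --missing-stats-only --jobs=$jobs"

    # Step 2: analyze everything so the cumulative statistics used to
    # trigger vacuum and analyze are up to date.
    printf '%s\n' "$bindir/vacuumdb --all --analyze-only --jobs=$jobs"
}

# Print the commands for a hypothetical install location.
post_upgrade_cmds "${BINDIR:-/usr/local/pgsql/bin}" "${JOBS:-4}"
```

Printing instead of executing keeps the sketch safe to try; the output can be
reviewed and then piped to sh once the new server is running and connection
parameters are set for the new cluster.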