On 6/27/25 06:35, Zechman, Derek S wrote:
We recently performed an upgrade from pg14 (14.18) to pg16 (16.9) and performed the analyze-in-stages post upgrade.  It has been noticed that some plans changed to use hash joins instead of nested loops.  Further investigation found it was because the parent table of partitioned tables did not have stats.  After running an ANALYZE on the parent tables we got similar plan an execution times as before.

I have two questions

1 - Why does analyze-in-stages not analyze the parent tables?

2 – What happens if we do not run analyze-in-stages post upgrade and just run an analyze?

It is spelled out in the docs:

https://www.postgresql.org/docs/current/pgupgrade.html

Emphasis added

"Using vacuumdb --all --analyze-only can efficiently generate such statistics, and the use of --jobs can speed it up. Option --analyze-in-stages can be used to generate **minimal statistics** quickly. If vacuum_cost_delay is set to a non-zero value, this can be overridden to speed up statistics generation using PGOPTIONS, e.g., PGOPTIONS='-c vacuum_cost_delay=0' vacuumdb ...."

and from here:

https://www.postgresql.org/docs/current/app-vacuumdb.html

"--analyze-in-stages

Only calculate statistics for use by the optimizer (no vacuum), like --analyze-only. Run three stages of analyze; the first stage uses the lowest possible statistics target (see default_statistics_target) to produce usable statistics faster, and subsequent stages build the full statistics.

This option is only useful to analyze a database that currently has no statistics or has wholly incorrect ones, such as if it is newly populated from a restored dump or by pg_upgrade. Be aware that running with this option in a database with existing statistics may cause the query optimizer choices to become transiently worse due to the low statistics targets of the early stages.
"


Thanks,

Sean


--
Adrian Klaver
adrian.kla...@aklaver.com



Reply via email to