Re: pg_upgrade: Pass -j down to vacuumdb

Justin Pryzby Wed, 03 Apr 2019 14:25:13 -0700

On Wed, Apr 03, 2019 at 04:42:14PM -0400, Jeff Janes wrote:
> So maybe the first stage could
> be run by pg_upgrade itself, while the new server is still running on a
> linux socket in a private directory.


I think that would take too long.  It would be less of an issue if pg_upgrade
reported progress during the analyze.

For our upgrades (which typically take ~15min, though for several customers
they take up to ~60min), I analyze only base tables (essentially, those which
are neither parents nor children), then start services, then run ANALYZE with
the default stats target.  I would want to avoid delaying the services restart
by more than another (say) 5 minutes, and I would want to avoid even that
unless a progress report indicated that the analyze was projected to take only
a few more minutes.
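
As a rough sketch only (not the script actually used for these upgrades), the
"base tables" pass could be driven by a catalog query like the one below.  It
assumes that "parents" and "children" mean tables listed in pg_inherits; the
names BASE_TABLES_SQL and analyze_statements are made up for illustration.

```python
# Hypothetical helper; assumes "base table" means an ordinary table
# (relkind 'r') that appears in pg_inherits neither as a child (inhrelid)
# nor as a parent (inhparent).
BASE_TABLES_SQL = """
SELECT c.oid::regclass AS tbl
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (SELECT 1 FROM pg_inherits i
                  WHERE i.inhrelid = c.oid OR i.inhparent = c.oid)
"""


def analyze_statements(tables):
    """Return one ANALYZE statement per table name.

    Names are expected to come from the regclass column of the query above,
    so they are already quoted and schema-qualified where necessary.
    """
    return [f"ANALYZE {t};" for t in tables]


if __name__ == "__main__":
    # Example with made-up table names; in practice the list would come
    # from running BASE_TABLES_SQL against the new cluster.
    for stmt in analyze_statements(["public.sites", "public.users"]):
        print(stmt)
```

The remaining parent and child tables would then be analyzed after services
are back up, as described above.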

I just did a test on one of our large-but-not-huge customers.  With
default_statistics_target=1, analyzing a 145GB partitioned table looks like
it'll take perhaps an hour; they have ~1TB of data, so delaying services during
ANALYZE would nullify the utility of pg_upgrade.  For comparison, I can restore
the essential tables from backup in 15-30 minutes.

It might be fine if pg_upgrade took an option which enabled analyze, perhaps
instead of outputting analyze_new_cluster.sh.  But actually, a problem with
*that* is that currently pg_upgrade avoids starting the new cluster.  That
seems to be deliberate, since, with --link, that's an irreversible operation:
it's unsafe to start the old cluster afterwards.

Tangent: I have a queued mail from ~15 months ago wherein I proposed adding to
pg_upgrade an option to remove the old data dir (or probably only the files
associated with known relations).  I realized at the time that would be pretty
scary without having first verified that the new cluster at least starts.  I'm
not sure how good an idea that is, but --startnewcluster would be needed there,
too.

Justin
