On Tue, 23 Apr 2024, 05:52 Tom Lane, <t...@sss.pgh.pa.us> wrote:
> Jeff Davis <pg...@j-davis.com> writes:
> > On Mon, 2024-04-22 at 16:19 -0400, Tom Lane wrote:
> >> Loading data without stats, and hoping
> >> that auto-analyze will catch up sooner not later, is exactly the
> >> current behavior that we're doing all this work to get out of.
>
> > That's the disconnect, I think. For me, the main reason I'm excited
> > about this work is as a way to solve the bad-plans-after-upgrade
> > problem and to repro planner issues outside of production. Avoiding the
> > need to ANALYZE at the end of a data load is also a nice convenience,
> > but not a primary driver (for me).
>
> Oh, I don't doubt that there are use-cases for dumping stats without
> data. I'm just dubious about the reverse. I think data+stats should
> be the default, even if only because pg_dump's default has always
> been to dump everything. Then there should be a way to get stats
> only, and maybe a way to get data only. Maybe this does argue for a
> four-section definition, despite the ensuing churn in the pg_dump API.
I've heard of use cases where dumping stats without data would help with
debugging production planner issues on a non-production system. Sure,
some other planner inputs would have to be taken into account too, but
having an exact copy of the production stats is at least a start, and
can help build models and alerts for what'll happen when the tables
grow larger with the current stats.

As for those other planner inputs: table size is relatively easy to
shim with sparse files (rough sketch below); cumulative statistics can
be copied from a donor replica if needed; and btree indexes only really
need to contain their highest and lowest values (and have their height
set correctly).
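To illustrate the sparse-file shim, here's a minimal sketch, assuming
GNU coreutils and a throwaway test cluster; the table name 't' is just
an example. pg_relation_filepath() returns the path relative to the
data directory, and the planner (estimate_rel_size() in plancat.c)
scales reltuples by the file's apparent size in blocks, so a sparse
extension is enough for EXPLAIN purposes:

    # grab the relation's file path while the server is still running
    relpath=$(psql -Atc "SELECT pg_relation_filepath('t')")
    pg_ctl -D "$PGDATA" stop
    # extend the segment sparsely to 1GB: allocates no actual disk space
    truncate -s 1G "$PGDATA/$relpath"
    pg_ctl -D "$PGDATA" start
    # EXPLAIN now plans against ~131072 pages; don't read the table,
    # though, as the added pages are all zeroes.

Note that relation segments are capped at 1GB by default, so making a
table look larger than that would need extra .1, .2, ... segment files.

Kind regards,

Matthias van de Meent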