On Mon, Jun 26, 2023 at 4:16 PM James Coleman <jtc...@gmail.com> wrote:
>
> On Mon, Jun 26, 2023 at 4:00 PM Andres Freund <and...@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2023-06-26 13:40:49 -0400, James Coleman wrote:
> > > Have we ever discussed running an analyze immediately after creating a 
> > > table?
> >
> > That doesn't make a whole lot of sense to me - we could just insert the
> > constants stats we wanted in that case.
> >
>
> I thought that was implicit in that, but fair enough :)
>
> > > Consider the following:
> > >
> > > create table stats(i int, t text not null);
> > > explain select * from stats;
> > >    Seq Scan on stats  (cost=0.00..22.70 rows=1270 width=36
> > > analyze stats;
> > > explain select * from stats;
> > >    Seq Scan on stats  (cost=0.00..0.00 rows=1 width=36)
> > >
> > > Combined with rapidly increasing error margin on row estimates when
> > > adding joins means that a query joining to a bunch of empty tables
> > > when a database first starts up can result in some pretty wild plan
> > > costs.
> >
> > The issue is that the table stats are likely going to quickly out of date in
> > that case, even a hand full of inserts (which wouldn't trigger
> > autovacuum analyzing) would lead to the "0 rows" stats causing very bad 
> > plans.
> >
>
> It's not obvious to me (as noted elsewhere in the thread) which is
> worse: a bunch of JOINs on empty tables can result in (specific
> example) plans with cost=15353020, and then trigger JIT, and...here we
> collide with my other thread about JIT [1].
>
> Regards,
> James Coleman
>
> 1: 
> https://www.postgresql.org/message-id/CAAaqYe-g-Q0Mm5H9QLcu8cHeMwok%2BHaxS4-UC9Oj3bK3a5jPvg%40mail.gmail.com

Thinking about this a bit more: it seems like what we're missing is either:

1. A heuristic for "this table will probably remain empty", or
2. A way to invalidate "0 rows" stats more quickly on even a handful of inserts.

I think one of those (ignoring questions about "how" for now) would
solve both cases?

Regards,
James Coleman


Reply via email to