On Mon, Jun 26, 2023 at 4:16 PM James Coleman <jtc...@gmail.com> wrote: > > On Mon, Jun 26, 2023 at 4:00 PM Andres Freund <and...@anarazel.de> wrote: > > > > Hi, > > > > On 2023-06-26 13:40:49 -0400, James Coleman wrote: > > > Have we ever discussed running an analyze immediately after creating a > > > table? > > > > That doesn't make a whole lot of sense to me - we could just insert the > > constants stats we wanted in that case. > > > > I thought that was implicit in that, but fair enough :) > > > > Consider the following: > > > > > > create table stats(i int, t text not null); > > > explain select * from stats; > > > Seq Scan on stats (cost=0.00..22.70 rows=1270 width=36 > > > analyze stats; > > > explain select * from stats; > > > Seq Scan on stats (cost=0.00..0.00 rows=1 width=36) > > > > > > Combined with rapidly increasing error margin on row estimates when > > > adding joins means that a query joining to a bunch of empty tables > > > when a database first starts up can result in some pretty wild plan > > > costs. > > > > The issue is that the table stats are likely going to quickly out of date in > > that case, even a hand full of inserts (which wouldn't trigger > > autovacuum analyzing) would lead to the "0 rows" stats causing very bad > > plans. > > > > It's not obvious to me (as noted elsewhere in the thread) which is > worse: a bunch of JOINs on empty tables can result in (specific > example) plans with cost=15353020, and then trigger JIT, and...here we > collide with my other thread about JIT [1]. > > Regards, > James Coleman > > 1: > https://www.postgresql.org/message-id/CAAaqYe-g-Q0Mm5H9QLcu8cHeMwok%2BHaxS4-UC9Oj3bK3a5jPvg%40mail.gmail.com
Thinking about this a bit more: it seems like what we're missing is either: 1. A heuristic for "this table will probably remain empty", or 2. A way to invalidate "0 rows" stats more quickly on even a handful of inserts. I think one of those (ignoring questions about "how" for now) would solve both cases? Regards, James Coleman