Aggressive autoanalyze and autovacuum settings solve most query problems. These are my settings: default_statistics_target = 5000 autovacuum_vacuum_scale_factor = 0.015 autovacuum_vacuum_threshold = 250 autovacuum_analyze_scale_factor = 0.015 autovacuum_analyze_threshold = 250
Such a high default_statistics_target value is controversial, but works for our databases, and resetting it to 100 doesn't noticably speed up slow parse/optimize on queries that take a long time to parse/optimize any more than the 5000 value. On Mon, Aug 26, 2024 at 6:30 AM Siraj G <tosira...@gmail.com> wrote: > Thanks Tom. Collecting full stats on the tables involved corrected the > execution. > > On Tue, Aug 13, 2024 at 9:57 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > >> Siraj G <tosira...@gmail.com> writes: >> > We migrated a PgSQL database from Cloud SQL to compute engine and since >> > then there is a SQL we observed taking a long time. After some study, I >> > found that the SQL is using NESTED LOOP where the cost is too high. >> >> The core of your problem seems to be here: >> >> > -> Index Scan using >> marketing_a_cancel__55ffff_idx on >> > marketing_app_leadhistory w0 (cost=0.57..4274.30 rows=1 width=8) >> (actual >> > time=46.678..51.232 rows=44 loops=1) >> > Index Cond: ((cancel_event_id IS NOT NULL) >> AND >> > (cancel_event_type = 1)) >> > Filter: ((status_id = 93) AND >> > ((followup_date)::date >= '2024-08-01'::date) AND >> ((followup_date)::date <= >> > '2024-08-07'::date)) >> > Rows Removed by Filter: 22268 >> > Buffers: shared hit=9170 read=19 >> >> If the planner had estimated 40-some rows out of this step, rather >> than one, it would certainly not have chosen to use nestloop joins >> atop this. So the big problem to focus on is making that estimate >> better. >> >> A secondary problem is that the choice of index seems poor: the >> index itself is selecting 44+22268 = 22312 rows and then the filter >> condition is throwing away 99.8% of those rows. Probably, using >> an index on (status_id, followup_date) would have worked better. >> >> I suspect that both of these things are tied to the non-normalization >> of your "cancel" condition. The planner probably believes that >> "cancel_event_id IS NOT NULL" is statistically independent of >> "cancel_event_type = 1"; but I'll bet it isn't, and thus the index >> condition selects many more rows than the planner guessed. You might >> be able to improve that estimate by creating extended stats on both of >> those columns, but really a better idea would be to take a step back >> and figure out if those two columns can't be merged into one. >> >> regards, tom lane >> > -- Death to America, and butter sauce. Iraq lobster!