Aggressive autoanalyze and autovacuum settings have solved most of these
bad-estimate query problems for us. These are my settings:
default_statistics_target = 5000
autovacuum_vacuum_scale_factor = 0.015
autovacuum_vacuum_threshold = 250
autovacuum_analyze_scale_factor = 0.015
autovacuum_analyze_threshold = 250

Such a high default_statistics_target value is controversial, but it works
for our databases. Resetting it to 100 didn't noticeably speed up
parse/optimize time on the queries that are slow to plan, compared with
the 5000 value.
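
If a global 5000 feels too heavy, the same knobs can be applied per column
and per table instead. A sketch, using the table from the thread below as
the example (your table and column names will differ):

```sql
-- Raise the sample size for just one column instead of globally
ALTER TABLE marketing_app_leadhistory
  ALTER COLUMN followup_date SET STATISTICS 5000;

-- Per-table autovacuum/autoanalyze overrides (storage parameters)
ALTER TABLE marketing_app_leadhistory SET (
  autovacuum_vacuum_scale_factor  = 0.015,
  autovacuum_analyze_scale_factor = 0.015
);

-- Stats are only picked up after the next analyze
ANALYZE marketing_app_leadhistory;
```

Per-table overrides keep the planning cost of huge statistics targets off
the tables that don't need them.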

On Mon, Aug 26, 2024 at 6:30 AM Siraj G <tosira...@gmail.com> wrote:

> Thanks Tom. Collecting full stats on the tables involved corrected the
> execution.
>
> On Tue, Aug 13, 2024 at 9:57 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
>> Siraj G <tosira...@gmail.com> writes:
>> > We migrated a PgSQL database from Cloud SQL to compute engine and since
>> > then there is a SQL we observed taking a long time. After some study, I
>> > found that the SQL is using NESTED LOOP where the cost is too high.
>>
>> The core of your problem seems to be here:
>>
>> >   ->  Index Scan using marketing_a_cancel__55ffff_idx on
>> >       marketing_app_leadhistory w0  (cost=0.57..4274.30 rows=1 width=8)
>> >       (actual time=46.678..51.232 rows=44 loops=1)
>> >         Index Cond: ((cancel_event_id IS NOT NULL) AND (cancel_event_type = 1))
>> >         Filter: ((status_id = 93) AND ((followup_date)::date >= '2024-08-01'::date)
>> >                 AND ((followup_date)::date <= '2024-08-07'::date))
>> >         Rows Removed by Filter: 22268
>> >         Buffers: shared hit=9170 read=19
>>
>> If the planner had estimated 40-some rows out of this step, rather
>> than one, it would certainly not have chosen to use nestloop joins
>> atop this.  So the big problem to focus on is making that estimate
>> better.
>>
>> A secondary problem is that the choice of index seems poor: the
>> index itself is selecting 44+22268 = 22312 rows and then the filter
>> condition is throwing away 99.8% of those rows.  Probably, using
>> an index on (status_id, followup_date) would have worked better.
>>
>> I suspect that both of these things are tied to the non-normalization
>> of your "cancel" condition.  The planner probably believes that
>> "cancel_event_id IS NOT NULL" is statistically independent of
>> "cancel_event_type = 1"; but I'll bet it isn't, and thus the index
>> condition selects many more rows than the planner guessed.  You might
>> be able to improve that estimate by creating extended stats on both of
>> those columns, but really a better idea would be to take a step back
>> and figure out if those two columns can't be merged into one.
>>
>>                         regards, tom lane
>>
>
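
For anyone finding this thread later, Tom's two suggestions translate to
roughly the following. This is a sketch: the statistics-object and index
names are made up, and whether extended statistics improve the estimate
for an IS NOT NULL clause depends on your Postgres version, so verify
with EXPLAIN ANALYZE afterward.

```sql
-- Extended statistics over the two correlated "cancel" columns
CREATE STATISTICS leadhistory_cancel_stats (dependencies, mcv)
  ON cancel_event_id, cancel_event_type
  FROM marketing_app_leadhistory;
ANALYZE marketing_app_leadhistory;

-- The narrower index Tom suggests, matching the filter columns
CREATE INDEX leadhistory_status_followup_idx
  ON marketing_app_leadhistory (status_id, followup_date);
```

The longer-term fix he describes, merging cancel_event_id and
cancel_event_type into one column, removes the cross-column dependency
entirely rather than teaching the planner about it.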

