Re: Slow query and wrong row estimates for CTE

Dane Foster Wed, 17 Feb 2021 08:52:07 -0800

A small update (see below/inline).


On Tue, Feb 16, 2021 at 2:11 PM Dane Foster <studdu...@gmail.com> wrote:

> Short conclusion:
> Switching from CTEs to temporary tables and analyzing reduced the runtime
> from 15 minutes to about 1.5 minutes.
>
>
> Longer conclusion:
>
> @Justin Pryzby <pry...@telsasoft.com>
>
>    - I experimented w/ materializing the CTEs and it helped at the
>    margins but did not significantly contribute to a reduction in runtime.
>    - No clustering was required because once I switched to temporary
>    tables the new plan no longer used the for_upsert index.
>
> @Michael Lewis <mle...@entrata.com>
>
>    - Increasing work_mem to 100MB (up from 4MB) helped at the margins
>    (i.e., some 100's of millisecond improvement) but did not represent a
>    significant reduction in the runtime.
>    - It wasn't obvious to me which window function would be appropriate
>    for the problem I was trying to solve therefore I didn't experiment w/ that
>    approach.
>
> I want to update/correct this statement:

>
>    - The selectivity of score_name='student_performance_index' was not
>    enough for the planner to choose an index over doing a FTS.
>
> I added a partial index (WHERE
score_name='student_performance_index'::citext) and that had a *dramatic*
impact. That part of the query went from ~12 seconds to ~1 second.

> Finally, thank you both for helping me bring this poor performing query to
> heel. Your insights were helpful and greatly appreciated.
>
> Sincerely,
>
> Dane
>
>
> On Tue, Feb 16, 2021 at 10:25 AM Dane Foster <studdu...@gmail.com> wrote:
>
>>
>> On Tue, Feb 16, 2021 at 10:13 AM Michael Lewis <mle...@entrata.com>
>> wrote:
>>
>>>                                                    Sort Method: external
>>>> merge  Disk: 30760kB
>>>>                                                    Worker 0:  Sort
>>>> Method: external merge  Disk: 30760kB
>>>>                                                    Worker 1:  Sort
>>>> Method: external merge  Disk: 30760kB
>>>>
>>>
>>> If you can increase work_mem, even setting it temporarily higher for the
>>> session or transaction, that may dramatically change the plan.
>>>
>> I will try increasing work_mem for the session later today.
>>
>>> The advice given by Justin particularly about row estimates would be
>>> wise to pursue.
>>>
>>
>>
>>> I'd wonder how selective that condition of score_name =
>>> 'student_performance_index' is in filtering out many of the 9.3 million
>>> tuples in that table and if an index with that as the leading column, or
>>> just an index on that column would be helpful.
>>>
>> There are 1,206,355 rows where score_name='student_performance_idex'.
>>
>>> You'd need to look at pg_stats for the table and see how many distinct
>>> values, and if student_performance_index is relatively high or low (or not
>>> present) in the MCVs list.
>>>
>> I will look into that.
>>
>>
>>> I am not sure if your query does what you want it to do as I admit I
>>> didn't follow your explanation of the desired behavior. My hunch is that
>>> you want to make use of a window function and get rid of one of the CTEs.
>>>
>> If you could tell me what part(s) are unclear I would appreciate it so
>> that I can write a better comment.
>>
>> Thank you sooo much for all the feedback. It is *greatly* appreciated!
>> Sincerely,
>>
>> Dane
>>
>>

Re: Slow query and wrong row estimates for CTE

Reply via email to