Re: postgres chooses objectively wrong index

Alexey Ermakov Wed, 18 Mar 2026 13:12:38 -0700

On 2026-03-19 00:38, Merlin Moncure wrote:

On Tue, Mar 17, 2026 at 11:27 PM Alexey Ermakov<[email protected]> wrote:


    On 2026-03-18 04:52, Merlin Moncure wrote:


        Hello. I think planner doesn't have information about
        distribution of *async.task_execution_state(task)* unless
        it's part of any full index. I would try to give that with
        extended statistics (postgresql 14+):

        create statistics (mcv) task_task_execution_state_stat on 
((async.task_execution_state(task))) from async.task;
        analyze async.task;

        If that won't help - please show distribution from
        pg_stats_ext view for extended statistic above.


    This unfortunately fails, probably because the table type
    includes system columns (despite not using them).
    orchestrator_service_user@orchestrator=> create statistics
     task_stats (mcv) on (async.task_execution_state(task)) from
    async.task;
    ERROR:  statistics creation on system columns is not supported

    This would require some refactoring to fix.


    Interesting... In that case functional index should help (as it
    also makes statistic for the planner):

    create index concurrently on task_task_execution_state_idx async.task using 
btree ((async.task_execution_state(task)));

    analyze async.task;


This can't help performance, as the index...
CREATE INDEX ON async.task(concurrency_pool, priority, entered)
WHERE async.task_execution_state(task) = 'READY';

...is very precisely configured to provide exactly what's needed; Ineed tasks for that exact pool in that exact order if and only ifready. The partial predicate is designed to keep the index nice andsmall since only a small portion of tasks would be eligible at anyspecific time.

The index I suggested was not intended to be used by such queries, it'sonly a way to provide statistics for the planner as `create statistics`on expression is not working here.

It might not be enough and require additional columns (in that case itwill be replacement for your index) or perhaps elevated statisticstarget. It could even make things worse but I'm sure you have ways totest that before putting it on an important database.

It should help if planner underestimate number of rows but even thentotal estimation won't be perfect when we have 2 conditions on statethat obviously statistically dependent. Combining both conditions onapplication side would work much better if that is possible...

What would also might help there - output of `explain (analyze,buffers)` of a query that really had a bad plan and executes in secondswith sizes of indexes. And same output but with `set enable_sort = off`to see plan that supposed to be better. And just in case number oflive/dead tuples in that table from pg_stat_user_tables.


--

Alexey Ermakov

Re: postgres chooses objectively wrong index

Reply via email to