Re: Incremental Sort Cost Estimation Instability

David Rowley Wed, 11 Sep 2024 18:05:51 -0700

On Thu, 27 Jun 2024 at 03:00, Andrei Lepikhov <[email protected]> wrote:
> I tried to invent a simple solution to fight this minor case. But the
> most clear and straightforward way here is to save a reference to the
> expression that triggered the PathKey creation inside the PathKey itself.
> See the sketch of the patch in the attachment.
> I'm not sure this instability is worth fixing this way, but the
> dependence of the optimisation outcome on the query text looks buggy.


I don't think that's going to work as that'll mean it'll just choose
whichever expression was used when the PathKey was first created.  For
your example query, both PathKey's are first created for the GROUP BY
clause in standard_qp_callback(). I only have to change the GROUP BY
in your query to use the equivalent column in the other table to get
it to revert back to the plan you complained about.

postgres=# EXPLAIN (costs off) SELECT count(*) FROM test t1, test t2
WHERE t1.x=t2.y AND t1.y=t2.x GROUP BY t2.y,t2.x;
                        QUERY PLAN
----------------------------------------------------------
 GroupAggregate
   Group Key: t2.y, t2.x
   ->  Sort
         Sort Key: t2.y, t2.x
         ->  Merge Join
               Merge Cond: (t1.y = t2.x)
               Join Filter: (t2.y = t1.x)
               ->  Index Scan using test_y_idx on test t1
               ->  Index Scan using test_x_idx on test t2
(9 rows)

Maybe doing something with estimate_num_groups() to find the
EquivalenceClass member with the least distinct values would be
better. I just can't think how that could be done in a performant way.

David

Re: Incremental Sort Cost Estimation Instability

Reply via email to