On Thu, 27 Jun 2024 at 03:00, Andrei Lepikhov <lepi...@gmail.com> wrote: > I tried to invent a simple solution to fight this minor case. But the > most clear and straightforward way here is to save a reference to the > expression that triggered the PathKey creation inside the PathKey itself. > See the sketch of the patch in the attachment. > I'm not sure this instability is worth fixing this way, but the > dependence of the optimisation outcome on the query text looks buggy.
I don't think that's going to work as that'll mean it'll just choose whichever expression was used when the PathKey was first created. For your example query, both PathKey's are first created for the GROUP BY clause in standard_qp_callback(). I only have to change the GROUP BY in your query to use the equivalent column in the other table to get it to revert back to the plan you complained about. postgres=# EXPLAIN (costs off) SELECT count(*) FROM test t1, test t2 WHERE t1.x=t2.y AND t1.y=t2.x GROUP BY t2.y,t2.x; QUERY PLAN ---------------------------------------------------------- GroupAggregate Group Key: t2.y, t2.x -> Sort Sort Key: t2.y, t2.x -> Merge Join Merge Cond: (t1.y = t2.x) Join Filter: (t2.y = t1.x) -> Index Scan using test_y_idx on test t1 -> Index Scan using test_x_idx on test t2 (9 rows) Maybe doing something with estimate_num_groups() to find the EquivalenceClass member with the least distinct values would be better. I just can't think how that could be done in a performant way. David