On Wed, 2008-06-04 at 22:18 -0400, Tom Lane wrote: > [ redirecting thread from -performance to -hackers ] > > Simon Riggs <[EMAIL PROTECTED]> writes: > > I've got a test case which shows something related and weird, though not > > the exact case. > > > The queries shown here have significantly different costs, depending > > upon whether we use tables a or b in the query. Since a and b are > > equivalent this result isn't expected at all. > > Hmm. I had been guessing that there was something about your original > query that prevented the system from applying best_appendrel_indexscan, > but after fooling with this a bit, I don't believe that's the issue > at all. The problem is that these two cases "should" be equivalent: > > select ... from a join b on (a.id = b.id) left join c on (a.id = c.id); > > select ... from a join b on (a.id = b.id) left join c on (b.id = c.id); > > but they are not seen that way by the current planner. It correctly > forms an EquivalenceClass consisting of a.id and b.id, but it cannot put > c.id into that same class, and so the clause a.id = c.id is just left > alone; there is noplace that can generate "b.id = c.id" as an > alternative join condition. This means that (for the first query) > we can consider the join orders "(a join b) leftjoin c" and > "(a leftjoin c) join b", but there is no way to consider the join > order "(b leftjoin c) join a"; to implement that we'd need to have the > alternative join clause available. So if that join order is > significantly better than the other two, we lose. > > This is going to take a bit of work to fix :-(. I am toying with the > idea that we could go ahead and put c.id into the EquivalenceClass > as a sort of second-class citizen that's labeled as associated with this > particular outer join --- the implication being that we can execute the > outer join using a generated clause that equates c.id to any one of the > first-class members of the EquivalenceClass, but above the outer join > we can't assume that c.id hasn't gone to null, so it's not really equal > to anything else in the class. I think it might also be possible > to get rid of the reconsider_outer_join_clauses() kluge in favor of > driving transitive-equality-to-a-constant off of this representation.
Yes, EquivalenceClass allows an implication to be made either way around, which is wrong for this class of problem. I was imagining a higher level ImplicationClass that was only of the form A => B but not B => A. So we end up with an ImplicationTree, rather than a just a flat Class. Which is where I punted... > However there's a larger issue here, which is the very identity of an > outer join :-(. Currently, for the first query above, the left join > is defined as being between a and c, with a being the minimum > left-hand-side needed to form the join. To be able to handle a case > like this, it seems that the notion of a "minimum left hand side" > falls apart altogether. We can execute the OJ using either a or b > as left hand side. So the current representation of OuterJoinInfo, > and the code that uses it to enforce valid join orders, needs a serious > rethink. Hadn't seen that at all. > Looks like 8.4 development material to me, rather than something we > can hope to back-patch a fix for... Definitely. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers