On Tue, Sep 22, 2009 at 11:21 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: >> Currently, however, we only consider this possibility when the inner >> rel is NOT a joinrel. It seems like it might be possible to change >> this, but it doesn't look straightforward. > > Well, it's straightforward enough in theory, it's just the exponential > growth in the number of join paths to consider that's a problem :-(. > I think what we'd need is a heuristic to limit the paths considered.
[ picking this up again now that CommitFest is over ] So that I don't have to keep referring to "this possibility" and similar circumlocution, let's define an index-accelerated path (feel free to suggest a different term) to be the generalization to joinrels of what we now call a nested inner index-scan. In other words, they can only be usefully used on the inner side of a nested loop, where they'll be rescanned with different parameters for each outer row, and they can only ever win when some part of the path involves a *partial* index scan that can used one of the passed parameters as an index qual. For a fixed a set of parameters to be pushed down from the outer side, it seems to me that we could build up a set of index-accelerated paths for a joinrel {A B} by considering the combination of (1) an index-accelerated path for A joined to an index-accelerated path for B, or (2) an index-accelerated path for A joined to the cheapest total path for B. I believe we don't need to worry about path keys because this will only ever be used on the inner side of a nested loop, so the pathkeys will be ignored anyway. In other words, the first heuristic is that you can't build up an index-accelerated path for the joinrel unless there is an index-accelerated path for a least one of its baserels. That still leaves a lot of silly paths, though. In many cases, if you're thinking about joining A to {B C} using an index-accelerated path, you'd be just as well off joining to B first and then to C. So it might be that we only need to consider index-accelerated paths when there is no legal join between the LHS and a subset of the RHS. But I'm not completely sure that I'm right about this, nor whether it's easy to check for. The other problem I see here is that the bottom-up approach that we use in general is going to be difficult to apply here, because the set of paths will vary depending on what parameters are pushed down from the outer side. > I think Andrew was suggesting that we not attempt to consider this > automatically, but only when the user writes the query in a way that > exposes it directly via LATERAL. While that goes against my normal > instincts for the planner, it isn't unreasonable as a first step. Possibly, but I'm not sure we have a good enough design sketch at this point to even know whether such a restriction would be useful. Another thought, only semi-related. One of the big use cases for LATERAL in general is to use a set-returning function in the FROM clause that uses vars from a preceding FROM item. I am idly wondering if there's a reason why ExecProject is not its own node type. It seems that we have close to the same logic scattered through a whole bunch of different node types... ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers