On Wed, Oct 1, 2014 at 12:01 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2014-09-30 23:25:45 +1300, David Rowley wrote: > > > > I've not quite gotten my head around how we might stop the unneeded > > relation from being the primary path to join the other inner relations, > > i.e. what would stop the planner making a plan that hashed all the other > > relations and planned to perform a sequence scan on the relation that we > > have no need to scan (because the foreign key tells us the join is > > pointless). If we were not use use that relation then we'd just have a > > bunch of hash tables with no path to join them up. If we did anything to > > force the planner into creating a plan that would suit skipping > relations, > > then we could possibly be throwing away the optimal plan..... Right? > > I'm not 100% sure I understand your problem description, but let me > describe how I think this would work. During planning, you'd emit the > exactly same plan as you'd today, with two exceptions: > a) When costing a node where one side of a join is very likely to be > removable, you'd cost it nearly as if there wasn't a join. > Ok given the tables: create table t1 (x int primary key); create table t2 (y int primary key); suppose the planner came up with something like: test=# explain (costs off) select t2.* from t1 inner join t2 on t1.x=t2.y; QUERY PLAN ---------------------------- Hash Join Hash Cond: (t1.x = t2.y) -> Seq Scan on t1 -> Hash -> Seq Scan on t2 If we had a foreign key... alter table t2 add constraint t2_y_fkey foreign key (y) references t1 (x); ...the join to t1 could possibly be "ignored" by the executor... but there's a problem as the plan states we're going to seqscan then hash that relation, then seqscan t1 with a hash lookup on each of t1's rows. In this case how would the executor skip the scan on t1? I can see how this might work if it was t2 that we were removing, as we'd just skip the hash lookup part in the hash join node. > b) The planner would attach some sort of 'one time join qual' to the > 'likely removable' join nodes. If, during executor init, that qual > returns false, simply don't perform the join. Just check the inner > relation, but entirely skip the outer relation. > > I think in the example that I've given above that the relation we'd want to keep is the outer relation, we'd want to throw away the inner one, but I can't quite see how that would work. Hopefully that makes sense. Regards David Rowley