On Wed, Nov 19, 2014 at 11:49 PM, David Rowley <dgrowle...@gmail.com> wrote:
> On Sun, Nov 16, 2014 at 12:19 PM, David Rowley <dgrowle...@gmail.com> > wrote: > >> On Sun, Nov 16, 2014 at 10:09 AM, Simon Riggs <si...@2ndquadrant.com> >> wrote: >> >>> >>> I propose that we keep track of whether there are any potentially >>> skippable joins at the top of the plan. When we begin execution we do >>> a single if test to see if there is run-time work to do. If we pass >>> the run-time tests we then descend the tree and prune the plan to >>> completely remove unnecessary nodes. We end with an EXPLAIN and >>> EXPLAIN ANALYZE that looks like this >>> >>> > QUERY PLAN >>> > ------------------------------------------------------------------ >>> > Aggregate (actual rows=1 loops=1) >>> > -> Seq Scan on t1 (actual rows=1000000 loops=1) >>> >>> Doing that removes all the overheads and complexity; it also matches >>> how join removal currently works. >>> >>> >> >> I've attached an updated patch which works in this way. All of the skipping code that I had added to the executor's join functions has now been removed. Here's an example output with the plan trimmed, and then untrimmed. set constraints b_c_id_fkey deferred; explain (costs off) select b.* from b inner join c on b.c_id = c.id; QUERY PLAN --------------- Seq Scan on b (1 row) -- add a item to the trigger queue by updating a referenced record. update c set id = 2 where id=1; explain (costs off) select b.* from b inner join c on b.c_id = c.id; QUERY PLAN ------------------------------ Hash Join Hash Cond: (b.c_id = c.id) -> Seq Scan on b -> Hash -> Seq Scan on c (5 rows) A slight quirk with the patch as it stands is that I'm unconditionally NOT removing Sort nodes that sit below a MergeJoin node. The reason for this is that I've not quite figured out a way to determine if the Sort order is required still. An example of this can be seen in the regression tests: -- check merge join nodes are removed properly set enable_hashjoin = off; -- this should remove joins to b and c. explain (costs off) select COUNT(*) from a inner join b on a.b_id = b.id left join c on a.id = c.id; QUERY PLAN --------------------------- Aggregate -> Sort Sort Key: a.b_id -> Seq Scan on a (4 rows) As the patch stands there's still a couple of FIXMEs in there, so there's still a bit of work to do yet. Comments are welcome Regards David Rowley
inner_join_removals_2014-11-24_7cde1e4.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers