On Tue, Aug 27, 2019 at 8:51 AM Amit Langote <amitlangot...@gmail.com> wrote:
> Hi Richard, > > On Mon, Aug 26, 2019 at 6:33 PM Richard Guo <ri...@pivotal.io> wrote: > > > > Hi All, > > > > To generate partitionwise join, we need to make sure there exists an > > equi-join condition for each pair of partition keys, which is performed > > by have_partkey_equi_join(). This makes sense and works well. > > > > But if, let's say, one certain pair of partition keys (foo.k = bar.k) > > has formed an equivalence class containing consts, no join clause would > > be generated for it, since we have already generated 'foo.k = const' and > > 'bar.k = const' and pushed them into the proper restrictions earlier. > > > > This will make partitionwise join fail to be planned if there are > > multiple partition keys and the pushed-down restrictions 'xxx = const' > > fail to prune away any partitions. > > > > Consider the examples below: > > > > create table p (k1 int, k2 int, val int) partition by range(k1,k2); > > create table p_1 partition of p for values from (1,1) to (10,100); > > create table p_2 partition of p for values from (10,100) to (20,200); > > > > If we are joining on each pair of partition keys, we can generate > > partitionwise join: > > > > # explain (costs off) > > select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = > bar.k2; > > QUERY PLAN > > ---------------------------------------------------------------------- > > Append > > -> Hash Join > > Hash Cond: ((foo.k1 = bar.k1) AND (foo.k2 = bar.k2)) > > -> Seq Scan on p_1 foo > > -> Hash > > -> Seq Scan on p_1 bar > > -> Hash Join > > Hash Cond: ((foo_1.k1 = bar_1.k1) AND (foo_1.k2 = bar_1.k2)) > > -> Seq Scan on p_2 foo_1 > > -> Hash > > -> Seq Scan on p_2 bar_1 > > (11 rows) > > > > But if we add another qual 'foo.k2 = const', we will be unable to > > generate partitionwise join any more, because have_partkey_equi_join() > > thinks not every partition key has an equi-join condition. > > > > # explain (costs off) > > select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = > bar.k2 and foo.k2 = 16; > > QUERY PLAN > > ----------------------------------------- > > Hash Join > > Hash Cond: (foo.k1 = bar.k1) > > -> Append > > -> Seq Scan on p_1 foo > > Filter: (k2 = 16) > > -> Seq Scan on p_2 foo_1 > > Filter: (k2 = 16) > > -> Hash > > -> Append > > -> Seq Scan on p_1 bar > > Filter: (k2 = 16) > > -> Seq Scan on p_2 bar_1 > > Filter: (k2 = 16) > > (13 rows) > > > > Is this a problem? > > Perhaps. Maybe it has to do with the way have_partkey_equi_join() has > been coded. If it was coded such that it figured out on its own that > the equivalence (foo.k2, bar.k2, ...) does exist, then that would > allow partitionwise join to occur, which I think would be OK to do. > But maybe I'm missing something. > > This should be caused by how we deduce join clauses from equivalence classes. ECs containing consts will not be considered so we cannot generate (foo.k2 = bar.k2) for the query above. In addition, when generating join clauses from equivalence classes, we only select the joinclause with the 'best score', or the first joinclause with a score of 3. This may make us miss some joinclause on partition keys. Check the query below as a more illustrative example: create table p (k int, val int) partition by range(k); create table p_1 partition of p for values from (1) to (10); create table p_2 partition of p for values from (10) to (100); If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate partitionwise join: # explain (costs off) select * from p as foo join p as bar on foo.k = bar.k and foo.k = bar.val; QUERY PLAN ----------------------------------------- Append -> Hash Join Hash Cond: (foo.k = bar.k) -> Seq Scan on p_1 foo -> Hash -> Seq Scan on p_1 bar Filter: (k = val) -> Hash Join Hash Cond: (foo_1.k = bar_1.k) -> Seq Scan on p_2 foo_1 -> Hash -> Seq Scan on p_2 bar_1 Filter: (k = val) (13 rows) But if we exchange the order of the two quals to 'foo.k = bar.val and foo.k = bar.k', then partitionwise join cannot be generated any more, because we only have joinclause 'foo.k = bar.val' as it first reached score of 3. We have missed the joinclause on the partition key although it does exist. # explain (costs off) select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k; QUERY PLAN ----------------------------------------- Hash Join Hash Cond: (foo.k = bar.val) -> Append -> Seq Scan on p_1 foo -> Seq Scan on p_2 foo_1 -> Hash -> Append -> Seq Scan on p_1 bar Filter: (val = k) -> Seq Scan on p_2 bar_1 Filter: (val = k) (11 rows) Thanks Richard