Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote: > I originally thought to provide it along-with the changes to > expand_inherited_rtentry(), but that thread is taking longer. Jeevan > Chalke needs rebased patches for his work on aggregate pushdown and > Thomas might need them for further review. So, here they are.
Since I have related patch in the current commitfest (https://commitfest.postgresql.org/14/1247/), I spent some time reviewing your patch: * generate_partition_wise_join_paths() Right parenthesis is missing in the prologue. * get_partitioned_child_rels_for_join() I think the Assert() statement is easier to understand inside the loop, see the assert.diff attachment. * have_partkey_equi_join() As the function handles generic join, this comment doesn't seem to me relevant: /* * The equi-join between partition keys is strict if equi-join between * at least one partition key is using a strict operator. See * explanation about outer join reordering identity 3 in * optimizer/README */ strict_op = op_strict(opexpr->opno); And I think the function can return true even if strict_op is false for all the operators evaluated in the loop. * match_expr_to_partition_keys() I'm not sure this comment is clear enough: /* * If it's a strict equi-join a NULL partition key on one side will * not join a NULL partition key on the other side. So, rows with NULL * partition key from a partition on one side can not join with those * from a non-matching partition on the other side. So, search the * nullable partition keys as well. */ if (!strict_op) continue; My understanding of the problem of NULL values generated by outer join is: these NULL values --- if evaluated by non-strict expression --- can make row of N-th partition on one side of the join match row(s) of *other than* N-th partition(s) on the other side. Thus the nullable input expressions may only be evaluated by strict operators. I think it'd be clearer if you stressed that (undesired) *match* of partition keys can be a problem, rather than mismatch. If you insist on your wording, then I think you should at least move the comment below to the part that only deals with strict operators. * There are several places where lfirst_node() macro should be used. For example rel = lfirst_node(RelOptInfo, lc); instead of rel = (RelOptInfo *) lfirst(lc); * map_and_merge_partitions() Besides a few changes proposed in map_and_merge_partitions.diff (a few of them to suppress compiler warnings) I think that this part needs more thought: { Assert(mergemap1[index1] != mergemap2[index2] && mergemap1[index1] >= 0 && mergemap2[index2] >= 0); /* * Both the partitions map to different merged partitions. This * means that multiple partitions from one relation matches to one * partition from the other relation. Partition-wise join does not * handle this case right now, since it requires ganging multiple * partitions together (into one RelOptInfo). */ merged_index = -1; } I could hit this path with the following test: CREATE TABLE a(i int) PARTITION BY LIST(i); CREATE TABLE a_0 PARTITION OF a FOR VALUES IN (0, 2); CREATE TABLE b(j int) PARTITION BY LIST(j); CREATE TABLE b_0 PARTITION OF b FOR VALUES IN (1, 2); SET enable_partition_wise_join TO on; SELECT * FROM a FULL JOIN b ON i = j; I don't think there's a reason not to join a_0 partition to b_0, is there? -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c new file mode 100644 index a75b1a3..3094b56 *** a/src/backend/optimizer/plan/planner.c --- b/src/backend/optimizer/plan/planner.c *************** get_partitioned_child_rels_for_join(Plan *** 6160,6170 **** PartitionedChildRelInfo *pc = lfirst(l); if (bms_is_member(pc->parent_relid, join_relids)) result = list_concat(result, list_copy(pc->child_rels)); } - /* The root partitioned table is included as a child rel */ - Assert(list_length(result) >= bms_num_members(join_relids)); - return result; } --- 6160,6172 ---- PartitionedChildRelInfo *pc = lfirst(l); if (bms_is_member(pc->parent_relid, join_relids)) + { + /* The root partitioned table is included as a child rel */ + Assert(list_length(pc->child_rels) >= 1); + result = list_concat(result, list_copy(pc->child_rels)); + } } return result; }
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c new file mode 100644 index eb35fab..aa9c70d *** a/src/backend/catalog/partition.c --- b/src/backend/catalog/partition.c *************** partition_list_bounds_merge(int partnatt *** 3110,3116 **** --- 3110,3119 ---- */ if (jointype == JOIN_INNER || jointype == JOIN_RIGHT || jointype == JOIN_SEMI) + { merged_index = -1; + merged_datum = NULL; + } else if (jointype == JOIN_LEFT || jointype == JOIN_FULL || jointype == JOIN_ANTI) { *************** partition_list_bounds_merge(int partnatt *** 3140,3146 **** --- 3143,3152 ---- */ if (jointype == JOIN_INNER || jointype == JOIN_LEFT || jointype == JOIN_SEMI || jointype == JOIN_ANTI) + { merged_index = -1; + merged_datum = NULL; + } else if (jointype == JOIN_RIGHT || jointype == JOIN_FULL) { /* *************** map_and_merge_partitions(int *partmap1, *** 3334,3346 **** /* * If the given to partitions map to each other, find the corresponding ! * merged partition index . */ if (partmap1[index1] == index2 && partmap2[index2] == index1) { /* ! * If both the partitions are mapped to the same merged partition, get ! * the index of merged partition. */ if (mergemap1[index1] == mergemap2[index2]) { --- 3340,3352 ---- /* * If the given to partitions map to each other, find the corresponding ! * merged partition index. */ if (partmap1[index1] == index2 && partmap2[index2] == index1) { /* ! * If both the partitions are mapped to the same merged partition, or ! * neither maps at all yet, get the index of merged partition. */ if (mergemap1[index1] == mergemap2[index2]) { *************** map_and_merge_partitions(int *partmap1, *** 3352,3359 **** */ if (merged_index < 0) { ! merged_index = *next_index; ! *next_index = *next_index + 1; mergemap1[index1] = merged_index; mergemap2[index2] = merged_index; } --- 3358,3364 ---- */ if (merged_index < 0) { ! merged_index = (*next_index)++; mergemap1[index1] = merged_index; mergemap2[index2] = merged_index; } *************** map_and_merge_partitions(int *partmap1, *** 3366,3380 **** * partitions map to the same merged partition. */ else if (mergemap1[index1] >= 0 && mergemap2[index2] < 0) ! { ! mergemap2[index2] = mergemap1[index1]; ! merged_index = mergemap1[index1]; ! } else if (mergemap1[index1] < 0 && mergemap2[index2] >= 0) ! { ! mergemap1[index1] = mergemap2[index2]; ! merged_index = mergemap2[index2]; ! } else { Assert(mergemap1[index1] != mergemap2[index2] && --- 3371,3379 ---- * partitions map to the same merged partition. */ else if (mergemap1[index1] >= 0 && mergemap2[index2] < 0) ! merged_index = mergemap2[index2] = mergemap1[index1]; else if (mergemap1[index1] < 0 && mergemap2[index2] >= 0) ! merged_index = mergemap1[index1] = mergemap2[index2]; else { Assert(mergemap1[index1] != mergemap2[index2] &&
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers