Re: [HACKERS] advanced partition matching algorithm for partition-wise join

Etsuro Fujita Wed, 17 Jul 2019 23:56:11 -0700

On Mon, Jul 8, 2019 at 8:33 PM Etsuro Fujita <[email protected]> wrote:
> I'll review the remaining parts (ie,
> "0002-Partition-wise-join-for-1-1-1-0-0-1-partition-matchi-v22.patch"
> and "0003-Tests-for-0-1-1-1-and-1-0-partition-matching-v22.patch")
> closely next.


I've been reviewing the main patch
"0002-Partition-wise-join-for-1-1-1-0-0-1-partition-matchi-v22.patch",
but it's pretty large and complicated, so I'm still at the first pass,
and so maybe I'm missing something, but let me comment a few things on
the patch.  First of all, I think the patch's problem setting, which
is stated in the commit message, is reasonable:

    Earlier version of partition-wise join implementation allowed
    partition-wise join between two relations with exactly same partition
    bounds. This commit allows partition-wise join to be applied under
    following conditions

    1. the partition bounds of joining relations are such that rows from
    given partition on one side join can join with rows from maximum one
    partition on the other side i.e. bounds of a given partition on one
    side match/overlap with those of maximum one partition on the other
    side.

And I agree with the patch's approach to this that tries to find such
a partition mapping between two input partitioned relations for a join
relation, by trying to match their partition bounds, which is
implemented in a new function partition_bounds_merge(), in general.
But one thing that I'm concerned about most at this point is this in
try_partitionwise_join():

    /*
-    * Since we allow partitionwise join only when the partition bounds of the
-    * joining relations exactly match, the partition bounds of the join
-    * should match those of the joining relations.
+    * Get the list of matching partitions to be joined along with the
+    * partition bounds of the join relation. Because of the restrictions
+    * imposed by partition matching algorithm, not every pair of joining
+    * relations for this join will be able to use partition-wise join. But all
+    * those pairs which can use partition-wise join will produce the same
+    * partition bounds for the join relation.
     */
-   Assert(partition_bounds_equal(joinrel->part_scheme->partnatts,
-                                 joinrel->part_scheme->parttyplen,
-                                 joinrel->part_scheme->parttypbyval,
-                                 joinrel->boundinfo, rel1->boundinfo));
-   Assert(partition_bounds_equal(joinrel->part_scheme->partnatts,
-                                 joinrel->part_scheme->parttyplen,
-                                 joinrel->part_scheme->parttypbyval,
-                                 joinrel->boundinfo, rel2->boundinfo));
+   join_boundinfo = partition_bounds_merge(part_scheme->partnatts,
+                                           part_scheme->partsupfunc,
+                                           part_scheme->partcollation,
+                                           rel1, rel2,
+                                           parent_sjinfo->jointype,
+                                           &parts1, &parts2);
+
+   if (join_boundinfo == NULL)
+       return;

I.e., partition_bounds_merge() is performed for each pair of input
partitioned relations for a join relation in try_partitionwise_join().
Since partition_bounds_merge() would need a lot of CPU cycles, I don't
think this is acceptable; ISTM that some redesign is needed to avoid
this.  I'm wondering that once we successfully merged partition bounds
from a pair of input partitioned relations for the join relation, by
using the merged partition bounds, we could get the lists of matching
to-be-joined partitions for subsequent pairs of input partitioned
relations for the join relation in a more efficient way than by
performing partition_bounds_merge() as proposed in the patch.

Best regards,
Etsuro Fujita

Re: [HACKERS] advanced partition matching algorithm for partition-wise join

Reply via email to