On Fri, Aug 16, 2019 at 10:25 PM Etsuro Fujita <etsuro.fuj...@gmail.com> wrote: > It seems that I performed the above tests on an assertion-enabled > build. :( So I executed the tests one more time. Here are the > results. > > * 2-way self-join of pt: explain analyze select * from pt t0, pt t1 > where t0.a = t1.a; > - HEAD: > Planning Time: 0.969 ms > Execution Time: 13.843 ms > - with patch: > Planning Time: 1.720 ms > Execution Time: 14.393 ms > - with patch plus attached: > Planning Time: 1.630 ms > Execution Time: 14.002 ms > > * 4-way self-join of pt: explain analyze select * from pt t0, pt t1, > pt t2, pt t3 where t0.a = t1.a > > and t1.a = t2.a and t2.a = t3.a; > - HEAD: > Planning Time: 12.203 ms > Execution Time: 31.784 ms > - with patch: > Planning Time: 32.102 ms > Execution Time: 32.504 ms > - with patch plus attached: > Planning Time: 19.471 ms > Execution Time: 32.582 ms > > * 8-way self-join of pt: explain analyze select * from pt t0, pt t1, > pt t2, pt t3, pt t4, pt t5, pt t6, pt t7 where t0.a = t1.a and t1.a = > t2.a and t2.a = t3.a and t3.a = t4.a and t4.a = t5.a and t5.a = t6.a > and t6.a = t7.a; > - HEAD: > Planning Time: 948.131 ms > Execution Time: 55.645 ms > - with patch: > Planning Time: 2939.813 ms > Execution Time: 56.760 ms > - with patch plus attached: > Planning Time: 1108.076 ms > Execution Time: 55.750 ms > > Note: the attached patch still uses the proposed partition matching > algorithm for these queries. As I said before, these queries don't > need that algorithm, so we could eliminate the planning overhead > compared to HEAD, by planning these queries as before, perhaps, but I > haven't modified the patch as such yet.
I modified the patch further as such. Attached is an updated version of the patch created on top of the patch in [1]. I did the tests again using the updated version of the patch. Here are the results: * 2-way self-join of pt: Planning Time: 1.043 ms Execution Time: 13.931 ms * 4-way self-join of pt: Planning Time: 12.499 ms Execution Time: 32.392 ms * 8-way self-join of pt: Planning Time: 968.412 ms Execution Time: 56.328 ms The planning time for each test case still increased slightly, but IMO I think that would be acceptable. To see the efficiency of the attached, I did another testing with test cases that really need the new partition-matching algorithm: * explain analyze select * from pt6 t6, pt7 t7 where t6.a = t7.a; - base patch in [1] Planning Time: 1.758 ms Execution Time: 13.977 ms - with attached Planning Time: 1.777 ms Execution Time: 13.959 ms * explain analyze select * from pt4 t4, pt5 t5, pt6 t6, pt7 t7 where t4.a = t5.a and t5.a = t6.a and t6.a = t7.a; - base patch in [1] Planning Time: 33.201 ms Execution Time: 32.480 ms - with attached Planning Time: 21.019 ms Execution Time: 32.777 ms * explain analyze select * from pt0 t0, pt1 t1, pt2 t2, pt3 t3, pt4 t4, pt5 t5, pt6 t6, pt7 t7 where t0.a = t1.a and t1.a = t2.a and t2.a = t3.a and t3.a = t4.a and t4.a = t5.a and t5.a = t6.a and t6.a = t7.a; - base patch in [1] Planning Time: 3060.000 ms Execution Time: 55.553 ms - with attached Planning Time: 1144.996 ms Execution Time: 56.233 ms where pt0, pt1, pt2, pt3, pt4, pt5, pt6 and pt7 are list partitioned tables that have slighly different list values. (The structure and list values of ptN are almost the same as that of pt used above, but ptN's N-th partition ptNpN has an extra list value that pt's N-th partition ptpN doesn't have.) If anyone is interested in this testing, I'll send a script file for producing these list partitioned tables. About the attached: * The attached patch modified try_partitionwise_join() so that we call partition_bounds_equal() then partition_bounds_merge() (if necessary) to create the partition bounds for the join rel. We don't support for merging the partition bounds for the hash-partitioning case, so this makes code to handle the hash-partitioning case in partition_bounds_merge() completely unnecessary, so I removed that entirely. * I removed this assertion in partition_bounds_merge(), because I think this is covered by two assertions above this. + Assert((*outer_parts == NIL || *inner_parts != NIL) && + (*inner_parts == NIL || *outer_parts != NIL)); * (I forgot to mention this in a previous email, but) I removed this bit of generate_matching_part_pairs(), because we already have the same check in try_partitionwise_join(), so this bit would be redundant IIUC. + switch (jointype) + { + case JOIN_INNER: + case JOIN_SEMI: + + /* + * An inner or semi join can not return any row when the + * matching partition on either side is missing. We should + * have eliminated all such cases while merging the bounds. + */ + Assert(part1 >= 0 && part2 >= 0); + break; + + case JOIN_LEFT: + case JOIN_ANTI: + Assert(part1 >= 0); + if (part2 < 0) + merged = false; + break; + + case JOIN_FULL: + if (part1 < 0 || part2 < 0) + merged = false; + break; + + default: + elog(ERROR, "unrecognized join type: %d", (int) jointype); + } * I added more comments. If there are no objections, I'll merge the attached with the base patch in [1]. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/CAPmGK177W%2BjNgpM5f_m-vdDKbEBu_%3D3CyPzFjkT_1nzf1Vqn%2BA%40mail.gmail.com
Modify-partition-matching-algorithm-1.patch
Description: Binary data