Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

yh z Tue, 10 Jan 2023 00:26:19 -0800

Hi all,

Thanks for yours reply.


After receiving your comments and making targeted modifications. The
conclusion is that the option
"table.optimizer.bushy-join-reorder-threshold" can be added. Relevant PR
[1] has been submitted. Sincerely welcome to review it. Thank you.

This discussion will be closed soon. Thanks for your comments.

[1] https://github.com/apache/flink/pull/21530

Best regards,
Yunhong Zheng

godfrey he <godfre...@gmail.com> 于2023年1月9日周一 10:26写道：

> Hi Yunhong,
>
> Thanks for driving this discuss!
>
> This option looks good to me,
> and looking forward to contributing this rule back to Apache Calcite.
>
> Best,
> Godfrey
>
>
>
> yh z <zhengyunhon...@gmail.com> 于2023年1月5日周四 15:32写道：
> >
> > Hi Benchao,
> >
> > Thanks for your reply.
> >
> > Since our existing test results are based on multiple performance
> > optimization points on the TPC-DS benchmark[1][2], we haven't separately
> > tested the performance improvement brought by new bushy join reorder
> > rule. I will complete this test recently and update the results to this
> > email.
> >
> > I am very happy to contribute to Calcite. Later, I will push the PR of
> the
> > bushy join reorder rule to Calcite.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-27583
> > [2] https://issues.apache.org/jira/browse/FLINK-29942
> >
> > Best regards,
> > Yunhong Zheng
> >
> > Benchao Li <libenc...@apache.org> 于2023年1月4日周三 19:03写道：
> >
> > > Hi Yunhong,
> > >
> > > Thanks for the updating. And introducing the new bushy join reorder
> > > algorithm would be great. And I also agree with the newly added config
> > > option "table.optimizer.bushy-join-reorder-threshold" and 12 as the
> default
> > > value.
> > >
> > >
> > > > As for optimization
> > > > latency, this is the problem to be solved by the parameters to be
> > > > introduced in this discussion. When there are many tables need to be
> > > > reordered, the optimization latency will increase greatly. But when
> the
> > > > table numbers less than the threshold, the latency is the same as the
> > > > LoptOptimizeJoinRule.
> > >
> > >
> > > This sounds great. If possible, could you share more numbers to us?
> E.g.,
> > > what's the latency of optimization when there are 11/12 tables for both
> > > approach?
> > >
> > >  For question #3: The implementation of Calcite
> MultiJoinOptimizeBushyRule
> > > > is very simple, and it will not store the intermediate results at
> all.
> > > So,
> > > > the implementation of Calcite cannot get all possible join reorder
> > > results
> > > > and it cannot combine with the current cost model to get more
> reasonable
> > > > join reorder results.
> > >
> > >
> > > It's ok to do it in Flink as the first step. It would be great to also
> > > contribute it to Calcite later if possible, it depends on you.
> > >
> > > yh z <zhengyunhon...@gmail.com> 于2023年1月3日周二 15:27写道：
> > >
> > > > Hi Benchao,
> > > >
> > > > Thanks for your reply.
> > > >
> > > > Actually,  I mistakenly wrote the name "bushy join reorder" to "busy
> join
> > > > reorder". I'm sorry for the trouble bring to you. "Bushy join
> reorder"
> > > > means we can build a bushy join tree based on cost model, but now
> Flink
> > > can
> > > > only build a left-deep tree using Calcite LoptOptimizeJoinRule. I
> hope my
> > > > answers can help you solve the following questions:
> > > >
> > > > For question #1: The biggest advantage of this "bushy join reorder"
> > > > strategy over the default Flink left-deep tree strategy is that it
> can
> > > > retail all possible join reorder plans, and then select the optimal
> plan
> > > > according to the cost model. This means that the busy join reorder
> > > strategy
> > > > can be better combined with the current cost model to get more
> reasonable
> > > > join reorder results. We verified it on the TPC-DS benchmark, with
> the
> > > > spark plan as a reference, the new busy join reorder strategy can
> make
> > > more
> > > > TPC-DS query plans be adjusted to be consistent with the Spark plan,
> and
> > > > the execution time is signifcantly reduced.  As for optimization
> > > > latency, this is the problem to be solved by the parameters to be
> > > > introduced in this discussion. When there are many tables need to be
> > > > reordered, the optimization latency will increase greatly. But when
> the
> > > > table numbers less than the threshold, the latency is the same as the
> > > > LoptOptimizeJoinRule.
> > > >
> > > > For question #2: According to my research, many compute or database
> > > systems
> > > > have the "bushy join reorder" strategies based on dynamic
> programming.
> > > For
> > > > example, Spark and PostgresSql use the same strategy, and the
> threshold
> > > be
> > > > set to 12. Also, some papers, like [1] and [2], have also researched
> this
> > > > strategy, and [2] set the threshold to 14.
> > > >
> > > > For question #3: The implementation of Calcite
> MultiJoinOptimizeBushyRule
> > > > is very simple, and it will not store the intermediate results at
> all.
> > > So,
> > > > the implementation of Calcite cannot get all possible join reorder
> > > results
> > > > and it cannot combine with the current cost model to get more
> reasonable
> > > > join reorder results.
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > > [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
> > > >
> > > >
> > > >
> > > > Benchao Li <libenc...@apache.org> 于2023年1月3日周二 12:54写道：
> > > >
> > > > > Hi Yunhong,
> > > > >
> > > > > Thanks for driving this~
> > > > >
> > > > > I haven't gone deep into the implementation details yet. Regarding
> the
> > > > > general description, I would ask a few questions firstly:
> > > > >
> > > > > #1, Is there any benchmark results about the optimization latency
> > > change
> > > > > compared to current approach? In OLAP scenario, query optimization
> > > > latency
> > > > > is more crucial.
> > > > >
> > > > > #2, About the term "busy join reorder", is there any others systems
> > > which
> > > > > also use this term? I know Calcite has a rule[1] which uses the
> term
> > > > "bushy
> > > > > join".
> > > > >
> > > > > #3, About the implementation, if this does the same work as Calcite
> > > > > MultiJoinOptimizeBushyRule, is it possible to use the Calcite
> version
> > > > > directly or extend it in some way?
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> > > > >
> > > > > yh z <zhengyunhon...@gmail.com> 于2022年12月29日周四 14:44写道：
> > > > >
> > > > > > Hi, devs,
> > > > > >
> > > > > > I'd like to start a discuss about adding an option called
> > > > > > "table.oprimizer.busy-join-reorder-threshold" for planner rule
> while
> > > we
> > > > > try
> > > > > > to introduce a new busy join reorder rule[1] into Flink.
> > > > > >
> > > > > > This join reorder rule is based on dynamic programing[2], which
> can
> > > > store
> > > > > > all possible intermediate results, and the cost model can be
> used to
> > > > > select
> > > > > > the optimal join reorder result. Compare with the existing Lopt
> join
> > > > > > reorder rule, the new rule can give more possible results and the
> > > > result
> > > > > > can be more accurate. However, the search space of this rule will
> > > > become
> > > > > > very large as the number of tables increases. So we should
> introduce
> > > an
> > > > > > option to limit the expansion of search space, if the number of
> table
> > > > can
> > > > > > be reordered less than the threshold, the new busy join reorder
> rule
> > > is
> > > > > > used. On the contrary, the Lopt rule is used.
> > > > > >
> > > > > > The default threshold intended to be set to 12. One reason is
> that in
> > > > the
> > > > > > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > > > > > optimization time will be very long. The other reason is that it
> > > refers
> > > > > to
> > > > > > relevant engines, like Spark, whose recommended setting is 12.[3]
> > > > > >
> > > > > > Looking forward to your feedback.
> > > > > >
> > > > > > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > > > > > [2]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > > > > [3]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> > > > > >
> > > > > > Best regards,
> > > > > > Yunhong Zheng
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best,
> > > > > Benchao Li
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Reply via email to