Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

yh z Wed, 04 Jan 2023 23:32:15 -0800

Hi Benchao,

Thanks for your reply.


Since our existing test results are based on multiple performance
optimization points on the TPC-DS benchmark[1][2], we haven't separately
tested the performance improvement brought by new bushy join reorder
rule. I will complete this test recently and update the results to this
email.

I am very happy to contribute to Calcite. Later, I will push the PR of the
bushy join reorder rule to Calcite.

[1] https://issues.apache.org/jira/browse/FLINK-27583
[2] https://issues.apache.org/jira/browse/FLINK-29942

Best regards,
Yunhong Zheng

Benchao Li <[email protected]> 于2023年1月4日周三 19:03写道：

> Hi Yunhong,
>
> Thanks for the updating. And introducing the new bushy join reorder
> algorithm would be great. And I also agree with the newly added config
> option "table.optimizer.bushy-join-reorder-threshold" and 12 as the default
> value.
>
>
> > As for optimization
> > latency, this is the problem to be solved by the parameters to be
> > introduced in this discussion. When there are many tables need to be
> > reordered, the optimization latency will increase greatly. But when the
> > table numbers less than the threshold, the latency is the same as the
> > LoptOptimizeJoinRule.
>
>
> This sounds great. If possible, could you share more numbers to us? E.g.,
> what's the latency of optimization when there are 11/12 tables for both
> approach?
>
>  For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> > is very simple, and it will not store the intermediate results at all.
> So,
> > the implementation of Calcite cannot get all possible join reorder
> results
> > and it cannot combine with the current cost model to get more reasonable
> > join reorder results.
>
>
> It's ok to do it in Flink as the first step. It would be great to also
> contribute it to Calcite later if possible, it depends on you.
>
> yh z <[email protected]> 于2023年1月3日周二 15:27写道：
>
> > Hi Benchao,
> >
> > Thanks for your reply.
> >
> > Actually,  I mistakenly wrote the name "bushy join reorder" to "busy join
> > reorder". I'm sorry for the trouble bring to you. "Bushy join reorder"
> > means we can build a bushy join tree based on cost model, but now Flink
> can
> > only build a left-deep tree using Calcite LoptOptimizeJoinRule. I hope my
> > answers can help you solve the following questions:
> >
> > For question #1: The biggest advantage of this "bushy join reorder"
> > strategy over the default Flink left-deep tree strategy is that it can
> > retail all possible join reorder plans, and then select the optimal plan
> > according to the cost model. This means that the busy join reorder
> strategy
> > can be better combined with the current cost model to get more reasonable
> > join reorder results. We verified it on the TPC-DS benchmark, with the
> > spark plan as a reference, the new busy join reorder strategy can make
> more
> > TPC-DS query plans be adjusted to be consistent with the Spark plan, and
> > the execution time is signifcantly reduced.  As for optimization
> > latency, this is the problem to be solved by the parameters to be
> > introduced in this discussion. When there are many tables need to be
> > reordered, the optimization latency will increase greatly. But when the
> > table numbers less than the threshold, the latency is the same as the
> > LoptOptimizeJoinRule.
> >
> > For question #2: According to my research, many compute or database
> systems
> > have the "bushy join reorder" strategies based on dynamic programming.
> For
> > example, Spark and PostgresSql use the same strategy, and the threshold
> be
> > set to 12. Also, some papers, like [1] and [2], have also researched this
> > strategy, and [2] set the threshold to 14.
> >
> > For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule
> > is very simple, and it will not store the intermediate results at all.
> So,
> > the implementation of Calcite cannot get all possible join reorder
> results
> > and it cannot combine with the current cost model to get more reasonable
> > join reorder results.
> >
> >
> > [1]
> >
> >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
> >
> >
> >
> > Benchao Li <[email protected]> 于2023年1月3日周二 12:54写道：
> >
> > > Hi Yunhong,
> > >
> > > Thanks for driving this~
> > >
> > > I haven't gone deep into the implementation details yet. Regarding the
> > > general description, I would ask a few questions firstly:
> > >
> > > #1, Is there any benchmark results about the optimization latency
> change
> > > compared to current approach? In OLAP scenario, query optimization
> > latency
> > > is more crucial.
> > >
> > > #2, About the term "busy join reorder", is there any others systems
> which
> > > also use this term? I know Calcite has a rule[1] which uses the term
> > "bushy
> > > join".
> > >
> > > #3, About the implementation, if this does the same work as Calcite
> > > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version
> > > directly or extend it in some way?
> > >
> > > [1]
> > >
> > >
> >
> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
> > >
> > > yh z <[email protected]> 于2022年12月29日周四 14:44写道：
> > >
> > > > Hi, devs,
> > > >
> > > > I'd like to start a discuss about adding an option called
> > > > "table.oprimizer.busy-join-reorder-threshold" for planner rule while
> we
> > > try
> > > > to introduce a new busy join reorder rule[1] into Flink.
> > > >
> > > > This join reorder rule is based on dynamic programing[2], which can
> > store
> > > > all possible intermediate results, and the cost model can be used to
> > > select
> > > > the optimal join reorder result. Compare with the existing Lopt join
> > > > reorder rule, the new rule can give more possible results and the
> > result
> > > > can be more accurate. However, the search space of this rule will
> > become
> > > > very large as the number of tables increases. So we should introduce
> an
> > > > option to limit the expansion of search space, if the number of table
> > can
> > > > be reordered less than the threshold, the new busy join reorder rule
> is
> > > > used. On the contrary, the Lopt rule is used.
> > > >
> > > > The default threshold intended to be set to 12. One reason is that in
> > the
> > > > tpc-ds benchmark test, when the number of tables exceeds 12, the
> > > > optimization time will be very long. The other reason is that it
> refers
> > > to
> > > > relevant engines, like Spark, whose recommended setting is 12.[3]
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > [1]  https://issues.apache.org/jira/browse/FLINK-30376
> > > > [2]
> > > >
> > > >
> > >
> >
> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
> > > > [3]
> > > >
> > > >
> > >
> >
> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
> > > >
> > > > Best regards,
> > > > Yunhong Zheng
> > > >
> > >
> > >
> > > --
> > >
> > > Best,
> > > Benchao Li
> > >
> >
>
>
> --
>
> Best,
> Benchao Li
>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Reply via email to