Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Jark Wu Tue, 10 Jan 2023 01:42:02 -0800

+1 for the option and the default value. 

Best,
Jark


> 2023年1月10日 16:24，yh z <zhengyunhon...@gmail.com> 写道：
> 
> Hi all,
> 
> Thanks for yours reply.
> 
> After receiving your comments and making targeted modifications. The
> conclusion is that the option
> "table.optimizer.bushy-join-reorder-threshold" can be added. Relevant PR
> [1] has been submitted. Sincerely welcome to review it. Thank you.
> 
> This discussion will be closed soon. Thanks for your comments.
> 
> [1] https://github.com/apache/flink/pull/21530
> 
> Best regards,
> Yunhong Zheng
> 
> godfrey he <godfre...@gmail.com> 于2023年1月9日周一 10:26写道：
> 
>> Hi Yunhong,
>> 
>> Thanks for driving this discuss!
>> 
>> This option looks good to me,
>> and looking forward to contributing this rule back to Apache Calcite.
>> 
>> Best,
>> Godfrey
>> 
>> 
>> 
>> yh z <zhengyunhon...@gmail.com> 于2023年1月5日周四 15:32写道：
>>> 
>>> Hi Benchao,
>>> 
>>> Thanks for your reply.
>>> 
>>> Since our existing test results are based on multiple performance
>>> optimization points on the TPC-DS benchmark[1][2], we haven't separately
>>> tested the performance improvement brought by new bushy join reorder
>>> rule. I will complete this test recently and update the results to this
>>> email.
>>> 
>>> I am very happy to contribute to Calcite. Later, I will push the PR of
>> the
>>> bushy join reorder rule to Calcite.
>>> 
>>> [1] https://issues.apache.org/jira/browse/FLINK-27583
>>> [2] https://issues.apache.org/jira/browse/FLINK-29942
>>> 
>>> Best regards,
>>> Yunhong Zheng
>>> 
>>> Benchao Li <libenc...@apache.org> 于2023年1月4日周三 19:03写道：
>>> 
>>>> Hi Yunhong,
>>>> 
>>>> Thanks for the updating. And introducing the new bushy join reorder
>>>> algorithm would be great. And I also agree with the newly added config
>>>> option "table.optimizer.bushy-join-reorder-threshold" and 12 as the
>> default
>>>> value.
>>>> 
>>>> 
>>>>> As for optimization
>>>>> latency, this is the problem to be solved by the parameters to be
>>>>> introduced in this discussion. When there are many tables need to be
>>>>> reordered, the optimization latency will increase greatly. But when
>> the
>>>>> table numbers less than the threshold, the latency is the same as the
>>>>> LoptOptimizeJoinRule.
>>>> 
>>>> 
>>>> This sounds great. If possible, could you share more numbers to us?
>> E.g.,
>>>> what's the latency of optimization when there are 11/12 tables for both
>>>> approach?
>>>> 
>>>> For question #3: The implementation of Calcite
>> MultiJoinOptimizeBushyRule
>>>>> is very simple, and it will not store the intermediate results at
>> all.
>>>> So,
>>>>> the implementation of Calcite cannot get all possible join reorder
>>>> results
>>>>> and it cannot combine with the current cost model to get more
>> reasonable
>>>>> join reorder results.
>>>> 
>>>> 
>>>> It's ok to do it in Flink as the first step. It would be great to also
>>>> contribute it to Calcite later if possible, it depends on you.
>>>> 
>>>> yh z <zhengyunhon...@gmail.com> 于2023年1月3日周二 15:27写道：
>>>> 
>>>>> Hi Benchao,
>>>>> 
>>>>> Thanks for your reply.
>>>>> 
>>>>> Actually,  I mistakenly wrote the name "bushy join reorder" to "busy
>> join
>>>>> reorder". I'm sorry for the trouble bring to you. "Bushy join
>> reorder"
>>>>> means we can build a bushy join tree based on cost model, but now
>> Flink
>>>> can
>>>>> only build a left-deep tree using Calcite LoptOptimizeJoinRule. I
>> hope my
>>>>> answers can help you solve the following questions:
>>>>> 
>>>>> For question #1: The biggest advantage of this "bushy join reorder"
>>>>> strategy over the default Flink left-deep tree strategy is that it
>> can
>>>>> retail all possible join reorder plans, and then select the optimal
>> plan
>>>>> according to the cost model. This means that the busy join reorder
>>>> strategy
>>>>> can be better combined with the current cost model to get more
>> reasonable
>>>>> join reorder results. We verified it on the TPC-DS benchmark, with
>> the
>>>>> spark plan as a reference, the new busy join reorder strategy can
>> make
>>>> more
>>>>> TPC-DS query plans be adjusted to be consistent with the Spark plan,
>> and
>>>>> the execution time is signifcantly reduced.  As for optimization
>>>>> latency, this is the problem to be solved by the parameters to be
>>>>> introduced in this discussion. When there are many tables need to be
>>>>> reordered, the optimization latency will increase greatly. But when
>> the
>>>>> table numbers less than the threshold, the latency is the same as the
>>>>> LoptOptimizeJoinRule.
>>>>> 
>>>>> For question #2: According to my research, many compute or database
>>>> systems
>>>>> have the "bushy join reorder" strategies based on dynamic
>> programming.
>>>> For
>>>>> example, Spark and PostgresSql use the same strategy, and the
>> threshold
>>>> be
>>>>> set to 12. Also, some papers, like [1] and [2], have also researched
>> this
>>>>> strategy, and [2] set the threshold to 14.
>>>>> 
>>>>> For question #3: The implementation of Calcite
>> MultiJoinOptimizeBushyRule
>>>>> is very simple, and it will not store the intermediate results at
>> all.
>>>> So,
>>>>> the implementation of Calcite cannot get all possible join reorder
>>>> results
>>>>> and it cannot combine with the current cost model to get more
>> reasonable
>>>>> join reorder results.
>>>>> 
>>>>> 
>>>>> [1]
>>>>> 
>>>>> 
>>>> 
>> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
>>>>> [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf
>>>>> 
>>>>> 
>>>>> 
>>>>> Benchao Li <libenc...@apache.org> 于2023年1月3日周二 12:54写道：
>>>>> 
>>>>>> Hi Yunhong,
>>>>>> 
>>>>>> Thanks for driving this~
>>>>>> 
>>>>>> I haven't gone deep into the implementation details yet. Regarding
>> the
>>>>>> general description, I would ask a few questions firstly:
>>>>>> 
>>>>>> #1, Is there any benchmark results about the optimization latency
>>>> change
>>>>>> compared to current approach? In OLAP scenario, query optimization
>>>>> latency
>>>>>> is more crucial.
>>>>>> 
>>>>>> #2, About the term "busy join reorder", is there any others systems
>>>> which
>>>>>> also use this term? I know Calcite has a rule[1] which uses the
>> term
>>>>> "bushy
>>>>>> join".
>>>>>> 
>>>>>> #3, About the implementation, if this does the same work as Calcite
>>>>>> MultiJoinOptimizeBushyRule, is it possible to use the Calcite
>> version
>>>>>> directly or extend it in some way?
>>>>>> 
>>>>>> [1]
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78
>>>>>> 
>>>>>> yh z <zhengyunhon...@gmail.com> 于2022年12月29日周四 14:44写道：
>>>>>> 
>>>>>>> Hi, devs,
>>>>>>> 
>>>>>>> I'd like to start a discuss about adding an option called
>>>>>>> "table.oprimizer.busy-join-reorder-threshold" for planner rule
>> while
>>>> we
>>>>>> try
>>>>>>> to introduce a new busy join reorder rule[1] into Flink.
>>>>>>> 
>>>>>>> This join reorder rule is based on dynamic programing[2], which
>> can
>>>>> store
>>>>>>> all possible intermediate results, and the cost model can be
>> used to
>>>>>> select
>>>>>>> the optimal join reorder result. Compare with the existing Lopt
>> join
>>>>>>> reorder rule, the new rule can give more possible results and the
>>>>> result
>>>>>>> can be more accurate. However, the search space of this rule will
>>>>> become
>>>>>>> very large as the number of tables increases. So we should
>> introduce
>>>> an
>>>>>>> option to limit the expansion of search space, if the number of
>> table
>>>>> can
>>>>>>> be reordered less than the threshold, the new busy join reorder
>> rule
>>>> is
>>>>>>> used. On the contrary, the Lopt rule is used.
>>>>>>> 
>>>>>>> The default threshold intended to be set to 12. One reason is
>> that in
>>>>> the
>>>>>>> tpc-ds benchmark test, when the number of tables exceeds 12, the
>>>>>>> optimization time will be very long. The other reason is that it
>>>> refers
>>>>>> to
>>>>>>> relevant engines, like Spark, whose recommended setting is 12.[3]
>>>>>>> 
>>>>>>> Looking forward to your feedback.
>>>>>>> 
>>>>>>> [1]  https://issues.apache.org/jira/browse/FLINK-30376
>>>>>>> [2]
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf
>>>>>>> [3]
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Yunhong Zheng
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> 
>>>>>> Best,
>>>>>> Benchao Li
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> 
>>>> Best,
>>>> Benchao Li
>>>> 
>>

Re: [DISCUSS] Adding a option for planner to decide which join reorder rule to choose

Reply via email to