Hi all, Thanks for yours reply.
After receiving your comments and making targeted modifications. The conclusion is that the option "table.optimizer.bushy-join-reorder-threshold" can be added. Relevant PR [1] has been submitted. Sincerely welcome to review it. Thank you. This discussion will be closed soon. Thanks for your comments. [1] https://github.com/apache/flink/pull/21530 Best regards, Yunhong Zheng godfrey he <godfre...@gmail.com> 于2023年1月9日周一 10:26写道: > Hi Yunhong, > > Thanks for driving this discuss! > > This option looks good to me, > and looking forward to contributing this rule back to Apache Calcite. > > Best, > Godfrey > > > > yh z <zhengyunhon...@gmail.com> 于2023年1月5日周四 15:32写道: > > > > Hi Benchao, > > > > Thanks for your reply. > > > > Since our existing test results are based on multiple performance > > optimization points on the TPC-DS benchmark[1][2], we haven't separately > > tested the performance improvement brought by new bushy join reorder > > rule. I will complete this test recently and update the results to this > > email. > > > > I am very happy to contribute to Calcite. Later, I will push the PR of > the > > bushy join reorder rule to Calcite. > > > > [1] https://issues.apache.org/jira/browse/FLINK-27583 > > [2] https://issues.apache.org/jira/browse/FLINK-29942 > > > > Best regards, > > Yunhong Zheng > > > > Benchao Li <libenc...@apache.org> 于2023年1月4日周三 19:03写道: > > > > > Hi Yunhong, > > > > > > Thanks for the updating. And introducing the new bushy join reorder > > > algorithm would be great. And I also agree with the newly added config > > > option "table.optimizer.bushy-join-reorder-threshold" and 12 as the > default > > > value. > > > > > > > > > > As for optimization > > > > latency, this is the problem to be solved by the parameters to be > > > > introduced in this discussion. When there are many tables need to be > > > > reordered, the optimization latency will increase greatly. But when > the > > > > table numbers less than the threshold, the latency is the same as the > > > > LoptOptimizeJoinRule. > > > > > > > > > This sounds great. If possible, could you share more numbers to us? > E.g., > > > what's the latency of optimization when there are 11/12 tables for both > > > approach? > > > > > > For question #3: The implementation of Calcite > MultiJoinOptimizeBushyRule > > > > is very simple, and it will not store the intermediate results at > all. > > > So, > > > > the implementation of Calcite cannot get all possible join reorder > > > results > > > > and it cannot combine with the current cost model to get more > reasonable > > > > join reorder results. > > > > > > > > > It's ok to do it in Flink as the first step. It would be great to also > > > contribute it to Calcite later if possible, it depends on you. > > > > > > yh z <zhengyunhon...@gmail.com> 于2023年1月3日周二 15:27写道: > > > > > > > Hi Benchao, > > > > > > > > Thanks for your reply. > > > > > > > > Actually, I mistakenly wrote the name "bushy join reorder" to "busy > join > > > > reorder". I'm sorry for the trouble bring to you. "Bushy join > reorder" > > > > means we can build a bushy join tree based on cost model, but now > Flink > > > can > > > > only build a left-deep tree using Calcite LoptOptimizeJoinRule. I > hope my > > > > answers can help you solve the following questions: > > > > > > > > For question #1: The biggest advantage of this "bushy join reorder" > > > > strategy over the default Flink left-deep tree strategy is that it > can > > > > retail all possible join reorder plans, and then select the optimal > plan > > > > according to the cost model. This means that the busy join reorder > > > strategy > > > > can be better combined with the current cost model to get more > reasonable > > > > join reorder results. We verified it on the TPC-DS benchmark, with > the > > > > spark plan as a reference, the new busy join reorder strategy can > make > > > more > > > > TPC-DS query plans be adjusted to be consistent with the Spark plan, > and > > > > the execution time is signifcantly reduced. As for optimization > > > > latency, this is the problem to be solved by the parameters to be > > > > introduced in this discussion. When there are many tables need to be > > > > reordered, the optimization latency will increase greatly. But when > the > > > > table numbers less than the threshold, the latency is the same as the > > > > LoptOptimizeJoinRule. > > > > > > > > For question #2: According to my research, many compute or database > > > systems > > > > have the "bushy join reorder" strategies based on dynamic > programming. > > > For > > > > example, Spark and PostgresSql use the same strategy, and the > threshold > > > be > > > > set to 12. Also, some papers, like [1] and [2], have also researched > this > > > > strategy, and [2] set the threshold to 14. > > > > > > > > For question #3: The implementation of Calcite > MultiJoinOptimizeBushyRule > > > > is very simple, and it will not store the intermediate results at > all. > > > So, > > > > the implementation of Calcite cannot get all possible join reorder > > > results > > > > and it cannot combine with the current cost model to get more > reasonable > > > > join reorder results. > > > > > > > > > > > > [1] > > > > > > > > > > > > https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf > > > > [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf > > > > > > > > > > > > > > > > Benchao Li <libenc...@apache.org> 于2023年1月3日周二 12:54写道: > > > > > > > > > Hi Yunhong, > > > > > > > > > > Thanks for driving this~ > > > > > > > > > > I haven't gone deep into the implementation details yet. Regarding > the > > > > > general description, I would ask a few questions firstly: > > > > > > > > > > #1, Is there any benchmark results about the optimization latency > > > change > > > > > compared to current approach? In OLAP scenario, query optimization > > > > latency > > > > > is more crucial. > > > > > > > > > > #2, About the term "busy join reorder", is there any others systems > > > which > > > > > also use this term? I know Calcite has a rule[1] which uses the > term > > > > "bushy > > > > > join". > > > > > > > > > > #3, About the implementation, if this does the same work as Calcite > > > > > MultiJoinOptimizeBushyRule, is it possible to use the Calcite > version > > > > > directly or extend it in some way? > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78 > > > > > > > > > > yh z <zhengyunhon...@gmail.com> 于2022年12月29日周四 14:44写道: > > > > > > > > > > > Hi, devs, > > > > > > > > > > > > I'd like to start a discuss about adding an option called > > > > > > "table.oprimizer.busy-join-reorder-threshold" for planner rule > while > > > we > > > > > try > > > > > > to introduce a new busy join reorder rule[1] into Flink. > > > > > > > > > > > > This join reorder rule is based on dynamic programing[2], which > can > > > > store > > > > > > all possible intermediate results, and the cost model can be > used to > > > > > select > > > > > > the optimal join reorder result. Compare with the existing Lopt > join > > > > > > reorder rule, the new rule can give more possible results and the > > > > result > > > > > > can be more accurate. However, the search space of this rule will > > > > become > > > > > > very large as the number of tables increases. So we should > introduce > > > an > > > > > > option to limit the expansion of search space, if the number of > table > > > > can > > > > > > be reordered less than the threshold, the new busy join reorder > rule > > > is > > > > > > used. On the contrary, the Lopt rule is used. > > > > > > > > > > > > The default threshold intended to be set to 12. One reason is > that in > > > > the > > > > > > tpc-ds benchmark test, when the number of tables exceeds 12, the > > > > > > optimization time will be very long. The other reason is that it > > > refers > > > > > to > > > > > > relevant engines, like Spark, whose recommended setting is 12.[3] > > > > > > > > > > > > Looking forward to your feedback. > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-30376 > > > > > > [2] > > > > > > > > > > > > > > > > > > > > > > > > > https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf > > > > > > [3] > > > > > > > > > > > > > > > > > > > > > > > > > https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration > > > > > > > > > > > > Best regards, > > > > > > Yunhong Zheng > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Best, > > > > > Benchao Li > > > > > > > > > > > > > > > > > > -- > > > > > > Best, > > > Benchao Li > > > >