Hi Benchao, Thanks for your reply.
Since our existing test results are based on multiple performance optimization points on the TPC-DS benchmark[1][2], we haven't separately tested the performance improvement brought by new bushy join reorder rule. I will complete this test recently and update the results to this email. I am very happy to contribute to Calcite. Later, I will push the PR of the bushy join reorder rule to Calcite. [1] https://issues.apache.org/jira/browse/FLINK-27583 [2] https://issues.apache.org/jira/browse/FLINK-29942 Best regards, Yunhong Zheng Benchao Li <libenc...@apache.org> 于2023年1月4日周三 19:03写道: > Hi Yunhong, > > Thanks for the updating. And introducing the new bushy join reorder > algorithm would be great. And I also agree with the newly added config > option "table.optimizer.bushy-join-reorder-threshold" and 12 as the default > value. > > > > As for optimization > > latency, this is the problem to be solved by the parameters to be > > introduced in this discussion. When there are many tables need to be > > reordered, the optimization latency will increase greatly. But when the > > table numbers less than the threshold, the latency is the same as the > > LoptOptimizeJoinRule. > > > This sounds great. If possible, could you share more numbers to us? E.g., > what's the latency of optimization when there are 11/12 tables for both > approach? > > For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule > > is very simple, and it will not store the intermediate results at all. > So, > > the implementation of Calcite cannot get all possible join reorder > results > > and it cannot combine with the current cost model to get more reasonable > > join reorder results. > > > It's ok to do it in Flink as the first step. It would be great to also > contribute it to Calcite later if possible, it depends on you. > > yh z <zhengyunhon...@gmail.com> 于2023年1月3日周二 15:27写道: > > > Hi Benchao, > > > > Thanks for your reply. > > > > Actually, I mistakenly wrote the name "bushy join reorder" to "busy join > > reorder". I'm sorry for the trouble bring to you. "Bushy join reorder" > > means we can build a bushy join tree based on cost model, but now Flink > can > > only build a left-deep tree using Calcite LoptOptimizeJoinRule. I hope my > > answers can help you solve the following questions: > > > > For question #1: The biggest advantage of this "bushy join reorder" > > strategy over the default Flink left-deep tree strategy is that it can > > retail all possible join reorder plans, and then select the optimal plan > > according to the cost model. This means that the busy join reorder > strategy > > can be better combined with the current cost model to get more reasonable > > join reorder results. We verified it on the TPC-DS benchmark, with the > > spark plan as a reference, the new busy join reorder strategy can make > more > > TPC-DS query plans be adjusted to be consistent with the Spark plan, and > > the execution time is signifcantly reduced. As for optimization > > latency, this is the problem to be solved by the parameters to be > > introduced in this discussion. When there are many tables need to be > > reordered, the optimization latency will increase greatly. But when the > > table numbers less than the threshold, the latency is the same as the > > LoptOptimizeJoinRule. > > > > For question #2: According to my research, many compute or database > systems > > have the "bushy join reorder" strategies based on dynamic programming. > For > > example, Spark and PostgresSql use the same strategy, and the threshold > be > > set to 12. Also, some papers, like [1] and [2], have also researched this > > strategy, and [2] set the threshold to 14. > > > > For question #3: The implementation of Calcite MultiJoinOptimizeBushyRule > > is very simple, and it will not store the intermediate results at all. > So, > > the implementation of Calcite cannot get all possible join reorder > results > > and it cannot combine with the current cost model to get more reasonable > > join reorder results. > > > > > > [1] > > > > > https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf > > [2] https://db.in.tum.de/~radke/papers/hugejoins.pdf > > > > > > > > Benchao Li <libenc...@apache.org> 于2023年1月3日周二 12:54写道: > > > > > Hi Yunhong, > > > > > > Thanks for driving this~ > > > > > > I haven't gone deep into the implementation details yet. Regarding the > > > general description, I would ask a few questions firstly: > > > > > > #1, Is there any benchmark results about the optimization latency > change > > > compared to current approach? In OLAP scenario, query optimization > > latency > > > is more crucial. > > > > > > #2, About the term "busy join reorder", is there any others systems > which > > > also use this term? I know Calcite has a rule[1] which uses the term > > "bushy > > > join". > > > > > > #3, About the implementation, if this does the same work as Calcite > > > MultiJoinOptimizeBushyRule, is it possible to use the Calcite version > > > directly or extend it in some way? > > > > > > [1] > > > > > > > > > https://github.com/apache/calcite/blob/9054682145727fbf8a13e3c79b3512be41574349/core/src/main/java/org/apache/calcite/rel/rules/MultiJoinOptimizeBushyRule.java#L78 > > > > > > yh z <zhengyunhon...@gmail.com> 于2022年12月29日周四 14:44写道: > > > > > > > Hi, devs, > > > > > > > > I'd like to start a discuss about adding an option called > > > > "table.oprimizer.busy-join-reorder-threshold" for planner rule while > we > > > try > > > > to introduce a new busy join reorder rule[1] into Flink. > > > > > > > > This join reorder rule is based on dynamic programing[2], which can > > store > > > > all possible intermediate results, and the cost model can be used to > > > select > > > > the optimal join reorder result. Compare with the existing Lopt join > > > > reorder rule, the new rule can give more possible results and the > > result > > > > can be more accurate. However, the search space of this rule will > > become > > > > very large as the number of tables increases. So we should introduce > an > > > > option to limit the expansion of search space, if the number of table > > can > > > > be reordered less than the threshold, the new busy join reorder rule > is > > > > used. On the contrary, the Lopt rule is used. > > > > > > > > The default threshold intended to be set to 12. One reason is that in > > the > > > > tpc-ds benchmark test, when the number of tables exceeds 12, the > > > > optimization time will be very long. The other reason is that it > refers > > > to > > > > relevant engines, like Spark, whose recommended setting is 12.[3] > > > > > > > > Looking forward to your feedback. > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-30376 > > > > [2] > > > > > > > > > > > > > > https://courses.cs.duke.edu/compsci516/cps216/spring03/papers/selinger-etal-1979.pdf > > > > [3] > > > > > > > > > > > > > > https://spark.apache.org/docs/3.3.1/configuration.html#runtime-sql-configuration > > > > > > > > Best regards, > > > > Yunhong Zheng > > > > > > > > > > > > > -- > > > > > > Best, > > > Benchao Li > > > > > > > > -- > > Best, > Benchao Li >