Hi, Lincoln

Thanks for your appreciation of this design. Regarding your questions:

> do we consider adding a benchmark for the operators to intuitively
> understand the improvement brought by each improvement?

I think it makes sense to add a benchmark; Spark also has such a benchmark
framework. But introducing a benchmark framework in Flink is a separate
topic, and we would need to start a new discussion for that work.

> for the implementation plan, mentioned in the FLIP that 1.18 will support
> Calc, HashJoin and HashAgg, then what will be the next step? and which
> operators do we ultimately expect to cover (all or specific ones)?

Our ultimate goal is to support all operators in batch mode, but we
prioritize them according to their usage. Operators such as Calc, HashJoin,
and HashAgg are more commonly used, so we will support them first and then
cover the remaining operators step by step. Given the time available and the
development workload, we can only support Calc, HashJoin, and HashAgg in
1.18. We will complete the remaining work in 1.19 or 1.20. I will make this
clear in the FLIP.

Best,
Ron

Jingsong Li <jingsongl...@gmail.com> wrote on Mon, Jun 5, 2023 at 14:15:

> > For the state compatibility section, it seems that the checkpoint
> > compatibility would be broken just like [1] did. Could FLIP-190 [2] still
> > be helpful in this case for SQL version upgrades?
>
> I guess this is only for batch processing. Streaming should be another
> story?
>
> Best,
> Jingsong
>
> On Mon, Jun 5, 2023 at 2:07 PM Yun Tang <myas...@live.com> wrote:
> >
> > Hi Ron,
> >
> > I think this FLIP would help to improve the performance, looking forward
> > to its completion in Flink!
> >
> > For the state compatibility section, it seems that the checkpoint
> > compatibility would be broken just like [1] did. Could FLIP-190 [2] still
> > be helpful in this case for SQL version upgrades?
> >
> > [1] https://docs.google.com/document/d/1qKVohV12qn-bM51cBZ8Hcgp31ntwClxjoiNBUOqVHsI/edit#heading=h.fri5rtpte0si
> > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489
> >
> > Best
> > Yun Tang
> >
> > ________________________________
> > From: Lincoln Lee <lincoln.8...@gmail.com>
> > Sent: Monday, June 5, 2023 10:56
> > To: dev@flink.apache.org <dev@flink.apache.org>
> > Subject: Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL
> >
> > Hi Ron
> >
> > OFCG looks like an exciting optimization, looking forward to its
> > completion in Flink!
> > A small question, do we consider adding a benchmark for the operators to
> > intuitively understand the improvement brought by each improvement?
> > In addition, for the implementation plan, mentioned in the FLIP that 1.18
> > will support Calc, HashJoin and HashAgg, then what will be the next step?
> > and which operators do we ultimately expect to cover (all or specific
> > ones)?
> >
> > Best,
> > Lincoln Lee
> >
> >
> > liu ron <ron9....@gmail.com> wrote on Mon, Jun 5, 2023 at 09:40:
> >
> > > Hi, Jark
> > >
> > > Thanks for your feedback. According to my initial assessment, the work
> > > effort is relatively large.
> > >
> > > Moreover, I will add the test results of all queries to the FLIP.
> > >
> > > Best,
> > > Ron
> > >
> > > Jark Wu <imj...@gmail.com> wrote on Thu, Jun 1, 2023 at 20:45:
> > >
> > > > Hi Ron,
> > > >
> > > > Thanks a lot for the great proposal. The FLIP looks good to me in
> > > > general. It does not look like easy work, but the performance sounds
> > > > promising, so I think it's worth doing.
> > > >
> > > > Besides, a complete chart of test results for all TPC-DS queries
> > > > would make the effect of this FLIP more intuitive.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > >
> > > > On Wed, 31 May 2023 at 14:27, liu ron <ron9....@gmail.com> wrote:
> > > >
> > > > > Hi, Jingsong
> > > > >
> > > > > Thanks for your valuable suggestions.
> > > > >
> > > > > Best,
> > > > > Ron
> > > > >
> > > > > Jingsong Li <jingsongl...@gmail.com> wrote on Tue, May 30, 2023 at 13:22:
> > > > >
> > > > > > Thanks Ron for your information.
> > > > > >
> > > > > > I suggest that it be written in the Motivation section of the FLIP.
> > > > > >
> > > > > > Best,
> > > > > > Jingsong
> > > > > >
> > > > > > On Tue, May 30, 2023 at 9:57 AM liu ron <ron9....@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi, Jingsong
> > > > > > >
> > > > > > > Thanks for your review. We have tested it on TPC-DS and got a
> > > > > > > 12% gain overall when supporting only the Calc, HashJoin, and
> > > > > > > HashAgg operators. In some queries, we even get more than a 30%
> > > > > > > gain, so it looks like an effective approach.
> > > > > > >
> > > > > > > Best,
> > > > > > > Ron
> > > > > > >
> > > > > > > Jingsong Li <jingsongl...@gmail.com> wrote on Mon, May 29, 2023 at 14:33:
> > > > > > >
> > > > > > > > Thanks Ron for the proposal.
> > > > > > > >
> > > > > > > > Do you have some benchmark results for the performance
> > > > > > > > improvement? I am more concerned about the improvement on
> > > > > > > > Flink than the data in other papers.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jingsong
> > > > > > > >
> > > > > > > > On Mon, May 29, 2023 at 2:16 PM liu ron <ron9....@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi, dev
> > > > > > > > >
> > > > > > > > > I'd like to start a discussion about FLIP-315: Support
> > > > > > > > > Operator Fusion Codegen for Flink SQL[1]
> > > > > > > > >
> > > > > > > > > As main memory grows, query performance is more and more
> > > > > > > > > determined by the raw CPU cost of query processing itself.
> > > > > > > > > This is because query processing techniques based on
> > > > > > > > > interpreted execution perform poorly on modern CPUs due to
> > > > > > > > > lack of locality and frequent instruction mispredictions.
> > > > > > > > > Therefore, the industry is also researching how to improve
> > > > > > > > > engine performance by increasing operator execution
> > > > > > > > > efficiency. In addition, while optimizing Flink's
> > > > > > > > > performance for TPC-DS queries, we found that a significant
> > > > > > > > > amount of CPU time was spent on virtual function calls,
> > > > > > > > > framework collector calls, and invalid calculations, all of
> > > > > > > > > which can be optimized to improve the overall engine
> > > > > > > > > performance. After some investigation, we found that
> > > > > > > > > Operator Fusion Codegen, proposed by Thomas Neumann in the
> > > > > > > > > paper[2], can address these problems. I have finished a
> > > > > > > > > PoC[3] to verify its feasibility and validity.
> > > > > > > > >
> > > > > > > > > Looking forward to your feedback.
> > > > > > > > >
> > > > > > > > > [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> > > > > > > > > [2]: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
> > > > > > > > > [3]: https://github.com/lsyldliu/flink/tree/OFCG
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Ron