Hi Ron,

Sorry for the late reply after the voting process. I just want to ask a few questions.

> Traverse the ExecNode DAG and create a FusionExecNode for physical operators that can be fused together.

Which kinds of operators can be fused together? Are they the operators in an operator chain? Is this optimization aligned with Spark's whole-stage codegen?

> If any member operator does not support codegen, generate a Transformation DAG based on the topological relationship of member ExecNode and jump to step 8.
> Step 8: Generate a FusionTransformation, setting the parallelism and managed memory for the fused operator.

Does "support codegen" here mean fusion codegen? And why do we generate a FusionTransformation when a member operator does not support codegen? IMO it should fall back to the current behavior.

Finally, I share Lincoln's view on the performance benchmark. Currently the Flink community's flink-benchmarks only covers things like scheduling, state, and DataStream operator performance. A good benchmark harness for SQL operators would benefit SQL optimizer work and observability.

Thanks,
Atiozi

On Tue, Jun 6, 2023 at 7:30 PM liu ron <ron9....@gmail.com> wrote:

> Hi dev
>
> Thanks for all the feedback. It seems that there are no more comments, so I will start a vote on FLIP-315 [1] later. Thanks again.
>
> [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
>
> Best,
> Ron
>
> On Mon, Jun 5, 2023 at 4:01 PM liu ron <ron9....@gmail.com> wrote:
>
> > Hi, Yun, Jingsong, Benchao
> >
> > Thanks for your valuable input about this FLIP.
> >
> > First of all, let me emphasize that from the technical implementation point of view, this design is feasible in both stream and batch scenarios, so I consider both stream and batch mode in the FLIP. In the stream scenario, for stateful operators, our business experience is that the bottleneck is basically state access, so the optimization effect of OFCG for streaming will not be particularly obvious, and we will not prioritize supporting it for now.
> > In contrast, in the batch scenario, where CPU is the bottleneck, this optimization pays off.
> >
> > Taking the above into account, we are able to support both stream and batch mode optimization in this design, but we will give priority to supporting batch operators. As Benchao said, when we find a suitable streaming business scenario in the future, we can consider doing this optimization. Back to Yun's issue, the design will break state compatibility in stream mode, as [1] did, and the version upgrade will not support OFCG. As mentioned earlier, we will not support this feature in stream mode in the short term.
> >
> > Also, thanks to Benchao's suggestion, I will state the current goal of the optimization in the FLIP, scoped to batch mode.
> >
> > Best,
> > Ron
> >
> > On Mon, Jun 5, 2023 at 3:04 PM liu ron <ron9....@gmail.com> wrote:
> >
> >> Hi, Lincoln
> >>
> >> Thanks for your appreciation of this design. Regarding your questions:
> >>
> >> > do we consider adding a benchmark for the operators to intuitively understand the improvement brought by each improvement?
> >>
> >> I think it makes sense to add a benchmark; Spark also has such a benchmark framework. But introducing a benchmark framework in Flink is another story, and we need to start a new discussion for that work.
> >>
> >> > for the implementation plan, mentioned in the FLIP that 1.18 will support Calc, HashJoin and HashAgg, then what will be the next step? and which operators do we ultimately expect to cover (all or specific ones)?
> >>
> >> Our ultimate goal is to support all operators in batch mode, but we prioritize them according to their usage. Operators like Calc, HashJoin, HashAgg, etc. are more commonly used, so we will support them first. Later we will support the rest of the operators step by step. Considering the time and the development workload, we can only support Calc, HashJoin, and HashAgg in 1.18.
> >> In 1.19 or 1.20, we will complete the remaining work. I will make this clear in the FLIP.
> >>
> >> Best,
> >> Ron
> >>
> >> On Mon, Jun 5, 2023 at 2:15 PM Jingsong Li <jingsongl...@gmail.com> wrote:
> >>
> >>> > For the state compatibility section, it seems that the checkpoint compatibility would be broken just like [1] did. Could FLIP-190 [2] still be helpful in this case for SQL version upgrades?
> >>>
> >>> I guess this is only for batch processing. Streaming should be another story?
> >>>
> >>> Best,
> >>> Jingsong
> >>>
> >>> On Mon, Jun 5, 2023 at 2:07 PM Yun Tang <myas...@live.com> wrote:
> >>> >
> >>> > Hi Ron,
> >>> >
> >>> > I think this FLIP would help to improve the performance, looking forward to its completion in Flink!
> >>> >
> >>> > For the state compatibility section, it seems that the checkpoint compatibility would be broken just like [1] did. Could FLIP-190 [2] still be helpful in this case for SQL version upgrades?
> >>> >
> >>> > [1] https://docs.google.com/document/d/1qKVohV12qn-bM51cBZ8Hcgp31ntwClxjoiNBUOqVHsI/edit#heading=h.fri5rtpte0si
> >>> > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336489
> >>> >
> >>> > Best
> >>> > Yun Tang
> >>> >
> >>> > ________________________________
> >>> > From: Lincoln Lee <lincoln.8...@gmail.com>
> >>> > Sent: Monday, June 5, 2023 10:56
> >>> > To: dev@flink.apache.org <dev@flink.apache.org>
> >>> > Subject: Re: [DISCUSS] FLIP-315: Support Operator Fusion Codegen for Flink SQL
> >>> >
> >>> > Hi Ron
> >>> >
> >>> > OFCG looks like an exciting optimization, looking forward to its completion in Flink!
> >>> > A small question, do we consider adding a benchmark for the operators to intuitively understand the improvement brought by each improvement?
> >>> > In addition, for the implementation plan, the FLIP mentions that 1.18 will support Calc, HashJoin and HashAgg; what will be the next step? And which operators do we ultimately expect to cover (all or specific ones)?
> >>> >
> >>> > Best,
> >>> > Lincoln Lee
> >>> >
> >>> >
> >>> > On Mon, Jun 5, 2023 at 9:40 AM liu ron <ron9....@gmail.com> wrote:
> >>> >
> >>> > > Hi, Jark
> >>> > >
> >>> > > Thanks for your feedback. According to my initial assessment, the work effort is relatively large.
> >>> > >
> >>> > > Moreover, I will add a test result of all queries to the FLIP.
> >>> > >
> >>> > > Best,
> >>> > > Ron
> >>> > >
> >>> > > On Thu, Jun 1, 2023 at 8:45 PM Jark Wu <imj...@gmail.com> wrote:
> >>> > >
> >>> > > > Hi Ron,
> >>> > > >
> >>> > > > Thanks a lot for the great proposal. The FLIP looks good to me in general. It does not look like easy work, but the performance sounds promising, so I think it's worth doing.
> >>> > > >
> >>> > > > Besides, if there is a complete test graph with all TPC-DS queries, the effect of this FLIP will be more intuitive.
> >>> > > >
> >>> > > > Best,
> >>> > > > Jark
> >>> > > >
> >>> > > > On Wed, 31 May 2023 at 14:27, liu ron <ron9....@gmail.com> wrote:
> >>> > > >
> >>> > > > > Hi, Jingsong
> >>> > > > >
> >>> > > > > Thanks for your valuable suggestions.
> >>> > > > >
> >>> > > > > Best,
> >>> > > > > Ron
> >>> > > > >
> >>> > > > > On Tue, May 30, 2023 at 1:22 PM Jingsong Li <jingsongl...@gmail.com> wrote:
> >>> > > > >
> >>> > > > > > Thanks Ron for your information.
> >>> > > > > >
> >>> > > > > > I suggest that it can be written in the Motivation of FLIP.
> >>> > > > > >
> >>> > > > > > Best,
> >>> > > > > > Jingsong
> >>> > > > > >
> >>> > > > > > On Tue, May 30, 2023 at 9:57 AM liu ron <ron9....@gmail.com> wrote:
> >>> > > > > > >
> >>> > > > > > > Hi, Jingsong
> >>> > > > > > >
> >>> > > > > > > Thanks for your review. We have tested it on the TPC-DS benchmark and got a 12% gain overall when supporting only the Calc, HashJoin, and HashAgg operators. In some queries, we even get more than a 30% gain, so it looks like an effective approach.
> >>> > > > > > >
> >>> > > > > > > Best,
> >>> > > > > > > Ron
> >>> > > > > > >
> >>> > > > > > > On Mon, May 29, 2023 at 2:33 PM Jingsong Li <jingsongl...@gmail.com> wrote:
> >>> > > > > > >
> >>> > > > > > > > Thanks Ron for the proposal.
> >>> > > > > > > >
> >>> > > > > > > > Do you have some benchmark results for the performance improvement? I am more concerned about the improvement on Flink than the data in other papers.
> >>> > > > > > > >
> >>> > > > > > > > Best,
> >>> > > > > > > > Jingsong
> >>> > > > > > > >
> >>> > > > > > > > On Mon, May 29, 2023 at 2:16 PM liu ron <ron9....@gmail.com> wrote:
> >>> > > > > > > > >
> >>> > > > > > > > > Hi, dev
> >>> > > > > > > > >
> >>> > > > > > > > > I'd like to start a discussion about FLIP-315: Support Operator Fusion Codegen for Flink SQL [1].
> >>> > > > > > > > >
> >>> > > > > > > > > As main memory grows, query performance is more and more determined by the raw CPU costs of query processing itself. This is because query processing techniques based on interpreted execution show poor performance on modern CPUs due to lack of locality and frequent instruction misprediction. Therefore, the industry is also researching how to improve engine performance by increasing operator execution efficiency. In addition, while optimizing Flink's performance for TPC-DS queries, we found that a significant amount of CPU time was spent on virtual function calls, framework collector calls, and invalid calculations, which can be optimized to improve the overall engine performance. After some investigation, we found that Operator Fusion Codegen, proposed by Thomas Neumann in the paper [2], can address these problems. I have finished a PoC [3] to verify its feasibility and validity.
> >>> > > > > > > > >
> >>> > > > > > > > > Looking forward to your feedback.
> >>> > > > > > > > >
> >>> > > > > > > > > [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-315+Support+Operator+Fusion+Codegen+for+Flink+SQL
> >>> > > > > > > > > [2]: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
> >>> > > > > > > > > [3]: https://github.com/lsyldliu/flink/tree/OFCG
> >>> > > > > > > > >
> >>> > > > > > > > > Best,
> >>> > > > > > > > > Ron
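
To make the motivation in the original proposal a bit more concrete, here is a minimal Java sketch of the overhead that operator fusion is meant to remove. It is not taken from the FLIP or the PoC: `FusionSketch`, `Operator`, `Filter`, `Project`, and `SumSink` are hypothetical names, and the fused loop is written by hand rather than generated. The chained version pays one virtual call per record per operator, while the fused version collapses the same filter, projection, and sum into a single loop.

```java
class FusionSketch {

    /** Hypothetical push-based operator: one virtual call per record per operator. */
    interface Operator {
        void processElement(int value);
    }

    /** Keeps even values and forwards them downstream. */
    static final class Filter implements Operator {
        private final Operator next;
        Filter(Operator next) { this.next = next; }
        @Override public void processElement(int value) {
            if (value % 2 == 0) {
                next.processElement(value);
            }
        }
    }

    /** Multiplies each value by 10 and forwards it downstream. */
    static final class Project implements Operator {
        private final Operator next;
        Project(Operator next) { this.next = next; }
        @Override public void processElement(int value) {
            next.processElement(value * 10);
        }
    }

    /** Terminal operator that accumulates a running sum. */
    static final class SumSink implements Operator {
        long sum;
        @Override public void processElement(int value) {
            sum += value;
        }
    }

    /** Interpreted-style execution: records are pushed through a chain of operator objects. */
    static long runChained(int[] input) {
        SumSink sink = new SumSink();
        Operator pipeline = new Filter(new Project(sink));
        for (int v : input) {
            pipeline.processElement(v);
        }
        return sink.sum;
    }

    /** Hand-written stand-in for fused code: same logic, no per-operator calls. */
    static long runFused(int[] input) {
        long sum = 0;
        for (int v : input) {
            if (v % 2 == 0) {      // Filter
                sum += v * 10L;    // Project + SumSink
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] input = {1, 2, 3, 4, 5, 6};
        System.out.println(runChained(input)); // prints 120
        System.out.println(runFused(input));   // prints 120
    }
}
```

Both methods compute the same result; only the call structure differs. As far as I understand the proposal, fusion codegen would generate something in the spirit of `runFused` for a whole group of fusable operators instead of having it written by hand.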