Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-08-06 Thread Junrui Lee
Hi Yu Chen, Apologies for the late reply. Thanks for your feedback and for emphasizing the importance of this work for FLINK-33230. Looking forward to your contributions. Best, Junrui Junrui Lee wrote on Tue, Aug 6, 2024 at 22:05: > Hi everyone, > > David, Zhu and I had an offline discussion about FLIP-468

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-08-06 Thread Junrui Lee
Hi everyone, David, Zhu and I had an offline discussion about FLIP-468 and more. Here are the key points: 1. Flink lacks a stable public REST API for job submission. Such a public REST API will allow users to create and store compiled plans of jobs, and to directly submit/resubmit these plans. It c

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-30 Thread David Morávek
Here is a summary of my current understanding, please correct me if it's wrong. We currently have an API for submitting a list of transformations and configurations called CompiledPlan. However, it is currently specific to the SQL domain. In reviewing the follow-up FLIPs, it is evident that we ar

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-30 Thread Zhu Zhu
Thanks for sharing your thoughts, David! IIUC, there are two goals in making the APIs used for RPCs as versionless as possible? 1. No version mismatch happens if there are no code changes. 2. Flink clients and clusters of different versions are supported. The first goal can be achieved by explicitly assigning a serialV

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-29 Thread Junrui Lee
Hi David, Thank you very much for your detailed explanation, which is crucial in helping to further improve this FLIP. This FLIP is applicable to both batch and stream processing. For batch processing, it can be used to optimize the StreamGraph (e.g., FLIP-469), while for streaming, we can use th

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-29 Thread David Morávek
Hi all, My main concern is the absence of a comprehensive vision for how we could make this work for both Batch and Streaming. Right now, it feels like the proposal is solely centered around very specific batch optimizations. I’m inclined to support submitting StreamGraph because it could already

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-26 Thread Yu Chen
Hi Junrui, Thanks for driving this. It's really helpful for users to explore the jobs they submitted. It's an important cornerstone of FLINK-33230, since no stream-graph information is serialized. I'd like to build on this FLIP to expose operator-level metrics and the presentation of

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-24 Thread Junrui Lee
Hi all, Thank you for all the feedback and suggestions so far. If there are no further comments, we will open the voting thread on Friday, July 26, 2024. Best regards, Junrui Ron Liu wrote on Wed, Jul 24, 2024 at 13:43: > Hi, Junrui > > Thanks for your detailed reply. > > After reading the updated FLIP-468

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-23 Thread Ron Liu
Hi, Junrui Thanks for your detailed reply. After reading the updated FLIP-468 & FLIP-470, I see that the design looks good. Best. Ron Junrui Lee wrote on Thu, Jul 18, 2024 at 14:26: > Hi all, > > I would like to follow up on my previous email regarding your feedback. > Below > is a concise summary of my m

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-17 Thread Junrui Lee
Hi all, I would like to follow up on my previous email regarding your feedback. Below is a concise summary of my main points: 1. Compiled Plan: IIUC, the compiled plan is primarily for ensuring execution plan compatibility across job versions (e.g., during upgrades). Eventually, it needs to be co

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-12 Thread Junrui Lee
Hi all, Thanks for your feedback. Below are my thoughts on the questions you've raised. @Fabian - What is the future plan for job submissions in Flink? With the current > proposal, Flink will support JobGraph/StreamGraph/compiled plan > submissions? It might be confusing for users and complicate

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-11 Thread David Morávek
> > For batch scenario, if we want to better support dynamic plan tuning > strategies, the fundamental solution is still to put SQL Optimizer to > flink-runtime. One accompanying question is: how do you envision this to work for streaming where you need to ensure state compatibility after the pla

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-11 Thread Ron Liu
Hi, Junrui The FLIP proposal looks good to me. I have the same question as Fabian: > For join strategies, they are only applicable when using an optimizer (that's currently not part of Flink's runtime) with the Table API or Flink SQL. How do we plan to connect the optimizer with Flink's runtime?

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-11 Thread David Morávek
Hi Junrui, Thank you for drafting the FLIP. I really appreciate the direction it’s taking. We’ve discussed similar approaches multiple times, and it’s great to see this progress. I have a few questions and thoughts: *1. Transformations in StreamGraphGenerator:* Should we consider taking this a

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-11 Thread Fabian Paul
Thanks for drafting this FLIP. I really like the idea of introducing a concept in Flink that is close to a logical plan submission. I have a few questions about the proposal and its future evolvability. - What is the future plan for job submissions in Flink? With the current proposal, Flink will

Re: [DISCUSS] FLIP-468: Introducing StreamGraph-Based Job Submission.

2024-07-11 Thread Zhu Zhu
+1 for the FLIP. With the DataSet API removed in Flink 2.0, StreamGraph will be the only operator-level DAG of a Flink job. Exposing it directly to the scheduler can be helpful to deliver some exciting features. This FLIP mentioned adaptive plan optimization and operator insights. I'm also thinking o