Hi devs,

Xia Sun, Lei Yang, and I would like to initiate a discussion about FLIP-468: Introducing StreamGraph-Based Job Submission.

Currently, Flink can adjust the ExecutionGraph in batch processing mode, for example by dynamically deciding the parallelism of a JobVertex based on its input data. However, in certain scenarios, adjusting the ExecutionGraph alone is not enough. In those cases we need to adjust the StreamGraph itself, including the operator logic and the data distribution patterns.

For instance, when one input to a Join operator is small and the other is large, adaptively changing a hash join or sort-merge join to a broadcast join can improve performance. Similarly, re-partitioning the data or adjusting the computational logic can help resolve data hotspots.

We therefore intend to introduce a mechanism for adaptive optimization of the StreamGraph, and to use it to support adaptive broadcast join and skewed join optimization.

This FLIP is the first in a series and aims to introduce a job submission mode based on the StreamGraph. This mode will let Flink directly access and adjust the job's actual logical execution plan at runtime, improving the job's execution performance and observability.

For more details, please refer to FLIP-468 [1]. We look forward to your feedback.

Best,
Xia Sun, Lei Yang and Junrui Lee

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-468%3A+Introducing+StreamGraph-Based+Job+Submission