Hi everyone,

I would like to open a discussion on providing a unified file merging mechanism for checkpoints [1].

Currently, many files are uploaded to the DFS during checkpoints, which leads to the 'file flood' problem when running intensive workloads in a cluster. To tackle this problem, various solutions have been proposed for different types of state files. Although these methods are similar, they lack a systematic view and approach. We believe it is better to consider the problem as a whole and introduce a unified framework that addresses file flood for all types of state files.

A POC has been implemented based on the current FLIP design, and the test results are promising.

Looking forward to your comments and feedback.

Best regards,
Zakelly

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-306%3A+Unified+File+Merging+Mechanism+for+Checkpoints