[ https://issues.apache.org/jira/browse/FLINK-11937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xintong Song updated FLINK-11937:
---------------------------------
    Fix Version/s:     (was: 1.14.0)
                   1.15.0

> Resolve small file problem in RocksDB incremental checkpoint
> ------------------------------------------------------------
>
>                 Key: FLINK-11937
>                 URL: https://issues.apache.org/jira/browse/FLINK-11937
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / Checkpointing
>            Reporter: Congxian Qiu
>            Priority: Major
>              Labels: auto-unassigned, pull-request-available
>             Fix For: 1.15.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, when incremental checkpointing is enabled in RocksDBStateBackend, a
> separate file is generated on the DFS for each sst file. This may cause a
> “file flood” when running intensive workloads (many jobs with high
> parallelism) in a big cluster. According to our observations in Alibaba
> production, such a file flood introduces at least two drawbacks when using HDFS
> as the checkpoint storage FileSystem: 1) a huge number of RPC requests are
> issued to the NameNode (NN), which may overwhelm its response queue; 2) the
> huge number of files puts heavy pressure on the NN’s on-heap memory.
> In Flink we previously noticed a similar small-file flood problem and tried to
> resolve it by introducing ByteStreamStateHandle (FLINK-2808). That solution is
> limited, however: if the threshold is configured too low there will still be
> too many small files, while if it is too high the JobManager (JM) will
> eventually run out of memory (OOM). It therefore can hardly resolve the issue
> when using RocksDBStateBackend with the incremental snapshot strategy.
> We propose a new OutputStream called
> FileSegmentCheckpointStateOutputStream (FSCSOS) to fix the problem. FSCSOS
> will reuse the same underlying distributed file until its size exceeds a
> preset threshold. We plan to complete the work in 3 steps: first, introduce
> FSCSOS; second, resolve the specific storage amplification issue on FSCSOS;
> and last, add an option to reuse FSCSOS across multiple checkpoints to further
> reduce the number of DFS files.
> For more details, please refer to the attached design doc.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
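
To illustrate the FSCSOS idea described in the issue (keep appending to one underlying file until its size exceeds a preset threshold), here is a minimal Java sketch. It is not the proposed implementation: it writes to local files instead of a DFS, and every name in it (`SegmentedFileWriter`, `Segment`, `rollThresholdBytes`) is hypothetical rather than Flink API. Each small blob is addressed by (file, offset, length), so many blobs can share one physical file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Flink API: packs many small blobs into a few shared files. */
final class SegmentedFileWriter implements AutoCloseable {

    /** Location of one logical blob inside a shared physical file. */
    static final class Segment {
        final Path file;
        final long offset;
        final long length;

        Segment(Path file, long offset, long length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
        }
    }

    private final Path directory;
    private final long rollThresholdBytes;
    private final List<Path> files = new ArrayList<>();
    private OutputStream current;
    private Path currentFile;
    private long currentSize;

    SegmentedFileWriter(Path directory, long rollThresholdBytes) {
        this.directory = directory;
        this.rollThresholdBytes = rollThresholdBytes;
    }

    /** Appends one blob to the current shared file and returns where it landed. */
    Segment write(byte[] data) throws IOException {
        if (current == null) {
            openNewFile();
        }
        long offset = currentSize;
        current.write(data);
        currentSize += data.length;
        Segment segment = new Segment(currentFile, offset, data.length);
        // Stop reusing the file only once its size exceeds the threshold.
        if (currentSize > rollThresholdBytes) {
            close();
        }
        return segment;
    }

    private void openNewFile() throws IOException {
        currentFile = directory.resolve("segment-" + files.size());
        current = Files.newOutputStream(currentFile, StandardOpenOption.CREATE_NEW);
        files.add(currentFile);
        currentSize = 0;
    }

    /** Number of physical files created so far. */
    int fileCount() {
        return files.size();
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

In this sketch, four 4-byte blobs written with a 10-byte threshold end up in two physical files instead of four. A reader would need the returned `Segment` (file, offset, length) to locate each blob again.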