Thanks a lot for the proposal, @Yun Tang! It sounds great, and I can't find any reason not to make this improvement.
--------------
Name: Feifan Wang
Email: zoltar9...@163.com

---- Replied Message ----
From: Yun Tang <myas...@live.com>
Date: 06/30/2022 16:56
To: dev@flink.apache.org
Subject: [DISCUSS] Introduce multi delete API to Flink's FileSystem class

Hi guys,

As more and more teams move to cloud-based environments, cloud object storage has become the de facto technical standard for big data ecosystems. From our experience, the performance of writing or deleting objects in object storage can vary from call to call. The FLIP for the changelog state backend ran experiments that wrote the same data multiple times [1], and they showed that p999 latency could be 8x the p50 latency. The same is true for delete operations.

Since the checkpoint backpressure mechanism was introduced [2], a newly triggered checkpoint can be delayed when old checkpoints are not cleaned up fast enough [3]. Moreover, with incremental checkpoints, Flink's checkpoint cleanup mechanism cannot use the delete-folder API to speed up the procedure [4]. This is especially noticeable on cloud object storage, and almost all object storage SDKs offer a multi-delete API to improve performance, e.g. AWS S3 [5], Aliyun OSS [6], and Tencentyun COS [7].

A simple experiment on Tencent Cloud shows that deleting 1000 objects of 5MB each takes 39494ms with a for-loop of single delete operations, and that this drops to 1347ms with the multi-delete API.

However, Flink's FileSystem API is modeled on HDFS's FileSystem API and lacks such a multi-delete API, which makes it somewhat outdated for cloud-based environments. Thus I suggest adding a multi-delete API to Flink's FileSystem [8] class. File systems that do not support a multi-delete feature would fall back to a for-loop of single deletes. By doing so, we can at least speed up discarding checkpoints in cloud environments.

WDYT?

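
The fallback behaviour suggested above could look roughly like the following. This is only a minimal sketch under stated assumptions: SketchFileSystem, deleteAll, SingleDeleteStore and BatchDeleteStore are hypothetical names, not part of Flink's FileSystem class or of any SDK. The stores are in-memory stand-ins that only count simulated requests, and the 1000-key chunk size is an assumed per-request limit. Checked exceptions are left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MultiDeleteSketch {

    /** Hypothetical stand-in for a file system abstraction; NOT Flink's FileSystem. */
    interface SketchFileSystem {
        /** Deletes one object; returns true if it existed and was removed. */
        boolean delete(String path);

        /**
         * Hypothetical multi-delete. The default implementation falls back to
         * a for-loop of single deletes and returns the paths it could not delete.
         */
        default List<String> deleteAll(Collection<String> paths) {
            List<String> failed = new ArrayList<>();
            for (String path : paths) {
                if (!delete(path)) {
                    failed.add(path);
                }
            }
            return failed;
        }
    }

    /** Store without batch support: inherits the for-loop fallback. */
    static class SingleDeleteStore implements SketchFileSystem {
        final Set<String> objects = new HashSet<>();
        int requests = 0; // simulated round trips to the store

        @Override
        public boolean delete(String path) {
            requests++;
            return objects.remove(path);
        }
    }

    /** Store with batch support: overrides deleteAll and sends chunked requests. */
    static class BatchDeleteStore extends SingleDeleteStore {
        static final int MAX_KEYS_PER_REQUEST = 1000; // assumed per-request limit

        @Override
        public List<String> deleteAll(Collection<String> paths) {
            List<String> failed = new ArrayList<>();
            int inChunk = 0;
            for (String path : paths) {
                if (inChunk == 0) {
                    requests++; // one simulated round trip per chunk
                }
                if (!objects.remove(path)) {
                    failed.add(path);
                }
                inChunk = (inChunk + 1) % MAX_KEYS_PER_REQUEST;
            }
            return failed;
        }
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            paths.add("chk-1/file-" + i); // made-up object keys
        }
        SingleDeleteStore single = new SingleDeleteStore();
        single.objects.addAll(paths);
        BatchDeleteStore batch = new BatchDeleteStore();
        batch.objects.addAll(paths);

        single.deleteAll(paths);
        batch.deleteAll(paths);
        System.out.println("single-delete requests: " + single.requests);
        System.out.println("batch-delete requests: " + batch.requests);
    }
}
```

With 1000 paths, the fallback store issues 1000 simulated requests and the batch store issues one. Returning the list of failed paths is just one possible design; a real API would also need to decide how to report partial failures and I/O errors.
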
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints#FLIP158:Generalizedincrementalcheckpoints-DFSwritelatency
[2] https://issues.apache.org/jira/browse/FLINK-17073
[3] https://issues.apache.org/jira/browse/FLINK-26590
[4] https://github.com/apache/flink/blob/1486fee1acd9cd1e340f6d2007f723abd20294e5/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java#L315
[5] https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-multiple-objects.html
[6] https://www.alibabacloud.com/help/en/object-storage-service/latest/delete-objects-8#section-v6n-zym-tax
[7] https://intl.cloud.tencent.com/document/product/436/44018#delete-objects-in-batch
[8] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java

Best
Yun Tang