Thanks a lot for the proposal, @Yun Tang! It sounds great, and I can't find any 
reason not to make this improvement.


——————————————
Name: Feifan Wang
Email: zoltar9...@163.com


---- Replied Message ----
| From | Yun Tang<myas...@live.com> |
| Date | 06/30/2022 16:56 |
| To | dev@flink.apache.org<dev@flink.apache.org> |
| Subject | [DISCUSS] Introduce multi delete API to Flink's FileSystem class |
Hi guys,

As more and more teams move to cloud-based environments, cloud object storage 
has become the de facto technical standard for big data ecosystems.
From our experience, the latency of writing or deleting objects in object 
storage can vary from call to call. The FLIP for the changelog state backend 
ran experiments that wrote the same data multiple times [1], and they showed 
that p999 latency could be 8x the p50 latency. This is also true for delete 
operations.

Currently, since the checkpoint backpressure mechanism was introduced [2], a 
newly triggered checkpoint can be delayed if old checkpoints are not cleaned up 
quickly enough [3].
Moreover, with incremental checkpoints, Flink's checkpoint cleanup mechanism 
cannot use the delete-folder API to speed up the procedure [4].
This is especially noticeable on cloud object storage, and almost all object 
storage SDKs provide a multi-delete API to improve performance, e.g. AWS S3 
[5], Aliyun OSS [6], and Tencentyun COS [7].
A simple experiment on Tencent Cloud shows that deleting 1000 objects of 5MB 
each costs 39494ms with a for-loop of single delete operations, and drops to 
1347ms with the multi-delete API, roughly 29x faster.

However, Flink's FileSystem API is modeled on HDFS's FileSystem API and lacks 
such a multi-delete API, which makes it somewhat outdated for cloud-based 
environments.
Thus I suggest adding a multi-delete API to Flink's FileSystem [8] class; file 
systems that do not support multi-delete will fall back to a for-loop of single 
deletes.
By doing so, we can at least speed up discarding checkpoints in cloud 
environments.
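
To make the idea concrete, here is a rough sketch of what such an API could 
look like. It is only an illustration under assumptions, not a proposed final 
signature: SketchFileSystem, deleteAll, BatchingObjectStoreFileSystem and 
maxKeysPerRequest are hypothetical names, paths are plain strings rather than 
Flink's Path, and the object store is simulated in memory instead of calling a 
real SDK.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for Flink's FileSystem, reduced to delete operations. */
interface SketchFileSystem {

    /** Single delete, similar in spirit to the existing delete call. */
    boolean delete(String path, boolean recursive) throws IOException;

    /**
     * Hypothetical multi-delete. The default implementation falls back to a
     * for-loop of single deletes, so file systems without batch support keep
     * working unchanged.
     */
    default void deleteAll(Collection<String> paths, boolean recursive) throws IOException {
        IOException firstFailure = null;
        for (String path : paths) {
            try {
                delete(path, recursive);
            } catch (IOException e) {
                // Keep going so one failed delete does not block the rest.
                if (firstFailure == null) {
                    firstFailure = e;
                } else {
                    firstFailure.addSuppressed(e);
                }
            }
        }
        if (firstFailure != null) {
            throw firstFailure;
        }
    }
}

/** Hypothetical object-store file system that overrides the fallback with batching. */
class BatchingObjectStoreFileSystem implements SketchFileSystem {

    /** Simulated bucket contents; a real implementation would call the store's SDK. */
    final Set<String> objects = new HashSet<>();

    /** Counts simulated remote requests, to compare single and batched deletes. */
    int requestCount = 0;

    private final int maxKeysPerRequest;

    BatchingObjectStoreFileSystem(int maxKeysPerRequest) {
        this.maxKeysPerRequest = maxKeysPerRequest;
    }

    @Override
    public boolean delete(String path, boolean recursive) {
        requestCount++; // one request per object
        return objects.remove(path);
    }

    @Override
    public void deleteAll(Collection<String> paths, boolean recursive) {
        List<String> batch = new ArrayList<>(maxKeysPerRequest);
        for (String path : paths) {
            batch.add(path);
            if (batch.size() == maxKeysPerRequest) {
                sendBatch(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            sendBatch(batch);
        }
    }

    /** Stands in for one multi-delete request to the object store. */
    private void sendBatch(List<String> keys) {
        requestCount++; // one request per batch
        objects.removeAll(keys);
    }
}
```

With a batch size of 1000 (the per-request key limit of the S3 multi-object 
delete), discarding 2500 objects in this sketch takes 3 simulated requests 
instead of 2500, while any file system that does not override deleteAll keeps 
its current behaviour through the default method.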

WDYT?


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints#FLIP158:Generalizedincrementalcheckpoints-DFSwritelatency
[2] https://issues.apache.org/jira/browse/FLINK-17073
[3] https://issues.apache.org/jira/browse/FLINK-26590
[4] https://github.com/apache/flink/blob/1486fee1acd9cd1e340f6d2007f723abd20294e5/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java#L315
[5] https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-multiple-objects.html
[6] https://www.alibabacloud.com/help/en/object-storage-service/latest/delete-objects-8#section-v6n-zym-tax
[7] https://intl.cloud.tencent.com/document/product/436/44018#delete-objects-in-batch
[8] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java


Best
Yun Tang
