[ https://issues.apache.org/jira/browse/FLINK-25200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479437#comment-17479437 ]
Piotr Nowojski commented on FLINK-25200:
----------------------------------------

[~yunta], I'm not sure how much more information a more realistic test would give us. Yes, one thing not covered by [~akalashnikov]'s test is local IO. But when re-uploading instead of duplicating the file, it's quite likely that the state file will already be in the file cache, for example.

Regardless, after looking at those results, I'm beginning to doubt whether it makes sense to provide native duplicate support for S3. It looks like the performance cost of both operations on the AWS side is the same. I was hoping for, and expecting, an orders-of-magnitude performance difference in favour of the CopyObject API.

> Implement duplicating for s3 filesystem
> ---------------------------------------
>
>                 Key: FLINK-25200
>                 URL: https://issues.apache.org/jira/browse/FLINK-25200
>             Project: Flink
>          Issue Type: Sub-task
>          Components: FileSystems
>            Reporter: Dawid Wysakowicz
>            Priority: Major
>             Fix For: 1.15.0
>
>
> We can use https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html

--
This message was sent by Atlassian Jira
(v8.20.1#820001)