GitHub user NicoK commented on the issue:

    https://github.com/apache/flink/pull/5624
  
    Indeed, Presto-S3 does better in 
`com.facebook.presto.hive.PrestoS3FileSystem#create()`:
    ```
    if ((!overwrite) && exists(path)) {
        throw new IOException("File already exists:" + path);
    }
    // file creation
    ```
    However, with `overwrite = false` it still checks for existence first. 
Also, contrary to my initial analysis, the retries when retrieving the file 
status during the existence check do not cover non-existence. I can adapt the 
tests to only use `overwrite = true`, but actual code outside the tests makes 
use of both variants.
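
    To make the two code paths concrete, here is a minimal in-memory sketch 
of the guard quoted above. `CreateSemanticsSketch` and its counter are 
hypothetical stand-ins for illustration only, not the Presto or Flink classes; 
the sketch just shows that the existence check is skipped entirely when 
`overwrite = true`.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in; NOT PrestoS3FileSystem or a Flink class. */
final class CreateSemanticsSketch {
    private final Set<String> paths = new HashSet<>();
    private int existenceChecks = 0;

    /** Counts every existence check so the two code paths can be compared. */
    public boolean exists(String path) {
        existenceChecks++;
        return paths.contains(path);
    }

    /** Same guard as the quoted snippet: exists() only runs if overwrite is false. */
    public void create(String path, boolean overwrite) throws IOException {
        if ((!overwrite) && exists(path)) {
            throw new IOException("File already exists:" + path);
        }
        // file creation (here: just remember the path)
        paths.add(path);
    }

    public int existenceChecks() {
        return existenceChecks;
    }
}
```

    With `overwrite = true` the counter stays at zero, so no existence lookup 
happens before the write; with `overwrite = false` every `create()` call 
performs one.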
    
    It's therefore a good idea to distinguish between `flink-s3-fs-hadoop` 
and `flink-s3-fs-presto`, but only for the existence check and not for 
checking that a file/directory was deleted, since
    > Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in 
all regions.
    
    I adapted the code accordingly, which effectively boiled down to removing 
some of the new eventually consistent existence checks in 
`PrestoS3FileSystemITCase`.
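
    For readers unfamiliar with what such an eventually consistent check 
looks like, a rough sketch follows. It is an assumption about the general 
polling pattern, not the helper used in the PR; the class name, method name, 
and parameters are all hypothetical.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; not the helper used in the Flink tests. */
final class EventualConsistencySketch {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was observed, false on timeout.
     */
    public static boolean awaitCondition(
            BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison against the monotonic clock.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

    In a test, the condition would wrap a call such as `fs.exists(path)` (or 
its negation for the deletion case), with the checked `IOException` handled 
inside the lambda.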
    
    Regarding the two implementations you provided: for the existence check, 
there should not be a difference in terms of consistency between a single 
`fs.exists()` call and `fs.open()`.

