GitHub user steveloughran commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5521#discussion_r169132404
  
    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/FileInputFormat.java ---
    @@ -691,6 +691,12 @@ public void open(FileInputSplit fileSplit) throws IOException {
                        LOG.debug("Opening input split " + fileSplit.getPath() + " [" + this.splitStart + "," + this.splitLength + "]");
                }
     
    +           if (!exists(fileSplit.getPath())) {
    --- End diff --
    
    You are doubling the number of file-existence checks here, which, when 
working with S3, implies three more HTTP requests; those take time and cost 
money. Better to do the open() call and catch FileNotFoundException, which all 
filesystems are required to throw if they are given a path which doesn't 
resolve to a file.
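
    A rough sketch of the open-and-catch pattern, for illustration only. It 
uses plain java.io as a stand-in for the filesystem open() call, and the class 
and method names (OpenWithoutExistsCheck, openOrNull) are hypothetical, not 
part of this PR or of Flink. How the missing-file case should be handled is up 
to the caller; returning null here is just one assumption.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: not Flink code, just the shape of the idea.
class OpenWithoutExistsCheck {

    /**
     * Opens the path directly, with no separate existence probe beforehand.
     * Returns null when the path does not resolve to a readable file
     * (assumed handling; a caller could equally log and skip, or rethrow).
     */
    static InputStream openOrNull(String path) throws IOException {
        try {
            // Single call: the open itself tells us whether the file is there.
            return new FileInputStream(path);
        } catch (FileNotFoundException e) {
            // Missing file or a directory: handle it here instead of pre-checking.
            return null;
        }
    }
}
```

    The point is that the happy path costs one request instead of an 
exists() probe followed by an open(), and the missing-file case is still 
handled in one place.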

