Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5521#discussion_r169132404

    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/io/FileInputFormat.java ---
    @@ -691,6 +691,12 @@ public void open(FileInputSplit fileSplit) throws IOException {
                LOG.debug("Opening input split " + fileSplit.getPath() + " [" + this.splitStart + "," + this.splitLength + "]");
            }

    +       if (!exists(fileSplit.getPath())) {
    --- End diff --

    You are doubling the number of file-existence checks here. When working with S3, that implies three more HTTP requests, which take time and cost money. It is better to call open() and catch FileNotFoundException, which all filesystems are required to throw when given a path that doesn't resolve to a file.
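
    To make the suggestion concrete, here is a minimal sketch of the "open first, handle FileNotFoundException" pattern. It uses plain java.io against the local filesystem rather than Flink's FileSystem API, and the class and method names (OpenWithoutExistsCheck, openSplit) are made up for illustration; the actual change in FileInputFormat would look different.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class; not part of Flink.
class OpenWithoutExistsCheck {

    // Opens the file directly. A missing path surfaces as FileNotFoundException
    // from the open call itself, so no separate existence probe is issued.
    static InputStream openSplit(String path) throws IOException {
        try {
            return new FileInputStream(path);
        } catch (FileNotFoundException e) {
            // Rethrow with more context; callers can still catch FileNotFoundException.
            FileNotFoundException wrapped =
                new FileNotFoundException("Input split file not found: " + path);
            wrapped.initCause(e);
            throw wrapped;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed not to exist; the name is arbitrary.
        String missing = "no-such-file-" + System.nanoTime() + ".txt";
        try (InputStream in = openSplit(missing)) {
            System.out.println("opened " + missing);
        } catch (FileNotFoundException e) {
            System.out.println("missing: " + e.getMessage());
        }
    }
}
```

    The point of the shape above is that the happy path costs one call (the open), and the missing-file case is handled only when it actually happens, instead of paying for an extra existence check on every split.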
---