GitHub user NicoK commented on the issue:

    https://github.com/apache/flink/pull/5624

That's actually the check for whether or not to overwrite the file. Let me paste the whole method from this example to give some more context:
```
public FSDataOutputStream create(Path f, FsPermission permission,
    boolean overwrite, int bufferSize, short replication, long blockSize,
    Progressable progress) throws IOException {
  String key = pathToKey(f);
  S3AFileStatus status = null;
  try {
    // get the status or throw an FNFE
    status = getFileStatus(f);

    // if the thread reaches here, there is something at the path
    if (status.isDirectory()) {
      // path references a directory: automatic error
      throw new FileAlreadyExistsException(f + " is a directory");
    }
    if (!overwrite) {
      // path references a file and overwrite is disabled
      throw new FileAlreadyExistsException(f + " already exists");
    }
    LOG.debug("Overwriting file {}", f);
  } catch (FileNotFoundException e) {
    // this means the file is not found
  }
  instrumentation.fileCreated();
  FSDataOutputStream output;
  if (blockUploadEnabled) {
    output = new FSDataOutputStream(
        new S3ABlockOutputStream(this,
            key,
            new SemaphoredDelegatingExecutor(boundedThreadPool,
                blockOutputActiveBlocks, true),
            progress,
            partSize,
            blockFactory,
            instrumentation.newOutputStreamStatistics(statistics),
            new WriteOperationHelper(key)
        ),
        null);
  } else {
    // We pass null to FSDataOutputStream so it won't count writes that
    // are being buffered to a file
    output = new FSDataOutputStream(
        new S3AOutputStream(getConf(),
            this,
            key,
            progress
        ),
        null);
  }
  return output;
}
```
At this point, however, there is the read-before-write that is causing the problem: `getFileStatus(f)` is called before anything is written. (Of course, some other concurrent job could create the file in the meantime, so this check does not guard against much, but we cannot do much about that.)
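
To make the pattern easier to see in isolation, here is a minimal sketch of the same check-then-write shape against an in-memory map. Everything in it is hypothetical and for illustration only: `CreateCheckSketch`, `exists`, `statusLookups` and `read` are made-up names, the map stands in for the object store, and `java.nio.file.FileAlreadyExistsException` is used instead of the Hadoop exception class so the example has no dependencies. It is not the S3A implementation.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical in-memory stand-in for an object store; not the Hadoop S3A API. */
public class CreateCheckSketch {
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();
    // Counts existence probes, i.e. the reads issued before each write.
    private final AtomicInteger statusLookups = new AtomicInteger();

    /** Plays the role of getFileStatus(f): a read against the store. */
    private boolean exists(String key) {
        statusLookups.incrementAndGet();
        return objects.containsKey(key);
    }

    /** Check-then-write, in the shape of the create() method quoted above. */
    public void create(String key, byte[] data, boolean overwrite) throws IOException {
        // The probe runs on every call, even when overwrite is true.
        if (exists(key) && !overwrite) {
            throw new FileAlreadyExistsException(key + " already exists");
        }
        // Not atomic with the check: another writer could create the key
        // between the probe and this put.
        objects.put(key, data);
    }

    public byte[] read(String key) {
        return objects.get(key);
    }

    public int statusLookups() {
        return statusLookups.get();
    }

    public static void main(String[] args) throws IOException {
        CreateCheckSketch store = new CreateCheckSketch();
        store.create("a", new byte[] {1}, false);
        try {
            store.create("a", new byte[] {2}, false);
        } catch (FileAlreadyExistsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        store.create("a", new byte[] {3}, true);
        System.out.println("lookups before writes: " + store.statusLookups());
    }
}
```

The point of the sketch is only that every `create` call pays for one read first, and that the read and the write are two separate steps, so the check cannot rule out a concurrent creator.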
---