GitHub user NicoK commented on the issue:

    https://github.com/apache/flink/pull/5624
  
    That's actually the check for whether or not to overwrite the file. Let me 
paste the whole method here to give some more context:
    ```
      public FSDataOutputStream create(Path f, FsPermission permission,
          boolean overwrite, int bufferSize, short replication, long blockSize,
          Progressable progress) throws IOException {
        String key = pathToKey(f);
        S3AFileStatus status = null;
        try {
          // get the status or throw an FNFE
          status = getFileStatus(f);
    
          // if the thread reaches here, there is something at the path
          if (status.isDirectory()) {
            // path references a directory: automatic error
            throw new FileAlreadyExistsException(f + " is a directory");
          }
          if (!overwrite) {
            // path references a file and overwrite is disabled
            throw new FileAlreadyExistsException(f + " already exists");
          }
          LOG.debug("Overwriting file {}", f);
        } catch (FileNotFoundException e) {
          // this means the file is not found
    
        }
        instrumentation.fileCreated();
        FSDataOutputStream output;
        if (blockUploadEnabled) {
          output = new FSDataOutputStream(
              new S3ABlockOutputStream(this,
                  key,
                  new SemaphoredDelegatingExecutor(boundedThreadPool,
                      blockOutputActiveBlocks, true),
                  progress,
                  partSize,
                  blockFactory,
                  instrumentation.newOutputStreamStatistics(statistics),
                  new WriteOperationHelper(key)
              ),
              null);
        } else {
    
          // We pass null to FSDataOutputStream so it won't count writes that
          // are being buffered to a file
          output = new FSDataOutputStream(
              new S3AOutputStream(getConf(),
                  this,
                  key,
                  progress
              ),
              null);
        }
        return output;
      }
    ```
    
    However, the `getFileStatus(f)` call at the top of this method is the 
read-before-write that is causing the problem.
    (Of course, some other concurrent job could still create the file between 
this check and the actual write, so the check does not guard against much, but 
there is not much we can do about that.)
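
To make the two points above concrete, here is a minimal sketch with a toy in-memory store. It is not S3A code: `ToyStore`, `head`, `put` and the `betweenCheckAndWrite` hook are names made up for illustration, and the hook only simulates a concurrent writer. It shows that the existence check issues a read before every write, and that a writer sneaking in after the check is not detected.

```java
import java.io.FileNotFoundException;
import java.nio.file.FileAlreadyExistsException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReadBeforeWriteSketch {

    /** Hypothetical in-memory stand-in for an object store; logs every request. */
    static class ToyStore {
        final Map<String, String> objects = new HashMap<>();
        final List<String> requestLog = new ArrayList<>();

        /** Rough analogue of getFileStatus(f): a read that throws if the key is absent. */
        String head(String key) throws FileNotFoundException {
            requestLog.add("HEAD " + key);
            String value = objects.get(key);
            if (value == null) {
                throw new FileNotFoundException(key);
            }
            return value;
        }

        void put(String key, String value) {
            requestLog.add("PUT " + key);
            objects.put(key, value);
        }
    }

    /**
     * Same check-then-write shape as the create() method quoted above.
     * betweenCheckAndWrite is an illustrative hook that runs after the
     * existence check and before the write, to simulate another job.
     */
    static void create(ToyStore store, String key, String value,
                       boolean overwrite, Runnable betweenCheckAndWrite)
            throws FileAlreadyExistsException {
        try {
            store.head(key); // the read-before-write
            if (!overwrite) {
                throw new FileAlreadyExistsException(key + " already exists");
            }
        } catch (FileNotFoundException e) {
            // nothing at the key yet, so creating it is fine
        }
        betweenCheckAndWrite.run();
        store.put(key, value);
    }

    public static void main(String[] args) throws Exception {
        ToyStore store = new ToyStore();
        // A simulated concurrent job creates the key right after our check passed.
        create(store, "a/b", "ours", false, () -> store.put("a/b", "theirs"));
        System.out.println(store.requestLog);
        System.out.println(store.objects.get("a/b"));
    }
}
```

In this sketch the request log starts with a `HEAD` on a key that does not exist yet, and the final value is `ours` even though `overwrite` was `false`: the other writer's object is silently replaced.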

