gaborgsomogyi commented on code in PR #21458:
URL: https://github.com/apache/flink/pull/21458#discussion_r1724904768


##########
flink-filesystems/flink-s3-fs-base/src/main/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStream.java:
##########
@@ -126,7 +126,16 @@ public long getPos() throws IOException {
 
     @Override
     public void sync() throws IOException {
-        fileStream.sync();

Review Comment:
   > This change will not alter any processing guarantee.
   
   I can say 100% sure that we're having issue with this now.  The reason is 
the following:
   * `sync` method is defined in `FSDataOutputStream` with the following 
definition:
   ```
       /**
        * Flushes the data all the way to the persistent non-volatile storage 
(for example disks). The
        * method behaves similar to the <i>fsync</i> function, forcing all data 
to be persistent on the
        * devices.
        *
        * @throws IOException Thrown if an I/O error occurs
        */
   ```
   * In case `sync` method call user of the writer instance is expected to call 
further `write` methods
   * What is actually happening it's blowing up the next write with the 
following exception:
   ```
   java.io.IOException: Stream closed.
   
        at 
org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:72)
        at 
org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52)
        at 
org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104)
        at 
org.apache.flink.core.fs.RefCountedBufferingFileStream.write(RefCountedBufferingFileStream.java:87)
        at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.write(S3RecoverableFsDataOutputStream.java:112)
        at java.base/java.io.OutputStream.write(OutputStream.java:122)
   ```
   * This can be super easily tested with 
`S3RecoverableFsDataOutputStreamTest.testSync`. Please remove the `expected = 
Exception.class` from the beginning of the test.
   * The following line in the test is testing nothing because never ever 
called: 
https://github.com/apache/flink/blob/56c81995d3b34ed9066b6771755407b93438f5ab/flink-filesystems/flink-s3-fs-base/src/test/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStreamTest.java#L264
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to