gaborgsomogyi commented on code in PR #21458: URL: https://github.com/apache/flink/pull/21458#discussion_r1724904768
########## flink-filesystems/flink-s3-fs-base/src/main/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStream.java: ########## @@ -126,7 +126,16 @@ public long getPos() throws IOException { @Override public void sync() throws IOException { - fileStream.sync(); Review Comment: > This change will not alter any processing guarantee. I can say 100% sure that we're having issue with this now. The reason is the following: * `sync` method is defined in `FSDataOutputStream` with the following definition: ``` /** * Flushes the data all the way to the persistent non-volatile storage (for example disks). The * method behaves similar to the <i>fsync</i> function, forcing all data to be persistent on the * devices. * * @throws IOException Thrown if an I/O error occurs */ ``` * In case `sync` method call user of the writer instance is expected to call further `write` methods * What is actually happening it's blowing up the next write with the following exception: ``` java.io.IOException: Stream closed. at org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:72) at org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52) at org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104) at org.apache.flink.core.fs.RefCountedBufferingFileStream.write(RefCountedBufferingFileStream.java:87) at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.write(S3RecoverableFsDataOutputStream.java:112) at java.base/java.io.OutputStream.write(OutputStream.java:122) ``` * This can be super easily tested with `S3RecoverableFsDataOutputStreamTest.testSync`. Please remove the `expected = Exception.class` from the beginning of the test. * The following line in the test is testing nothing because never ever called. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org