Caizhi Weng created FLINK-24728:
-----------------------------------

             Summary: Batch SQL file sink forgets to close the output stream
                 Key: FLINK-24728
                 URL: https://issues.apache.org/jira/browse/FLINK-24728
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Runtime
    Affects Versions: 1.11.5, 1.12.6, 1.15.0, 1.14.1, 1.13.4
            Reporter: Caizhi Weng
             Fix For: 1.15.0, 1.14.1, 1.13.4


I tried to write a large Avro file to HDFS and discovered that the file size 
displayed in HDFS is extremely small, although copying the file to the local 
file system yields the correct size. If we create another Flink job to read 
that Avro file from HDFS, the job finishes without outputting any records, 
because the file size Flink gets from HDFS is that very small size.

This is because the output format created in 
{{FileSystemTableSink#createBulkWriterOutputFormat}} only finishes the 
{{BulkWriter}}. According to the Javadoc of {{BulkWriter#finish}}, bulk writers 
should not close the output stream and should leave that to the framework. As a 
result, nothing closes the output stream in this code path.
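
To illustrate the contract described above, here is a minimal, self-contained 
sketch in plain Java. It does not use Flink's real classes: 
{{SimpleBulkWriter}}, {{LineWriter}}, {{TrackingStream}} and both 
{{write...}} methods are hypothetical names made up for this example, and the 
actual fix in {{FileSystemTableSink}} may look different. It only shows that a 
caller which finishes the writer must also close the stream itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class CloseAfterFinishSketch {

    /** Hypothetical stand-in for the BulkWriter contract; not Flink's interface. */
    interface SimpleBulkWriter<T> {
        void addElement(T element) throws IOException;

        /** Flushes pending data but, by contract, does not close the stream. */
        void finish() throws IOException;
    }

    /** In-memory stream that records whether close() was ever called. */
    static class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Hypothetical writer that emits one line per record. */
    static class LineWriter implements SimpleBulkWriter<String> {
        private final OutputStream out;

        LineWriter(OutputStream out) {
            this.out = out;
        }

        @Override
        public void addElement(String element) throws IOException {
            out.write((element + "\n").getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void finish() throws IOException {
            out.flush(); // flush only; closing is the caller's job
        }
    }

    /** Shape of the reported bug: the writer is finished, the stream is left open. */
    static TrackingStream writeWithoutClose(String... records) throws IOException {
        TrackingStream stream = new TrackingStream();
        LineWriter writer = new LineWriter(stream);
        for (String record : records) {
            writer.addElement(record);
        }
        writer.finish();
        return stream;
    }

    /** Shape of a fix: finish the writer, then always close the stream. */
    static TrackingStream writeAndClose(String... records) throws IOException {
        TrackingStream stream = new TrackingStream();
        try {
            LineWriter writer = new LineWriter(stream);
            for (String record : records) {
                writer.addElement(record);
            }
            writer.finish();
        } finally {
            stream.close();
        }
        return stream;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeWithoutClose("a", "b").closed); // prints "false"
        System.out.println(writeAndClose("a", "b").closed);     // prints "true"
    }
}
```

With an in-memory stream the missing {{close()}} is harmless, but on HDFS the 
reported symptom is that the displayed file length stays very small.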



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
