Caizhi Weng created FLINK-24728:
-----------------------------------

             Summary: Batch SQL file sink forgets to close the output stream
                 Key: FLINK-24728
                 URL: https://issues.apache.org/jira/browse/FLINK-24728
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / Runtime
    Affects Versions: 1.11.5, 1.12.6, 1.15.0, 1.14.1, 1.13.4
            Reporter: Caizhi Weng
             Fix For: 1.15.0, 1.14.1, 1.13.4

I tried to write a large Avro file to HDFS and discovered that the file size displayed in HDFS is extremely small, although copying the file to the local file system yields the correct size. If we create another Flink job that reads this Avro file from HDFS, the job finishes without outputting any records, because the file size Flink gets from HDFS is the very small one.

This is because the output format created in {{FileSystemTableSink#createBulkWriterOutputFormat}} only finishes the {{BulkWriter}} and never closes the underlying output stream. According to the Javadoc of {{BulkWriter#finish}}, bulk writers should not close the output stream themselves and should leave that to the framework.
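
To illustrate the contract described above, here is a minimal, self-contained sketch of the caller-side pattern: finish the writer first, then close the stream it wrote to. This is not Flink's code and not the actual fix. {{SimpleBulkWriter}}, {{LineWriter}}, {{TrackingStream}} and {{writeAndClose}} are hypothetical names made up for this example; only the three method names ({{addElement}}, {{flush}}, {{finish}}) are taken from Flink's {{BulkWriter}} interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class CloseAfterFinishSketch {

    /** Hypothetical stand-in mirroring the method names of Flink's BulkWriter. */
    interface SimpleBulkWriter<T> {
        void addElement(T element) throws IOException;
        void flush() throws IOException;
        /** Writes buffered data and footers, but must NOT close the stream. */
        void finish() throws IOException;
    }

    /** Toy writer: buffers lines in memory and emits them on flush/finish. */
    static final class LineWriter implements SimpleBulkWriter<String> {
        private final OutputStream out;
        private final StringBuilder buffer = new StringBuilder();

        LineWriter(OutputStream out) {
            this.out = out;
        }

        @Override
        public void addElement(String element) {
            buffer.append(element).append('\n');
        }

        @Override
        public void flush() throws IOException {
            out.write(buffer.toString().getBytes(StandardCharsets.UTF_8));
            buffer.setLength(0);
            out.flush();
        }

        @Override
        public void finish() throws IOException {
            flush(); // finish encoding; the stream stays open for the caller
        }
    }

    /** In-memory stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Caller-side pattern: finish the writer, then close the stream. */
    static void writeAndClose(List<String> records, TrackingStream stream) {
        // try-with-resources closes the stream even if finish() throws
        try (OutputStream toClose = stream) {
            SimpleBulkWriter<String> writer = new LineWriter(stream);
            for (String record : records) {
                writer.addElement(record);
            }
            writer.finish();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TrackingStream stream = new TrackingStream();
        writeAndClose(List.of("a", "b"), stream);
        System.out.println("closed=" + stream.closed + ", bytes=" + stream.size());
    }
}
```

In this sketch, dropping the try-with-resources block reproduces the shape of the bug: {{finish()}} still succeeds, but {{close()}} is never called on the stream, so nothing tells the file system that the file is complete.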