Hi, flinkers! I'm new to this whole thing, and it seems to me that ' org.apache.flink.streaming.api.datastream.DataStream.writeAsCsv(String, WriteMode, long)' does not work properly. To be specific, data were not flushed by update frequency when write to HDFS.
what make it more disturbing is that, if I check the content with 'hdfs dfs -cat xxx', sometimes I got partial records. I did a little digging in flink-0.9.1. And it turns out all 'org.apache.flink.streaming.api.functions.sink.FileSinkFunction.invoke(IN)' does is pushing data to 'org.apache.flink.runtime.fs.hdfs.HadoopDataOutputStream' which is a delegate of 'org.apache.hadoop.fs.FSDataOutputStream'. In this scenario, 'org.apache.hadoop.fs.FSDataOutputStream' is never flushed. Which result in data being held in local buffer, and 'hdfs dfs -cat xxx' might return partial records. Does 'DataStream.writeAsCsv' suppose to work like this? Or I messed up somewhere? Best regards and thanks for your time! Rex