The behavior of BucketingSink is not exactly what we want. If I understand correctly, when a checkpoint is requested, BucketingSink flushes its writer to make sure no data is lost, but it neither closes the current file nor rolls a new file after the checkpoint. In the case of HDFS, if the file length is not updated on the NameNode (by closing the file, or by explicitly updating the length), MapReduce and other data analysis tools will not see the new data. This is not what we want. I would also like to open a new file for each checkpoint period so that the HDFS file is made persistent, because we have hit some bugs in the flush/append HDFS use case.
Is there any way to make BucketingSink roll a file on each checkpoint? Thanks in advance. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
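For context, here is roughly how our sink is set up today. This is only a sketch: the path, port, and intervals are placeholders, and the time-based rollover (setBatchRolloverInterval, which I believe was added in Flink 1.6) is driven by wall-clock time, not by checkpoint completion, so it only approximates the per-checkpoint rolling we actually want:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

public class RollingSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000L); // checkpoint every minute

        // Placeholder source for the sketch.
        DataStream<String> input = env.socketTextStream("localhost", 9999);

        // Placeholder HDFS path.
        BucketingSink<String> sink = new BucketingSink<>("hdfs:///data/output");
        sink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd--HH"));
        sink.setBatchSize(128L * 1024L * 1024L); // size-based roll at 128 MB

        // Rolls on elapsed wall-clock time, not on checkpoints, so a file
        // can still span several checkpoints (or roll mid-checkpoint).
        sink.setBatchRolloverInterval(60_000L);

        input.addSink(sink);
        env.execute("bucketing-sink-roll-sketch");
    }
}
```

Even with the rollover interval set to match the checkpoint interval, the two timers are independent, which is why I am asking whether rolling can be tied to the checkpoint itself.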