Thank you, Fabian! The HDFS small-file problem can be avoided with a large checkpoint interval.
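For reference, a larger checkpoint interval can be set like this (a minimal sketch; the 10-minute value is just an illustrative choice, not a recommendation from this thread):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointIntervalExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpoint every 10 minutes (600,000 ms). With BucketingSink,
        // in-progress files are finalized on checkpoint completion, so a
        // longer interval yields fewer, larger files on HDFS.
        env.enableCheckpointing(600_000L);
    }
}
```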
Meanwhile, there is a potential data-loss problem in the current BucketingSink. Say we consume data from Kafka: when a checkpoint is triggered, the Kafka offset is updated, but the in-progress file in the BucketingSink remains open. If Flink crashes after that, is the data in the in-progress file lost? Am I right?

-- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/