Hi Bill,
I wrote those two Medium posts you mentioned above, but clearly the
techlab one is much better.
I would suggest just "close the file when checkpointing", which is the
easiest way. If you use BucketingSink, you can modify the code to make it
work: just replace the code from line 691 to 693.
Hi,
Since you said BucketingSink, I think it may be related to your bucketer.
Let's say you bucket by hour. In your stream, at a given moment, your
records' timestamps range from hour 00 to hour 23, which means your task
needs 24 writers, one dedicated to each bucket. If you have 4 task slots
in a task manager, that is 4 x 24 = 96 open writers on that machine.
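The writer-count arithmetic above can be sketched as follows (a standalone illustration, not Flink code; the map stands in for the per-bucket writers a BucketingSink subtask holds open):

```java
import java.util.HashMap;
import java.util.Map;

// Why an hourly bucketer can keep many writers open: each distinct
// bucket id (here, the hour of the record timestamp) needs its own
// open writer inside a single subtask.
public class HourlyBucketDemo {
    private final Map<Integer, StringBuilder> writers = new HashMap<>();

    void write(long epochMillis, String record) {
        int hour = (int) ((epochMillis / 3_600_000L) % 24); // bucket id
        writers.computeIfAbsent(hour, h -> new StringBuilder()).append(record);
    }

    int openWriters() {
        return writers.size();
    }

    public static void main(String[] args) {
        HourlyBucketDemo subtask = new HourlyBucketDemo();
        for (int h = 0; h < 24; h++) {          // timestamps spread over 24 hours
            subtask.write(h * 3_600_000L, "r");
        }
        // one subtask ends up with 24 open writers; with 4 task slots
        // per task manager that is 4 * 24 = 96 writers on one machine
        if (subtask.openWriters() != 24) throw new AssertionError();
        System.out.println(subtask.openWriters()); // 24
    }
}
```

A bucketer that produces fewer simultaneously active buckets (or a cap on open writers) keeps this number down.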
Hi Juan,
We modified the Flink code a little bit to change the Flink checkpoint
structure so we can easily identify which checkpoint is which.
You can read my note or the PR:
https://medium.com/hadoop-noob/flink-externalized-checkpoint-eb86e693cfed
https://github.com/BranchMetrics/flink/pull/6/files
Hope it helps.