Hi, how are you printing the debug statements? But yeah all the logic of renaming in progress files and cleaning up after a failed job happens in restoreState(BucketState state). The steps are roughly these:
1. Move current in-progress file to final location 2. truncate the file if necessary (if truncate is not available write a .valid-length file) 3. Move pending files to final location that where part of the checkpoint 4. cleanup any leftover pending/in-progress files Cheers, Aljoscha > On 22 Mar 2016, at 10:08, Vijay Srinivasaraghavan > <vijikar...@yahoo.com.INVALID> wrote: > > Hello, > I have enabled checkpoint and I am using RollingSink to sink the data to HDFS > (2.7.x) from KafkaConsumer. To simulate failover/recovery, I stopped > TaskManager and the job gets rescheduled to other Taskmanager instance. > During this momemnt, the current "in-progress" gets closed and renamed to > part-0-1 from _part-0-1_in-progress. > I was hoping to see the debug statement that I have added to "restoreState" > method but none of my debug statement gets printed. I am not sure if the > restoreState() method gets invoked during this scenario. Could you please > help me understand the flow during "failover" scenario? > P.S: Functionally the code appears to be working fine but I am trying to > understand the underlying implementation details. public void > restoreState(BucketState state) > Regards > Vijay