Re: Exactly Once Guarantees with StreamingFileSink to S3

2019-02-07 Thread Kostas Kloudas
No problem!
On Wed, Feb 6, 2019 at 6:38 PM Kaustubh Rudrawar wrote:
> Hi Kostas,
>
> Thanks for the response! Yes - I see the commitAfterRecovery being called
> when a Bucket is restored. I confused myself in thinking that
> 'onSuccessfulCompletionOfCheckpoint' is called on restore as well, whic

Re: Exactly Once Guarantees with StreamingFileSink to S3

2019-02-06 Thread Kaustubh Rudrawar
Hi Kostas, Thanks for the response! Yes - I see the commitAfterRecovery being called when a Bucket is restored. I confused myself in thinking that 'onSuccessfulCompletionOfCheckpoint' is called on restore as well, which led me to believe that we were only calling commit and not commitAfterRecovery

Re: Exactly Once Guarantees with StreamingFileSink to S3

2019-02-06 Thread Kostas Kloudas
Hi Kaustubh, Your general understanding is correct. In this case though, the sink will call the S3Committer#commitAfterRecovery() method. After failing to commit the MPU, this method checks whether the file is there and whether its length is correct, and if everything is ok (which is the case in yo
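
A rough sketch of the recovery check described in this reply, written in Python rather than against the real Flink or S3 APIs. Everything in it is hypothetical: `FakeObjectStore`, `commit_after_recovery`, and the return strings are illustrative names, and treating a present file of the right length as "already committed" is an assumption about where the truncated sentence above is heading.

```python
class FakeObjectStore:
    """Hypothetical in-memory stand-in for S3. Not a real client API."""

    def __init__(self):
        self.objects = {}          # key -> bytes of completed objects
        self.pending_uploads = {}  # upload_id -> (key, bytes) not yet completed

    def start_upload(self, upload_id, key, data):
        # Register a multipart upload (MPU) that has not been completed yet.
        self.pending_uploads[upload_id] = (key, data)

    def complete_upload(self, upload_id):
        # Completing an unknown or already completed MPU fails.
        if upload_id not in self.pending_uploads:
            raise KeyError("no such pending upload: " + upload_id)
        key, data = self.pending_uploads.pop(upload_id)
        self.objects[key] = data

    def object_length(self, key):
        # Length of a completed object, or None if it does not exist.
        data = self.objects.get(key)
        return None if data is None else len(data)


def commit_after_recovery(store, upload_id, key, expected_length):
    """Try to commit the MPU; on failure, fall back to checking the object.

    Assumption: if the object exists with the expected length, the commit
    already happened before the failure, so nothing more needs to be done.
    """
    try:
        store.complete_upload(upload_id)
        return "committed"
    except KeyError:
        actual = store.object_length(key)
        if actual is None:
            raise IOError("commit failed and object is missing: " + key)
        if actual != expected_length:
            raise IOError("commit failed and object length does not match")
        return "already-committed"
```

For example, if the upload for `part-0-0` was completed just before a failure, calling `commit_after_recovery` on restore fails the second commit attempt, finds the object with the expected length, and returns `"already-committed"` instead of raising.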

Exactly Once Guarantees with StreamingFileSink to S3

2019-02-05 Thread Kaustubh Rudrawar
Hi, I'm trying to understand the exactly once semantics of the StreamingFileSink with S3 in Flink 1.7.1 and am a bit confused about how it guarantees exactly once under a very specific failure scenario. For simplicity, let's say we will roll the current part file on checkpoint (and only on checkpoint