Hi,
Yes, none of the exactly-once/at-least-once guarantees hold if checkpointing 
does not work correctly. This is not specific to the BucketingSink: the same 
applies, for example, when reading from Kafka and writing to Kafka.
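
For the manual rename of pending files discussed further down in this thread, here is a minimal sketch. It assumes the output folder is reachable as a local (or mounted) path and that the sink uses a `_` prefix and a `.pending` suffix for pending files, which I believe are the BucketingSink defaults; adjust both if your job configures them differently. The function name and arguments are hypothetical, and it should only be run on files you have verified to be complete.

```python
import os


def finalize_pending(root, prefix="_", suffix=".pending", dry_run=True):
    """Rename pending part files under `root` to their final names.

    Assumption: pending files are named <prefix><final-name><suffix>.
    Returns a list of (old_path, new_path) pairs. With dry_run=True,
    nothing is renamed and the list shows what would be done.
    """
    renamed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not (name.startswith(prefix) and name.endswith(suffix)):
                continue  # not a pending file (e.g. in-progress or final)
            final = name[len(prefix):len(name) - len(suffix)]
            if not final:
                continue  # nothing left after stripping prefix and suffix
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, final)
            if os.path.exists(dst):
                continue  # never overwrite an existing final file
            if not dry_run:
                os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

Calling `finalize_pending("/path/to/output")` first with the default `dry_run=True` lets you review the planned renames before applying them with `dry_run=False`. In-progress files are left untouched on purpose, since they may contain partially written records.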

Best,
Aljoscha 
> On 28. Apr 2017, at 15:53, Yassine MARZOUGUI <y.marzou...@mindlytix.com> 
> wrote:
> 
> Hi Aljoscha,
> 
> Thank you for your response. I guess then I will manually rename the pending 
> files. Does this however mean that the BucketingSink is not exactly-once as 
> it is described in the docs, since in this case (failure of the job and 
> failure of checkpoints) there will be duplicates? Or am I missing something 
> in the notion of exactly-once guarantees?
> 
> Best,
> Yassine
> 
> 2017-04-28 15:47 GMT+02:00 Aljoscha Krettek <aljos...@apache.org>:
> Hi,
> Yes, your analysis is correct. The pending files are not recognised as such 
> because they were never in any checkpointed state that could be restored. I’m 
> afraid it’s not possible to build the sink state just from the files existing 
> in the output folder. The reason we have state in the first place is so that 
> we can figure out what each of the files in the output folder is.
> 
> Maybe you could manually move the pending files that you know are correct to 
> “final”?
> 
> Best,
> Aljoscha
> 
>> On 28. Apr 2017, at 11:22, Yassine MARZOUGUI <y.marzou...@mindlytix.com> 
>> wrote:
>> 
>> Hi all,
>> 
>> I have a failed job containing a BucketingSink. The last successful 
>> checkpoint was before the source started emitting data. The following 
>> checkpoints all failed due to the long timeout, as I mentioned here: 
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoints-very-slow-with-high-backpressure-td12762.html
>> 
>> The TaskManager then failed. Upon recovery, the pending files did not 
>> move to the finished state. 
>> 
>> Is that because the sink was not able to checkpoint the list of pending files?
>> Is it possible to build the sink state just from the output folder and the 
>> suffixes of the files?
>> 
>> Thanks,
>> Yassine
> 
> 
