Forgot to cc Kostas

On 23/04/2020 12:11, Eyal Pe'er wrote:
>
> Hi all,
> I am using Flink streaming with the Kafka consumer connector
> (FlinkKafkaConsumer) and the file sink (StreamingFileSink) in cluster
> mode with an exactly-once policy.
>
> The file sink writes the files to the local disk.
>
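> For reference, a stripped-down sketch of the setup (Java). This is
> simplified; the topic, group id and output path below are placeholders,
> not our real configuration:
>
> import org.apache.flink.api.common.serialization.SimpleStringEncoder;
> import org.apache.flink.api.common.serialization.SimpleStringSchema;
> import org.apache.flink.core.fs.Path;
> import org.apache.flink.streaming.api.CheckpointingMode;
> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
> import java.util.Properties;
>
> public class KafkaToLocalFilesJob {
>     public static void main(String[] args) throws Exception {
>         StreamExecutionEnvironment env =
>                 StreamExecutionEnvironment.getExecutionEnvironment();
>         // Part files stay as hidden in-progress/pending files until a
>         // checkpoint completes, then they are finalized.
>         env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);
>
>         Properties props = new Properties();
>         props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder
>         props.setProperty("group.id", "my-group");            // placeholder
>
>         FlinkKafkaConsumer<String> source = new FlinkKafkaConsumer<>(
>                 "my-topic", new SimpleStringSchema(), props);
>
>         StreamingFileSink<String> sink = StreamingFileSink
>                 .forRowFormat(new Path("file:///data/flink-out"),
>                               new SimpleStringEncoder<String>("UTF-8"))
>                 .build();
>
>         env.addSource(source).addSink(sink);
>         env.execute("kafka-to-local-files");
>     }
> }
>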
> I’ve noticed that if a job fails and automatic restart is on, the task
> managers look for the leftover (hidden) files from the last failed
> job.
>
> Obviously, since the tasks can be assigned to different task managers
> on restart, this leads to more failures, over and over again.
>
> The only solution I found so far is to delete the hidden files and
> resubmit the job.
>
> If I get it right (and please correct me if I’m wrong), the events in
> the hidden files were not committed to the bootstrap server (Kafka), so
> there is no data loss.
>
>  
>
> Is there a way to force Flink to ignore the files that were already
> written? Or maybe there is a better way to implement this (perhaps
> somehow with savepoints)?
>
>  
>
> Best regards
>
> Eyal Peer
>
>  
>
