We are on 1.8 as of now; we will give "stop with savepoint" a try once we upgrade. I am trying to cancel the job with a savepoint and then restore it again.
I think there is an issue with how our S3 lifecycle is configured. Thank you for your help.

On Sun, Aug 18, 2019 at 8:10 AM Stephan Ewen <se...@apache.org> wrote:

> My first guess would also be the same as Rafi's: The lifetime of the MPU
> part files is too low for that use case.
>
> Maybe this can help:
>
>   - If you want to stop a job with a savepoint and plan to restore later
>     from it (possibly much later, so that the MPU part lifetime might be
>     exceeded), then I would recommend using Flink 1.9's new "stop with
>     savepoint" feature. That should finalize in-flight uploads and make
>     sure no lingering part files exist.
>
>   - If you take a savepoint out of a running job to start a new job, you
>     probably need to configure the sink differently anyway, to not
>     interfere with the running job. In that case, I would suggest changing
>     the name of the sink (the operator uid) such that the new job's sink
>     doesn't try to resume (and interfere with) the running job's sink.
>
> Best,
> Stephan
>
> On Sat, Aug 17, 2019 at 11:23 PM Rafi Aroch <rafi.ar...@gmail.com> wrote:
>
>> Hi,
>>
>> S3 would delete files only if you have 'lifecycle rules' [1] defined on
>> the bucket. Could that be the case? If so, make sure to disable / extend
>> the object expiration period.
>>
>> [1] https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
>>
>> Thanks,
>> Rafi
>>
>> On Sat, Aug 17, 2019 at 1:48 AM Oytun Tez <oy...@motaword.com> wrote:
>>
>>> Hi Swapnil,
>>>
>>> I am not familiar with the StreamingFileSink, however, this sounds like
>>> a checkpointing issue to me. FileSink should keep its sink state, and
>>> remove from the state the files that it *really successfully* sinks
>>> (perhaps you may want to add a validation here with S3 to check file
>>> integrity). This leaves us in the state with the failed files, partial
>>> files etc.
>>>
>>> ---
>>> Oytun Tez
>>>
>>> *M O T A W O R D*
>>> The World's Fastest Human Translation Platform.
>>> oy...@motaword.com - www.motaword.com
>>>
>>> On Fri, Aug 16, 2019 at 6:02 PM Swapnil Kumar <swku...@zendesk.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> We are using Flink to process input events, aggregate them, and write
>>>> the output of our streaming job to S3 using StreamingFileSink, but
>>>> whenever we try to restore the job from a savepoint, the restoration
>>>> fails with a missing part files error. As per my understanding, S3
>>>> deletes those part (intermittent) files, so they can no longer be
>>>> found on S3. Is there a workaround for this, so that we can use S3 as
>>>> a sink?
>>>>
>>>> --
>>>> Thanks,
>>>> Swapnil Kumar
>>>>
>>>

--
Thanks,
Swapnil Kumar
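
On the lifecycle point raised in this thread, here is a minimal sketch in Python of what a more lenient bucket rule could look like. It assumes the rule removing the part files is an "abort incomplete multipart upload" rule, which the thread does not confirm. The function name, rule ID, prefix, bucket name, and 30-day window are hypothetical and only for illustration. The sketch only builds the configuration; the call that would apply it is left as a comment because it needs boto3 and AWS credentials.

```python
from typing import Any, Dict


def build_mpu_lifecycle(prefix: str, days: int) -> Dict[str, Any]:
    """Build an S3 lifecycle configuration that aborts incomplete
    multipart uploads under `prefix` only after `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-mpu",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Keep in-flight part uploads around long enough to cover
                # the longest expected gap between savepoint and restore.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


# Hypothetical prefix and retention window.
config = build_mpu_lifecycle("flink-output/", 30)

# Applying it would look roughly like this (bucket name is a placeholder):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=config)
```

Note that put_bucket_lifecycle_configuration replaces the bucket's whole lifecycle configuration, so any existing rules would need to be merged into the "Rules" list before applying it.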