My first guess would also be the same as Rafi's: the lifetime of the MPU
part files is too low for that use case.

Maybe this can help:

  - If you want to stop a job with a savepoint and plan to restore from it
later (possibly much later, so that the MPU part lifetime might be
exceeded), then I would recommend using Flink 1.9's new "stop with
savepoint" feature. That should finalize in-flight uploads and make sure no
lingering part files exist.
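(If I remember correctly, the 1.9 CLI call for that is "bin/flink stop -p
<savepoint-dir> <job-id>", as opposed to the older "cancel -s", which does
not give the same guarantee about finalizing the sink.)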

  - If you take a savepoint out of a running job to start a new job, you
probably need to configure the sink differently anyway, so that it does not
interfere with the running job. In that case, I would suggest changing the
name of the sink (the operator uid) so that the new job's sink doesn't try
to resume (and interfere with) the running job's sink, as sketched below.
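
For illustration, here is a minimal sketch of what I mean by changing the
operator uid (1.9 DataStream API; the bucket path, uid, and method name are
made up for the example):

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    // Wire the S3 sink into the new job under a different operator uid, so
    // that restoring from the savepoint does not pick up the state of the
    // running job's sink (pending multipart uploads, in-progress files).
    static void addS3Sink(DataStream<String> events) {
        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("s3://my-bucket/output"),
                              new SimpleStringEncoder<String>("UTF-8"))
                .build();

        events.addSink(sink)
              .uid("s3-file-sink-v2")   // changed uid, the old one stays with the running job
              .name("S3 StreamingFileSink (new job)");
    }

If I am not mistaken, you then also need to restore the new job with
--allowNonRestoredState, because the savepoint still contains state for the
old sink uid.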

Best,
Stephan



On Sat, Aug 17, 2019 at 11:23 PM Rafi Aroch <rafi.ar...@gmail.com> wrote:

> Hi,
>
> S3 would delete files only if you have 'lifecycle rules' [1] defined on
> the bucket. Could that be the case? If so, make sure to disable / extend
> the object expiration period.
>
> [1]
> https://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html
>
> Thanks,
> Rafi
>
>
> On Sat, Aug 17, 2019 at 1:48 AM Oytun Tez <oy...@motaword.com> wrote:
>
>> Hi Swapnil,
>>
>> I am not familiar with the StreamingFileSink, however, this sounds like a
>> checkpointing issue to me. The FileSink should keep its sink state, and
>> remove from the state the files that it *really successfully* sinks
>> (perhaps you may want to add a validation here with S3 to check file
>> integrity). This leaves us in the state with the failed files, partial
>> files, etc.
>>
>>
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Translation Platform.
>> oy...@motaword.com — www.motaword.com
>>
>>
>> On Fri, Aug 16, 2019 at 6:02 PM Swapnil Kumar <swku...@zendesk.com>
>> wrote:
>>
>>> Hello. We are using Flink to process input events, aggregate them, and
>>> write the output of our streaming job to S3 using StreamingFileSink, but
>>> whenever we try to restore the job from a savepoint, the restoration
>>> fails with a missing part files error. As per my understanding, S3
>>> deletes those part (intermediate) files, and they can no longer be found
>>> on S3. Is there a workaround for this, so that we can use S3 as a sink?
>>>
>>> --
>>> Thanks,
>>> Swapnil Kumar
>>>
>>
