Hi Vishal,

The difference between stop-with-savepoint and
stop-with-savepoint-with-drain is that the latter emits a max watermark
before taking the snapshot. The idea is to fire all pending event-time
timers and flush the contents of buffering operators such as windows.
Semantically, you should use the first option if you want to stop the job
and resume it at a later point in time. Stop-with-savepoint-with-drain
should only be used if you want to terminate your job for good and don't
intend to resume it, because the max watermark has already fired every
pending timer and window. Results generated after resuming from such a
savepoint would no longer be correct.
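
To make the effect of the max watermark concrete, here is a minimal toy
model in Python. It is not Flink code: ToyWindowOperator,
stop_with_savepoint and run are hypothetical names made up for this sketch,
and the firing rule (window end <= watermark) is simplified. The only thing
it tries to show is that draining empties the buffered state before the
snapshot is taken, whereas a plain stop keeps the open window in the
savepoint.

```python
import heapq

# Mirrors Java's Long.MAX_VALUE, the timestamp of the final "max" watermark.
MAX_WATERMARK = 2**63 - 1


class ToyWindowOperator:
    """Toy event-time operator that buffers a running sum per tumbling window."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.buffers = {}   # window end -> running sum (the "state")
        self.timers = []    # min-heap of pending window-end timers
        self.emitted = []   # (window_end, sum) results sent downstream

    def process_element(self, timestamp, value):
        window_end = (timestamp // self.window_size + 1) * self.window_size
        if window_end not in self.buffers:
            self.buffers[window_end] = 0
            heapq.heappush(self.timers, window_end)
        self.buffers[window_end] += value

    def process_watermark(self, watermark):
        # Fire every timer whose timestamp is covered by the watermark.
        while self.timers and self.timers[0] <= watermark:
            window_end = heapq.heappop(self.timers)
            self.emitted.append((window_end, self.buffers.pop(window_end)))


def stop_with_savepoint(op, drain=False):
    """Hypothetical helper: optionally drain, then snapshot the state."""
    if drain:
        op.process_watermark(MAX_WATERMARK)
    return {"buffers": dict(op.buffers), "timers": sorted(op.timers)}


def run(drain):
    op = ToyWindowOperator(window_size=10)
    for ts, value in [(3, 1), (7, 2), (12, 5)]:
        op.process_element(ts, value)
    op.process_watermark(10)  # closes the window ending at 10
    snapshot = stop_with_savepoint(op, drain=drain)
    return op.emitted, snapshot


plain_emitted, plain_snapshot = run(drain=False)
drained_emitted, drained_snapshot = run(drain=True)
print("no drain:", plain_emitted, plain_snapshot)
print("drain:   ", drained_emitted, drained_snapshot)
```

Without drain, the window ending at 20 is still open in the snapshot, so a
resumed job can keep adding to it. With drain, that window has already been
emitted with a partial sum and the snapshot is empty, which is why resuming
from it gives wrong results.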

For the concrete problem at hand, it is difficult to say why the job does
not stop. It would be helpful if you could provide us with the debug logs of
such a run. I am also pulling in Arvid, who works on Flink's connector
ecosystem.

Cheers,
Till

On Mon, Mar 29, 2021 at 11:08 PM Vishal Santoshi <vishal.santo...@gmail.com>
wrote:

> I am more interested in whether a StreamingFileSink without a drain
> negatively affects its exactly-once semantics, given that the state in the
> SP would have the offsets from Kafka + the valid lengths of the part files
> at SP. To be honest, I am not sure whether the flushed buffers on the sink
> are included in the length, or whether this is not an issue with
> StreamingFileSink. If it is the former, then I would assume this should be
> documented, and then we have to look at why this hang happens.
>
> On Mon, Mar 29, 2021 at 4:08 PM Vishal Santoshi <vishal.santo...@gmail.com>
> wrote:
>
>> Is this a known issue? We do a stop + savepoint with drain. I see no back
>> pressure on our operators. It essentially takes an SP and then the sink
>> (StreamingFileSink to S3) just stays in the RUNNING state.
>>
>> Without drain, stop + savepoint works fine. I would imagine drain is
>> important (flushing the buffers etc.), but why does it hang? (I did it 3
>> times and waited 15 minutes each time.)
>>
>> Regards.
>>
>
