Got it. Would it be possible to add this very important note to the
documentation? Our case is the former: this is an infinite pipeline,
and we were establishing the CI/CD release process for when non-breaking
(DAG-compatible) changes are made to a running pipeline.

Regards

On Tue, Mar 30, 2021 at 8:14 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Vishal,
>
> The difference between stop-with-savepoint and
> stop-with-savepoint-with-drain is that the latter emits a max watermark
> before taking the snapshot. The idea is to trigger all pending timers and
> flush the content of some buffering operations like windowing.
> Semantically, you should use the first option if you want to stop the job
> and resume it at a later point in time. Stop-with-savepoint-with-drain
> should only be used if you want to terminate your job and don't intend to
> resume it because the max watermark destroys the correctness of results
> which are generated after the job is resumed.
>
> For the concrete problem at hand, it is difficult to say why it does not
> stop. It would be helpful if you could provide us with the debug logs of
> such a run. I am also pulling in Arvid, who works on Flink's connector
> ecosystem.
>
> Cheers,
> Till
>
> On Mon, Mar 29, 2021 at 11:08 PM Vishal Santoshi <
> vishal.santo...@gmail.com> wrote:
>
>> I am more interested in whether stopping a StreamingFileSink without a
>> drain negatively affects its exactly-once semantics, given that the state
>> in the SP would have the offsets from Kafka plus the valid lengths of the
>> part files at the SP. To be honest, I am not sure whether the flushed
>> buffers on the sink are included in the length, or whether this is not an
>> issue with StreamingFileSink. If it is the former, then I would assume
>> this should be documented, and we would then have to look at why this
>> hang happens.
>>
>> On Mon, Mar 29, 2021 at 4:08 PM Vishal Santoshi <
>> vishal.santo...@gmail.com> wrote:
>>
>>> Is this a known issue? We do a stop + savepoint with drain. I see no
>>> back pressure on our operators. It essentially takes an SP and then the
>>> sink (StreamingFileSink to S3) just stays in the RUNNING state.
>>>
>>> Without drain, stop + savepoint works fine. I would imagine drain is
>>> important (flush the buffers etc.), but why this hang? (I did it 3 times
>>> and waited 15 minutes each time.)
>>>
>>> Regards.
>>>
>>
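
A minimal sketch of the drain semantics Till describes in this thread,
assuming a toy event-time tumbling-window counter written in Python rather
than real Flink code. The names TumblingWindowCounter, run_until_stop and
resume are hypothetical and exist only for illustration; the one detail
taken from Flink is that the max watermark is the largest long value, so
it fires every pending event-time timer.

```python
MAX_WATERMARK = 2**63 - 1  # the largest long value, as used for Flink's max watermark


class TumblingWindowCounter:
    """Toy event-time tumbling-window counter (hypothetical, not Flink code)."""

    def __init__(self, size):
        self.size = size
        self.counts = {}    # window end -> number of events buffered so far
        self.emitted = []   # (window end, count) results sent downstream

    def on_event(self, ts):
        # Assign the event to the window [end - size, end).
        end = (ts // self.size + 1) * self.size
        self.counts[end] = self.counts.get(end, 0) + 1

    def on_watermark(self, wm):
        # Fire every window whose last timestamp (end - 1) is covered by wm.
        for end in sorted(e for e in self.counts if e - 1 <= wm):
            self.emitted.append((end, self.counts.pop(end)))

    def snapshot(self):
        # Stand-in for the operator state written into the savepoint.
        return dict(self.counts)


def run_until_stop(drain):
    """Process a few events, then stop with a savepoint, optionally draining."""
    op = TumblingWindowCounter(size=10)
    for ts in (1, 3, 12):
        op.on_event(ts)
    op.on_watermark(10)                 # regular watermark closes window [0, 10)
    if drain:
        op.on_watermark(MAX_WATERMARK)  # drain: flush all pending windows
    return op.emitted, op.snapshot()


def resume(snapshot, size=10):
    """Restore from the savepoint, then see one more event for window [10, 20)."""
    op = TumblingWindowCounter(size)
    op.counts = dict(snapshot)
    op.on_event(15)
    op.on_watermark(20)
    return op.emitted


if __name__ == "__main__":
    for drain in (False, True):
        emitted, snap = run_until_stop(drain)
        print(drain, emitted, snap, resume(snap))
```

Without drain, the open window [10, 20) stays in the savepoint and the
resumed run emits it once with both events, (20, 2). With drain, the same
window is emitted early as (20, 1) before the stop and again as (20, 1)
after the resume, which is the kind of incorrect result Till warns about
when a drained job is resumed.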
