Re: Suspicious watermark of operators after restore from checkpoint

Kostas Kloudas Wed, 07 Aug 2019 05:00:35 -0700

Hi Jan,

Two pointers that may help you explain the behavior are the following:

1) If you have a custom watermark generator, I do not think that Flink
checks if it emits only monotonically
increasing watermarks. This is the responsibility of the generator itself.
This means that although you operator A
is topologically before operator B, operator A may have a smaller watermark
if your watermark generator allows so.

2) Flink currently does not checkpoint the last seen watermark (
https://issues.apache.org/jira/browse/FLINK-5601).
This means that after restoring, your (event) time is assumed to be
Long.Min until the first new watermark comes.
So if you observed late data not being late anymore or sth similar, then it
may not be that the two operators have
different watermarks but that after restoring event time rolls back to the
"beginning of time".

I hope this helps,
Kostas

On Wed, Aug 7, 2019 at 12:11 PM Jan Lukavský <je...@seznam.cz> wrote:

> Hi all,
>
> I have just come across a weird state of operators after restore from
> checkpoint. After the restore, two operators that are connected (i.e.
> operator A is input of operator B) ended up with watermark of operator A
> being less than watermark of operator B. I don't know how to explain
> this. Can it be normal or does it signal a bug somewhere? If I
> understand Flink's checkpointing correctly, the checkpoint barrier flows
> from one operator to another, so the watermark should be aligned.
>
> I'm running a Beam pipeline on Flink 1.8.1.
>
> Am I missing something?
>
> Many thanks for comments,
>
>   Jan
>
>

Re: Suspicious watermark of operators after restore from checkpoint

Reply via email to