Hi Shikhar,

the currently open windows are also part of the operator state. Whenever a
window operator receives a barrier it will checkpoint the state of the user
function and additionally all uncompleted windows. This also means that the
window operator does not buffer the barriers. Once it has taken the
snapshot the barrier is sent to the downstream operators.

Thus, to answer your question, you cannot say that all events from before
the barrier are no longer held in any windowing intermediate state once a
checkpoint is completed. However, this does not matter because the
windowing intermediate state is also checkpointed so that you won't lose
elements.

Cheers,
Till

On Thu, Feb 4, 2016 at 6:39 AM, shikhar <shik...@schmizz.net> wrote:

> Flinkheads,
>
> I'm processing from a Kafka source, using event time with watermarks based
> on a threshold, and using tumbling time windows to perform some rollup. My
> sink is idempotent, and I want to ensure exactly-once processing
> end-to-end.
>
> I am trying to figure out if I can stick with memory checkpointing, and not
> bother with a checkpointing state backend or savepoints for job redeploys.
> It'd be great if I can just rely on the Kafka consumer's offset persistence
> to Zookeeper for that 'group.id' - I see that it saves the relevant offset
> to Zookeeper when a checkpoint has been triggered.
>
> However I'm concerned whether there is potential for dropping events if I
> stick with the memory checkpoints.
>
> The documentation talks of a checkpoint being triggered when the relevant
> barrier has made it all the way to the sink -- how does that interact with
> windowed streams, where some events might get buffered while later ones
> make
> it through?
>
> More concretely, when the Kafka consumer persists an offset to Zookeeper
> based on receiving a checkpoint trigger, can I trust that all events from
> before that offset are not held in any windowing intermediate state?
>
> Thanks!
>
> Shikhar
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoints-and-event-ordering-tp4664.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive
> at Nabble.com.
>

Reply via email to