Hi all!

I think we need to change the interface of the streaming source function.

The function currently has just a run() method, in which it does its work
until it is canceled.

This makes it hard to write sources whose state and snapshot barriers are
exactly aligned.
When performing a checkpoint, the vertex grabs the state from the source
and injects a checkpoint barrier. Nothing guarantees that the injected
barrier aligns with that state: the source may have emitted more records
since the state was grabbed, or it may not yet have emitted the record that
the state (the offset) already reflects.
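
To make the misalignment concrete, here is a small deterministic sketch. It
is not the actual source interface: RunStyleSource, its fields, and the
checkpoint callback are hypothetical names made up for illustration. The
callback stands in for the vertex grabbing the state and injecting a
barrier at an arbitrary moment while run() is in progress.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical run()-style source, for illustration only.
// It emits a record and then advances its offset as two separate steps.
final class RunStyleSource {
    long offset;                                   // state a checkpoint would grab
    final List<String> output = new ArrayList<>(); // stand-in for the output stream

    // 'checkpoint' simulates the vertex checkpointing at a moment the source
    // does not control; here it fires between the emit and the state update.
    void run(long limit, long checkpointAt, Runnable checkpoint) {
        while (offset < limit) {
            output.add("record:" + offset);        // record is emitted ...
            if (offset == checkpointAt) {
                checkpoint.run();                  // ... checkpoint happens here ...
            }
            offset++;                              // ... and only now the state advances
        }
    }
}
```

Calling source.run(3, 1, () -> source.output.add("barrier:" + source.offset))
yields record:0, record:1, barrier:1, record:2. The barrier carries offset 1,
although record 1 is already ahead of it in the stream, so a restore from
that checkpoint would emit record 1 a second time.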

If we change the interface to a more iterator-like one (hasNext() and
next()), the vertex calls these methods itself and can checkpoint in
between the calls.
The point right after hasNext() returns is well defined: there the state
can be grabbed and the barrier emitted.
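
A minimal sketch of what this could look like, under assumptions: only
hasNext() and next() come from the proposal, while IteratorSource,
snapshotState(), CountingSource, and SourceDriver are hypothetical names
chosen for the example. The driver plays the role of the vertex and takes
its checkpoint right after hasNext(), before the next record is pulled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical iterator-style source interface; snapshotState() is an assumption.
interface IteratorSource<T, S> {
    boolean hasNext();
    T next();
    S snapshotState(); // state to store with the checkpoint
}

// Toy source emitting 0..limit-1; its state is the offset of the next record.
final class CountingSource implements IteratorSource<Long, Long> {
    private final long limit;
    private long offset;

    CountingSource(long limit) { this.limit = limit; }

    @Override public boolean hasNext() { return offset < limit; }
    @Override public Long next() { return offset++; }
    @Override public Long snapshotState() { return offset; }
}

// Stand-in for the vertex that drives the source.
final class SourceDriver {
    // Returns the output stream as strings: "record:n" and "barrier:offset".
    static List<String> run(IteratorSource<Long, Long> source, int checkpointInterval) {
        List<String> output = new ArrayList<>();
        int sinceCheckpoint = 0;
        while (source.hasNext()) {
            // Well-defined point: no call into the source is in progress, so
            // the grabbed state matches exactly the records emitted so far.
            if (sinceCheckpoint == checkpointInterval) {
                output.add("barrier:" + source.snapshotState());
                sinceCheckpoint = 0;
            }
            output.add("record:" + source.next());
            sinceCheckpoint++;
        }
        return output;
    }
}
```

With SourceDriver.run(new CountingSource(4), 2) the output is record:0,
record:1, barrier:2, record:2, record:3. The barrier carries offset 2 and
sits exactly between record 1 and record 2, because the driver, not the
source, decides when the state is read.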


Any opinions on that?


Stephan
