I’ve seen a few mailing list posts (including this one) that say Flink 
guarantees there is no concurrent access to operator methods (e.g. flatMap, 
snapshotState, etc.) and thus synchronization isn’t needed when writing 
operators that support checkpointing. I was trying to find a place in the 
official docs where this was called out, but was coming up empty.

Is there a section of the docs that covers this topic?

Thanks!

-Joey

On Dec 18, 2019, at 9:38 PM, Zhu Zhu 
<reed...@gmail.com<mailto:reed...@gmail.com>> wrote:

[--- This email originated from outside of the organization. Do not click links 
or open attachments unless you recognize the sender and know the content is 
safe. ---]

Hi Aaron,

It is thread safe since the state snapshot happens in the same thread with the 
user function.

Thanks,
Zhu Zhu

Aaron Langford <aaron.langfor...@gmail.com<mailto:aaron.langfor...@gmail.com>> 
于2019年12月19日周四 上午11:25写道:
Hello Flink Community,

I'm hoping to verify some understanding:

If I have a function with managed state, I'm wondering if a checkpoint will 
ever be taken while a function is mutating state. I'll try to illustrate the 
situation I'm hoping to be safe from:

Happy Path:
t0 -> processFunction invoked with el1
t1 -> set A to 5
t2 -> set B to 10
t3 -> function returns

Unhappy path:
t0 -> processFunction invoked with el1
t1 -> set A to 5
t2 -> function interrupted, checkpoint taken (A = 5, B = 1)
t3 -> set B to 10
t4 -> function returns
...
tn -> flink application fails, restart from prev checkpoint (A=5, B=1)
tn+1 -> recovery begins somewhere, but state is torn anyway, so we're going to 
have a bad time

I don't think this could happen given that checkpoints effectively are messages 
in the pipeline, and the checkpoint is only taken when an operator sees the 
checkpoint barrier.

Hoping to make sure this is correct!

Aaron

Reply via email to