Hi, Till: By operator with multiple inputs, do you mean inputs from multiple subtasks?
On Tue, Nov 1, 2016 at 8:56 PM Till Rohrmann <trohrm...@apache.org> wrote: > Hi Li, > > the statement refers to operators with multiple inputs (two in this case). > With the current implementation you will indeed block one of the inputs > after receiving a checkpoint barrier n until you've received the > corresponding checkpoint barrier n on the other input as well. This is what > we call checkpoint barrier alignment. If the processing time on both input > paths is similar and thus there is no back pressure on any of the inputs, > the alignment should not take too long. In case where one of the inputs is > considerably slower than the other, you should an additional delay. > > For single input operators, you don't have to align the checkpoint > barriers. > > The checkpoint barrier alignment is not strictly necessary, but it allows > us to not having to store all in flight records from the second input which > arrive between the checkpoint barrier on the first input and the > corresponding barrier on the second input. We might change this > implementation in the future, though. > > Cheers, > Till > > On Tue, Nov 1, 2016 at 8:05 AM, Li Wang <wangli1...@gmail.com> wrote: > > Hi all, > > I have a question regarding to the state checkpoint mechanism in Flink. I > find the statement "Once the last stream has received barrier n, the > operator emits all pending outgoing records, and then emits > snapshot n barriers itself” on the document > https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html#exactly-once-vs-at-least-once > . > > Does this mean that to achieve exactly-once semantic, instead of sending > tuples downstream immediately the operator buffers its outgoing tuples in a > pending queue until the current snapshot is committed? If yes, will this > introduce significant processing delay? > > Thanks, > Li > > > -- Liu, Renjie Software Engineer, MVAD