You're right if you want to guarantee a deterministic result for an arbitrarily large allowed lateness. In the general case, you can never compute the final result of a window in finite time, because another element might always arrive later. For most practical use cases, however, you can define an upper bound on the allowed lateness and use that bound to compute your final result. Without such a bound, you will simply run out of storage capacity at some point in time, because you have to keep the window state around for any element that might still arrive late.
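To make the trade-off concrete, here is a minimal sketch in plain Java of a tumbling event-time window count with a bounded allowed lateness. It is a toy model and not Flink code: the class `BoundedLatenessWindow` and its methods are hypothetical names, and the lateness rule is simplified (a window is treated as final once the watermark reaches window end plus allowed lateness). In Flink itself, the bound is set with `allowedLateness(...)` on a windowed stream.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical, not Flink API): tumbling window counts with bounded lateness. */
final class BoundedLatenessWindow {
    private final long windowSize;
    private final long allowedLateness;
    private long watermark = Long.MIN_VALUE;

    // Per-window state, keyed by window start. With an unbounded lateness
    // this map could never be cleaned up and would grow forever.
    private final Map<Long, Long> counts = new HashMap<>();

    BoundedLatenessWindow(long windowSize, long allowedLateness) {
        this.windowSize = windowSize;
        this.allowedLateness = allowedLateness;
    }

    /** Returns true if the element was counted, false if it was dropped as too late. */
    boolean onElement(long timestamp) {
        long start = timestamp - Math.floorMod(timestamp, windowSize);
        long cleanupTime = start + windowSize + allowedLateness;
        if (watermark >= cleanupTime) {
            return false; // window already finalized, state is gone
        }
        counts.merge(start, 1L, Long::sum);
        return true;
    }

    /** Advances the watermark and returns the windows that became final (state freed). */
    Map<Long, Long> onWatermark(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
        Map<Long, Long> finalized = new HashMap<>();
        counts.entrySet().removeIf(e -> {
            boolean expired = watermark >= e.getKey() + windowSize + allowedLateness;
            if (expired) {
                finalized.put(e.getKey(), e.getValue());
            }
            return expired;
        });
        return finalized;
    }

    /** Number of windows whose state is still being kept. */
    int openWindows() {
        return counts.size();
    }
}
```

With a window size of 10 and an allowed lateness of 5, an element with timestamp 9 that arrives after watermark 12 is still counted, while an element for the same window that arrives after watermark 15 is dropped. Whether a given element lands before or after the bound still depends on when it arrives relative to the watermark, which is the non-determinism discussed in this thread; the bound only guarantees that state can be freed and a final result emitted.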
Cheers,
Till

On Mon, Nov 7, 2016 at 5:55 PM, Jaromir Vanek <vanek.jaro...@gmail.com> wrote:

> Hi Till, thank you for your answer.
>
> I am afraid defining an allowed lateness won't help. It will just change the
> problem by constant time. If we agree an element can come in arbitrary time
> after watermark (depends on the network latency), it may be assigned to the
> window or may be not if comes before/after allowed lateness period expires.
> Then element may be counted in or discarded.
>
> Still seems the results are not deterministic. In other words if I run the
> job reading from Kafka multiple times I may get different result depending
> on external conditions like network and cluster stability.
>
> Please correct me if i'm wrong.
>
> J.V.
>
> --
> View this message in context:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Deterministic-processing-with-out-of-order-streams-tp14409p14422.html
> Sent from the Apache Flink Mailing List archive at Nabble.com.