And since you asked for a pointer, Ali:
http://docs.confluent.io/current/streams/concepts.html#windowing
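
For anyone reading this in the archive, here is a rough plain-Java sketch of the idea discussed in this thread: a windowed count that picks the window from each record's own timestamp, so a late record still updates the window it belongs to. It is not the Kafka Streams API. The class `WindowedCountSketch` and its methods are hypothetical names made up for illustration, timestamps are simplified to minutes of the day, and the 10-minute window size is an arbitrary assumption. It also ignores window retention and everything else a real implementation has to handle.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Plain-Java sketch of a tumbling-window count keyed by event time.
 * NOT the Kafka Streams API; all names here are hypothetical.
 */
public class WindowedCountSketch {
    private final long windowSizeMinutes;
    // Window start (minute of day) -> number of records seen for that window.
    private final Map<Long, Long> counts = new TreeMap<>();

    public WindowedCountSketch(long windowSizeMinutes) {
        this.windowSizeMinutes = windowSizeMinutes;
    }

    /** Start of the window that a given event timestamp falls into. */
    public long windowStart(long eventMinute) {
        return eventMinute - (eventMinute % windowSizeMinutes);
    }

    /**
     * Adds one record and returns the updated count for its window.
     * The window is chosen from the record's timestamp, not its arrival order.
     */
    public long add(long eventMinute) {
        return counts.merge(windowStart(eventMinute), 1L, Long::sum);
    }

    /** Copy of the current per-window counts. */
    public Map<Long, Long> snapshot() {
        return new TreeMap<>(counts);
    }

    public static void main(String[] args) {
        long a = 17 * 60;       // Message A, timestamp 5:00 PM
        long b = 17 * 60 + 15;  // Message B, timestamp 5:15 PM

        WindowedCountSketch arrivalOrder = new WindowedCountSketch(10);
        arrivalOrder.add(b);    // B arrives first
        arrivalOrder.add(a);    // A arrives late and updates the earlier window

        WindowedCountSketch timestampOrder = new WindowedCountSketch(10);
        timestampOrder.add(a);
        timestampOrder.add(b);

        // Both runs end with the same per-window counts.
        System.out.println(arrivalOrder.snapshot().equals(timestampOrder.snapshot()));
    }
}
```

The point of the sketch is only that the per-window result does not depend on arrival order, whereas a stateless step such as foreach simply sees B before A.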


On Mon, Mar 20, 2017 at 5:43 PM, Michael Noll <mich...@confluent.io> wrote:

> Late-arriving and out-of-order data is only treated specially for windowed
> aggregations.
>
> For stateless operations such as `KStream#foreach()` or `KStream#map()`,
> records are processed in the order they arrive (per partition).
>
> -Michael
>
>
>
>
> On Sat, Mar 18, 2017 at 10:47 PM, Ali Akhtar <ali.rac...@gmail.com> wrote:
>
>> > later when message A arrives it will put that message back into
>> > the right temporal context and publish an amended result for the proper
>> > time/session window as if message A had been consumed in timestamp order
>> > before message B.
>>
>> Does this apply to the aggregation Kafka stream methods then, and not to
>> e.g. foreach?
>>
>> On Sun, Mar 19, 2017 at 2:40 AM, Hans Jespersen <h...@confluent.io>
>> wrote:
>>
>> > Yes, stream processing and CEP are subtly different things.
>> >
>> > Kafka Streams helps you write stateful apps and allows that state to be
>> > preserved on disk (a local State store) as well as distributed for HA or
>> > for parallel partitioned processing (via Kafka topic partitions and
>> > consumer groups) as well as in memory (as a performance enhancement).
>> >
>> > However a classical CEP engine with a pre-modeled state machine and
>> > pattern matching rules is something different from stream processing.
>> >
>> > It is of course possible to build a CEP system on top of Kafka Streams and
>> > get the best of both worlds.
>> >
>> > -hans
>> >
>> > > On Mar 18, 2017, at 11:36 AM, Sabarish Sasidharan <
>> > > sabarish....@gmail.com> wrote:
>> > >
>> > > Hans
>> > >
>> > > What you state would work for aggregations, but not for state machines
>> > > and CEP.
>> > >
>> > > Regards
>> > > Sab
>> > >
>> > >> On 19 Mar 2017 12:01 a.m., "Hans Jespersen" <h...@confluent.io>
>> > >> wrote:
>> > >>
>> > >> The only way to make sure A is consumed first would be to delay the
>> > >> consumption of message B for at least 15 minutes, which would fly in the
>> > >> face of the principles of a true streaming platform. So the short answer
>> > >> to your question is "no", because that would be batch processing, not
>> > >> stream processing.
>> > >>
>> > >> However, Kafka Streams does handle late-arriving data. So if you had
>> > >> some analytics that compute results on a time window or a session
>> > >> window, then Kafka Streams will compute on the stream in real time
>> > >> (processing message B), and later, when message A arrives, it will put
>> > >> that message back into the right temporal context and publish an amended
>> > >> result for the proper time/session window, as if message A had been
>> > >> consumed in timestamp order before message B. The end result of this
>> > >> flow is that you eventually get the same results you would get in a
>> > >> batch processing system, but with the added benefit of getting
>> > >> intermediate results at much lower latency.
>> > >>
>> > >> -hans
>> > >>
>> > >> /**
>> > >> * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
>> > >> * h...@confluent.io (650)924-2670
>> > >> */
>> > >>
>> > >>> On Sat, Mar 18, 2017 at 10:29 AM, Ali Akhtar <ali.rac...@gmail.com>
>> > >>> wrote:
>> > >>>
>> > >>> Is it possible to have Kafka Streams order messages correctly by their
>> > >>> timestamps, even if they arrived out of order?
>> > >>>
>> > >>> E.g, say Message A with a timestamp of 5:00 PM and Message B with a
>> > >>> timestamp of 5:15 PM, are sent.
>> > >>>
>> > >>> Message B arrives sooner than Message A, due to network issues.
>> > >>>
>> > >>> Is it possible to make sure that, across all consumers of Kafka Streams
>> > >>> (even if they are across different servers, but have the same consumer
>> > >>> group), Message A is consumed first, before Message B?
>> > >>>
>> > >>> Thanks.
>> > >>>
>> > >>
>> >
>>
>
>
>
