Re: Dismissing late messages in Kafka Streams

2016-10-20 Thread Guozhang Wang
Hello Nicolas, thanks for reporting this; your observations about the window retention mechanism are correct. I think our documentation does have a few gaps regarding the windowing operations, in that: 1. We should mention that the default retention period of a window is one day (or should the d

Re: Dismissing late messages in Kafka Streams

2016-10-20 Thread Matthias J. Sax
Nicolas, you can filter "old" messages using KStream#transform(). It provides a ProcessorContext object that allows you to access the timestamp of the currently processed record. Thus, you "only" need to maintain the largest timestamp you can ever s
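The message above is cut off, so the following is only a minimal sketch of the idea as far as it is stated: keep the largest timestamp seen so far and drop records that lag too far behind it. The class name `LateRecordFilter` and the `maxLatenessMs` threshold are hypothetical and do not come from the thread. The sketch deliberately avoids the Kafka Streams API; inside a `Transformer`, the timestamp passed to `accept` would presumably come from `ProcessorContext#timestamp()`.

```java
/**
 * Hypothetical helper (not part of Kafka Streams): tracks the largest record
 * timestamp seen so far and rejects records that are older than that maximum
 * by more than a configured lateness threshold.
 */
final class LateRecordFilter {
    // Assumed threshold: how far behind the largest seen timestamp a record may be.
    private final long maxLatenessMs;
    // Largest timestamp observed so far; MIN_VALUE means "nothing seen yet".
    private long maxTimestampMs = Long.MIN_VALUE;

    LateRecordFilter(long maxLatenessMs) {
        if (maxLatenessMs < 0) {
            throw new IllegalArgumentException("maxLatenessMs must be >= 0");
        }
        this.maxLatenessMs = maxLatenessMs;
    }

    /** Returns true if the record should be kept, false if it is too late. */
    boolean accept(long timestampMs) {
        // Advance the high-water mark first, so the first record is always kept.
        if (timestampMs > maxTimestampMs) {
            maxTimestampMs = timestampMs;
        }
        // Keep the record only if it lags the high-water mark by at most the threshold.
        return maxTimestampMs - timestampMs <= maxLatenessMs;
    }
}
```

In a real topology one such filter would probably be needed per partition or per key, since the "largest timestamp" is only meaningful relative to one stream of records; that choice is left open here.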

Re: Dismissing late messages in Kafka Streams

2016-10-20 Thread Nicolas Fouché
I forgot to mention that the default maintain duration of a window is 1 day. Would it be useful to warn the developer if the current maintain duration is "not compatible" with the current window size and interval?

Re: Dismissing late messages in Kafka Streams

2016-10-20 Thread Nicolas Fouché
Hi Michael, thanks for the quick reply. Let's try to explain things a bit better: Within a call to `aggregateByKey`, I specified a window with the size of 15 days and an interval (hop) of 1 day, without setting a maintain duration. I produced a message, the timestamp being the current clock time,
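To make the window setup above concrete, here is a small sketch that lists the start times of the hopping windows a given record timestamp falls into. It assumes windows are aligned to the epoch and cover the half-open range [start, start + size); the class and method names are hypothetical and this is not the Kafka Streams implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for reasoning about hopping windows; not a Kafka Streams class. */
final class HoppingWindows {
    /**
     * Returns the start times (ms) of all epoch-aligned hopping windows
     * [start, start + sizeMs) that contain timestampMs.
     */
    static List<Long> windowStartsFor(long timestampMs, long sizeMs, long advanceMs) {
        if (timestampMs < 0 || advanceMs <= 0 || sizeMs < advanceMs) {
            throw new IllegalArgumentException("need timestamp >= 0 and 0 < advance <= size");
        }
        List<Long> starts = new ArrayList<>();
        // Earliest aligned start whose window still covers the timestamp (never below 0).
        long start = (Math.max(0L, timestampMs - sizeMs + advanceMs) / advanceMs) * advanceMs;
        // Every later aligned start up to the timestamp itself also covers it.
        while (start <= timestampMs) {
            starts.add(start);
            start += advanceMs;
        }
        return starts;
    }
}
```

Under these assumptions, with a size of 15 days and a hop of 1 day, a record whose timestamp is at least 14 days after the epoch belongs to 15 overlapping windows, the oldest of which started almost 15 days before the record's timestamp.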

Re: Dismissing late messages in Kafka Streams

2016-10-20 Thread Michael Noll
Nicolas, > I set the maintain duration of the window to 30 days. > If it consumes a message older than 30 days, then a new aggregate is created for this old window. I assume you mean: If a message should have been included in the original ("old") window but that message happens to arrive late (a