Thanks Matthias,

That's far cleaner :)
Cheers,

Liam Clarke

On Mon, Dec 3, 2018 at 4:59 PM Matthias J. Sax <matth...@confluent.io> wrote:

> The nulls are expected.
>
> It's not about expired session windows though: session windows are
> stored as `<(key, start-timestamp, end-timestamp), value>`. If the
> window boundaries change due to new incoming events (or a merge of two
> windows due to late-arriving records), the window is updated via a
> delete of the old window and a create of the new window (because the
> old and new windows have different keys).
>
> Thus, using a session window does not seem to be the best approach for
> your problem. Check out this example of how to do event de-duplication:
>
> https://github.com/confluentinc/kafka-streams-examples/blob/5.0.1-post/src/test/java/io/confluent/examples/streams/EventDeduplicationLambdaIntegrationTest.java
>
> Hope this helps.
>
>
> -Matthias
>
>
> On 12/2/18 4:59 PM, Liam Clarke wrote:
> > Hi all,
> >
> > Using
> >
> > I am using a session window of 1 minute to detect and prevent
> > duplicate events within the window, using reduce() to preserve the
> > first-seen value and accumulate how often duplicates were seen. I'm
> > then converting the windowed KTable into a stream to continue
> > processing events.
> >
> > However, later in the stream, I've been hitting NPEs: after the
> > window expiry, the stream emits a previously seen key with a null
> > value, where I'd assumed any values in the stream would be non-null.
> > Is this an idiom expressing that the window for that key has expired
> > and as such needs to be handled explicitly?
> >
> > I'm worried that filtering those values out of the ongoing stream is
> > hiding a problem further upstream, but if it's working as designed,
> > then sweet as.
> >
> > Kind regards,
> >
> > Liam Clarke
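
For illustration, a minimal sketch of the session-window de-duplication
pattern Liam describes, including the null-filter workaround. Topic names
are made up, the duplicate counter is omitted, and it assumes Kafka
Streams 2.1+:

import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.SessionWindows;
import org.apache.kafka.streams.kstream.Windowed;

public class SessionWindowDedupSketch {

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, String> events =
            builder.stream("events", Consumed.with(Serdes.String(), Serdes.String()));

        // De-duplicate within a 1-minute session window, keeping the first-seen value.
        KStream<Windowed<String>, String> windowed = events
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .windowedBy(SessionWindows.with(Duration.ofMinutes(1)))
            .reduce((first, later) -> first)
            .toStream();

        // A session's boundaries move whenever a new event extends or merges it.
        // Streams implements that update as delete-old-window + create-new-window,
        // and the delete surfaces here as <windowedKey, null>. Filter those
        // tombstones out before unwrapping the windowed key.
        windowed
            .filter((windowedKey, value) -> value != null)
            .map((windowedKey, value) -> KeyValue.pair(windowedKey.key(), value))
            .to("deduped-events", Produced.with(Serdes.String(), Serdes.String()));
    }
}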
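
And a rough sketch of the state-store approach used in the linked
EventDeduplicationLambdaIntegrationTest, simplified to de-duplicate on the
record key rather than an embedded event ID. Store and topic names are made
up, and the Transformer API shown is the pre-3.3 one:

import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.Stores;
import org.apache.kafka.streams.state.WindowStore;
import org.apache.kafka.streams.state.WindowStoreIterator;

public class TransformerDedupSketch {

    static final String STORE_NAME = "dedup-store"; // hypothetical store name
    static final Duration WINDOW = Duration.ofMinutes(1);

    // Forwards an event only if its key has not been seen within the last minute.
    static class DedupTransformer implements Transformer<String, String, KeyValue<String, String>> {
        private ProcessorContext context;
        private WindowStore<String, Long> seen;

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            this.context = context;
            this.seen = (WindowStore<String, Long>) context.getStateStore(STORE_NAME);
        }

        @Override
        public KeyValue<String, String> transform(String key, String value) {
            long now = context.timestamp();
            boolean duplicate;
            // Any entry for this key in the last minute means we've seen it already.
            try (WindowStoreIterator<Long> iter =
                     seen.fetch(key, now - WINDOW.toMillis(), now)) {
                duplicate = iter.hasNext();
            }
            seen.put(key, now, now);
            // Returning null drops the record, so duplicates never reach downstream
            // and no null tombstones are emitted.
            return duplicate ? null : KeyValue.pair(key, value);
        }

        @Override
        public void close() { }
    }

    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        builder.addStateStore(
            Stores.windowStoreBuilder(
                Stores.persistentWindowStore(STORE_NAME, WINDOW.multipliedBy(2), WINDOW, false),
                Serdes.String(), Serdes.Long()));

        KStream<String, String> events = builder.stream("events");
        events
            .transform(DedupTransformer::new, STORE_NAME)
            .to("deduped-events");
    }
}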