kafka-streams: interaction between max.poll.records and window expiration?

2020-12-20 Thread Mathieu D
Hello there, One of our input topics does not have much traffic. Divided by the number of partitions, and given the default 'max.poll.records' setting (1000, if I understand the docs correctly), it could happen that, when fetching 1000 records at once, the event timestamps between the first and last ...
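If the concern is that one large poll can span a wide event-time range, the two settings that interact here are the consumer's max.poll.records (overridable through the Streams consumer prefix) and the window grace period. A minimal sketch, using the Java config API from Scala; the values are illustrative assumptions, not recommendations from the thread:

    import java.time.Duration
    import java.util.Properties
    import org.apache.kafka.streams.StreamsConfig
    import org.apache.kafka.streams.kstream.TimeWindows

    val props = new Properties()
    // Lower the batch size fetched by the Streams-embedded consumer so one poll
    // cannot cover a huge event-time span on a low-traffic topic.
    props.put(StreamsConfig.consumerPrefix("max.poll.records"), "100")

    // And/or give windowed aggregations a generous grace period so records near the
    // end of a batch cannot close windows still needed by records near its start.
    val windows = TimeWindows.of(Duration.ofMinutes(10)).grace(Duration.ofHours(1))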

Re: kafka-streams / window expiration because of shuffling

2020-11-23 Thread Mathieu D
> ... feature, I think you’re on the right track. If you specify a partitioner that produces exactly the same partitioning as the source, you _should_ be able to avoid any shuffling, although there will still be a repartition topic there. I hope this helps, John
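A minimal sketch of that suggestion, assuming Kafka Streams 2.6+ and hypothetical topic names: a StreamPartitioner that reproduces the producer's default key hashing, passed to repartition() so the repartition topic keeps the same partition assignment as the source.

    import org.apache.kafka.common.serialization.Serdes
    import org.apache.kafka.common.utils.Utils
    import org.apache.kafka.streams.StreamsBuilder
    import org.apache.kafka.streams.kstream.{Consumed, Repartitioned}
    import org.apache.kafka.streams.processor.StreamPartitioner

    // Mirrors the default producer partitioning (murmur2 on the serialized key),
    // so records land on the same partition number they had in the source topic.
    val sameAsSource = new StreamPartitioner[String, String] {
      override def partition(topic: String, key: String, value: String, numPartitions: Int): Integer =
        Utils.toPositive(Utils.murmur2(key.getBytes("UTF-8"))) % numPartitions
    }

    val builder = new StreamsBuilder()
    builder
      .stream("device-metrics", Consumed.`with`(Serdes.String(), Serdes.String()))
      .repartition(Repartitioned.`with`(Serdes.String(), Serdes.String()).withStreamPartitioner(sameAsSource))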

kafka-streams / window expiration because of shuffling

2020-11-20 Thread Mathieu D
Hello there, We're processing IoT device data, each device sending several metrics. When we upgrade our streams app, we set a brand new 'application.id' to reprocess a bit of past data, to warm up the state stores and aggregations and make sure all outputs will be valid. Downstream is designed for ...

Re: Perf on history reprocessing [kafka-streams]

2020-10-25 Thread Mathieu D
To clarify my question: here I'm focusing on the kafka-streams part. On Fri, Oct 23, 2020 at 20:07, Mathieu D wrote: > Hello there. Sometimes we need to reprocess a large amount of history data. I find the performance in that case quite disappointing. More precisely ...

Re: tombstones, kafka-streams, Avro & NPE

2020-10-24 Thread Mathieu D
Hi Nicolae, A shot in the dark: make sure to handle nulls separately in your custom serializer. Something like this: private val _serializer = new KafkaAvroSerializer { override def serialize(topic: String, obj: Any): Array[Byte] = obj match { case null => null case _ => val record = RecordForm...
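A possible completion of that idea, guessing from the truncated snippet that the case-class-to-GenericRecord conversion uses avro4s's RecordFormat; the Metric type and that dependency are assumptions, not something stated in the thread:

    import com.sksamuel.avro4s.RecordFormat
    import io.confluent.kafka.serializers.KafkaAvroSerializer

    case class Metric(deviceId: String, value: Double)   // hypothetical payload type
    val recordFormat = RecordFormat[Metric]

    // Assumes schema.registry.url is configured on the serializer elsewhere.
    val _serializer = new KafkaAvroSerializer {
      override def serialize(topic: String, obj: AnyRef): Array[Byte] = obj match {
        case null      => null                                   // tombstone: bypass Avro, avoid the NPE
        case m: Metric => super.serialize(topic, recordFormat.to(m))
        case other     => super.serialize(topic, other)          // already a GenericRecord, primitive, etc.
      }
    }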

Perf on history reprocessing

2020-10-23 Thread Mathieu D
Hello there, Sometimes we need to reprocess a large amount of history data. I find the performance in that case quite disappointing. More precisely, throughput is quite low (which is not surprising for a system optimized for low latency). Is there any knob to turn to get a much higher throughput in ...
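A hedged sketch of throughput-oriented settings often adjusted for bulk replays; the values are illustrative assumptions, not measurements from this thread:

    import java.util.Properties
    import org.apache.kafka.streams.StreamsConfig

    val props = new Properties()
    props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, "8")                 // more threads, up to the task count
    props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, "268435456")  // bigger record cache, fewer downstream updates
    props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, "30000")             // commit and flush less often
    props.put(StreamsConfig.consumerPrefix("fetch.min.bytes"), "1048576")   // larger fetches from the brokers
    props.put(StreamsConfig.consumerPrefix("fetch.max.wait.ms"), "500")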

Re: kafka-streams: do not output anything while state is not stable

2020-10-23 Thread Mathieu D
Ok. Thanks ;-) On Tue, Oct 20, 2020 at 19:12, Matthias J. Sax wrote: > It's highly use-case dependent, but applying a filter at the end does sound like a good solution to me. > -Matthias > On 10/19/20 12:40 PM, Mathieu D wrote: > > Hello there, ...
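One way to realize that final filter, sketched here with a Transformer that drops records whose timestamp predates the app (re)start; the cut-off rule and the types are assumptions for illustration, not from the thread:

    import org.apache.kafka.streams.KeyValue
    import org.apache.kafka.streams.kstream.Transformer
    import org.apache.kafka.streams.processor.ProcessorContext

    val startedAt = System.currentTimeMillis()

    // Emits a record only if its timestamp is newer than the moment the app started,
    // so replayed history warms the state stores without producing stale upserts.
    class EmitOnlyRecent[K, V] extends Transformer[K, V, KeyValue[K, V]] {
      private var ctx: ProcessorContext = _
      override def init(context: ProcessorContext): Unit = ctx = context
      override def transform(key: K, value: V): KeyValue[K, V] =
        if (ctx.timestamp() >= startedAt) new KeyValue(key, value) else null  // null => emit nothing
      override def close(): Unit = ()
    }

    // usage: resultStream.transform(() => new EmitOnlyRecent[String, MyOutput]).to("output-topic")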

kafka-streams: do not output anything while state is not stable

2020-10-19 Thread Mathieu D
Hello there, Let's say I need to restart my streams app from a blank state (whether by changing app.id or using the application-reset-tool). My app is designed around the "at least once" paradigm, and its outputs are upserts. The input topics have a few days' worth of data, and the app will restart from there. If ...

Re: join not working in relation to how StreamsBuilder builds the topology

2020-08-11 Thread Mathieu D
> ... record from the left would instead produce a K:(LeftVal, null) result. It seems like even if the repartition is somehow going to the wrong partition, you should see the (left, null) result at some point. I’m struggling to think why you would only see null results.
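A bare-bones illustration of the left-join semantics discussed above (topic names and String value types are hypothetical): a left-side record with no match on the right reaches the ValueJoiner with null as the right value rather than being dropped.

    import org.apache.kafka.common.serialization.Serdes
    import org.apache.kafka.streams.StreamsBuilder
    import org.apache.kafka.streams.kstream.{Consumed, Produced, ValueJoiner}

    val builder = new StreamsBuilder()
    val left  = builder.stream("left-topic",  Consumed.`with`(Serdes.String(), Serdes.String()))
    val right = builder.table("right-topic",  Consumed.`with`(Serdes.String(), Serdes.String()))

    // r is null whenever the right side has no value for the key (yet).
    val joiner = new ValueJoiner[String, String, String] {
      override def apply(l: String, r: String): String = s"$l|${Option(r).getOrElse("<none>")}"
    }

    left.leftJoin(right, joiner).to("joined-topic", Produced.`with`(Serdes.String(), Serdes.String()))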

join not working in relation to how StreamsBuilder builds the topology

2020-08-10 Thread Mathieu D
Dear community, I have a quite tough problem that I struggle to diagnose and fix. I'm not sure if it's a bug in Kafka Streams or some subtlety I didn't get in using the DSL API. The problem is the following: we have a quite elaborate streams app, working well, in production. We'd like to add a left join ...

Need advice on how to deploy and update a streams app

2020-04-19 Thread Mathieu D
... I'd be happy to have a quick talk over google-meet or anything. This could be easier to discuss than to write a long email. If you have any resources, books, posts, or conference talks, I'd be happy (if they are about real-life™ apps... not really interested in hello-worlds...). Thanks a lot! Mathieu D