How does the /tmp/kafka-streams folder work?

2018-12-24 Thread Edmondo Porcu
Hello Kafka users, we are running Kafka Streams as a fully stateless application, meaning that we are not persisting /tmp/kafka-streams on a durable volume but are instead losing it at each restart. This application is performing a KTable-KTable join of data coming from Kafka Connect, and some

Producing a null AVRO record on a compacted topic with kafka-avro-console-producer

2018-05-29 Thread Edmondo Porcu
We are using Kafka Connect to stream from a database with a JDBC Connector. Some rows were wrongly deleted, so our key-value stores are now stale. We thought we could solve the problem by using kafka-avro-console-producer to produce a message with the deleted key and the null paylo

Non-duplicated WindowStore in KStream-KStream join?

2018-05-23 Thread Edmondo Porcu
We need to perform a KStream-KStream join with a very large window, where a tick on the left would trigger a join only with the most recent record on the right, and vice versa. This is not how the default window works, since the WindowStoreIterator returned by window.fetch inside the KStreamKStre

Forcing un-assignment of partitions for Kafka application

2018-05-23 Thread Edmondo Porcu
We have a Kafka Streams app that fails to start correctly because the consumer somehow doesn't get assigned any partitions, so the queryable stores are not available when the app starts. Is there a way to force the release of the assignments that consumers hold on specific partitions? Edmondo

How does KStream transform perform repartitioning?

2018-05-22 Thread Edmondo Porcu
Hello users, we are performing a Transform so that, out of a larger message, we emit a new output record only if a specific field has changed. Since we introduced that to reduce the number of output records, our final KStream-KStream windowed join is not ticking anymore, although the window i

Architectural patterns for full log replayability

2018-05-22 Thread Edmondo Porcu
Hello Kafka Users, we'd like to understand how you are designing systems based on Kafka so as to be able to replay the full log. In particular, let's take the following example: - A product service streams products - A purchase service streams purchases - A recommendation service joins the two and de