Re: Kafka streams in Kubernetes

2019-06-08 Thread Matthias J. Sax
It depends on how much state you need to restore and how much restore time you can accept in your application. The amount of data that needs to be restored does not depend on the window size, but on the store retention time (default 1 day, configurable via `Materialized#withRetention()`). The window si

Re: Kafka streams in Kubernetes

2019-06-08 Thread Pavel Sapozhnikov
I suggest taking a look at the Strimzi project (https://strimzi.io/), a Kafka operator deployed in a Kubernetes environment. On Sat, Jun 8, 2019, 6:09 PM Parthasarathy, Mohan wrote: > Hi, > > I have read several articles about this topic. We are soon going to deploy > our streaming apps inside k8s. My under

Kafka streams in Kubernetes

2019-06-08 Thread Parthasarathy, Mohan
Hi, I have read several articles about this topic. We are soon going to deploy our streaming apps inside k8s. My understanding from reading these articles is that a StatefulSet in k8s is not mandatory, as the application can rebuild its state if the state store is not present. Can people share th

Parallel computation of windows in Flink

2019-06-08 Thread Mike Kaplinskiy
Hi everyone, I’m using a Kafka source with a lot of watermark skew (i.e. new partitions were added to the topic over time). The sink is a FileIO.Write().withNumShards(1) to get ~ 1 file per day & an early trigger to write at most 40,000 records per file. Unfortunately it looks like there's 1 threa