Hi Ryan,

This is a great question. Kafka Streams currently has the same issue as Samza: because of its key-based partitioning mechanism, increasing the number of partitions can invalidate the state stores, since keyed messages will be re-routed to different partitions and hence to different tasks.
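To make the re-routing concrete, here is a minimal sketch of why state moves when the partition count changes. It uses a simple CRC32 hash rather than Kafka's actual murmur2-based default partitioner, but the hash-modulo scheme is the same in spirit:

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Deterministic hash of the key, modulo the partition count --
    # the same shape as Kafka's default partitioning (which uses murmur2).
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keys = [f"user-{i}" for i in range(1000)]
# Count how many keys land on a different partition after scaling 4 -> 8.
moved = sum(1 for k in keys if partition_for(k, 4) != partition_for(k, 8))
print(f"{moved} of {len(keys)} keys change partitions when going from 4 to 8")
```

Roughly half the keys end up on a new partition, so any local state store built against the old assignment no longer sees the keys it holds state for.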
For now, the recommended solution, as with Samza, is to over-partition your input Kafka topics up front, to avoid having to manually wipe out state and restart. This is admittedly not ideal, but in practice it has worked well at organizations like LinkedIn.

Moving forward, we are also planning to add a savepoint feature similar to Flink's, so that users can checkpoint their distributed application state as a whole when scaling up and down. The tricky part is doing that "behind the scenes" without stopping the world, and we are still discussing whether it is doable.

Guozhang

On Wed, Apr 27, 2016 at 10:00 AM, Ryan Thompson <ryan.thomp...@uptake.com> wrote:

> Hello,
>
> I'm wondering if fault-tolerant state management with Kafka Streams works
> seamlessly if partitions are scaled up. My understanding is that this is
> indeed a problem that stateful stream processing frameworks need to solve,
> and that:
>
> with Samza, this is not a solved problem (though I also understand it's
> being worked on, based on a conversation I had yesterday at the Kafka
> Summit with someone who works on Samza);
>
> with Flink, there's a plan to solve this: "The way we plan to implement
> this in Flink is by shutting the dataflow down with a checkpoint, and
> bringing the dataflow back up with a different parallelism."
> http://www.confluent.io/blog/real-time-stream-processing-the-next-step-for-apache-flink/
>
> with Kafka Streams, I haven't been able to find a solid answer on whether
> or not this problem is solved for users, or if we need to handle it
> ourselves.
>
> Thanks,
> Ryan

--
-- Guozhang