Hi, I guess, it's currently not possible to load balance between different machines. It might be a nice optimization to add into Streams though.
Right now, you should reduce the number of threads. Load balancing is based on threads, and thus, if Streams place tasks to all threads of one machine, it will automatically assign the remaining tasks to thread of the second machine. Btw: If you have only 9 input partitions, you will get most likely 9 tasks (might be more, depending on your topology structure) and thus, you cannot utilize more then 9 thread anyway. Thus, running with 28 thread will most likely result in many idle threads. See the docs for more details: - http://docs.confluent.io/current/streams/architecture.html#parallelism-model - http://docs.confluent.io/current/streams/architecture.html#threading-model -Matthias On 3/21/17 3:40 PM, Prasad, Karthik wrote: > Hey, > > I have a typical scenario of a kafka-streams application in a production > environment. > > We have a kafka-cluster with multiple topics. Messages from one topic is > being consumed by a the kafka-streams application. The topic, currently, has > 9 partitions. We have configured consumer thread count to 14. We are running > 2 instances of this stream application on 2 different machines, thereby > consisting of 28 threads across both machines. The group id for the consumers > are the same. But, what I observe is that all partitions are being assigned > to threads on a single machine. Now, I do understand that if the task on the > active machine fails, then the threads in the other machine would take over. > My question is that is there a way that kafka-streams can auto-balance across > instances of the same stream application ? If yes, how do I go about doing > that ? Please let me know. Thanks, > > Best, > Karthik Prasad > Senior Software Engineer > Sony Interactive Entertainment > > >
signature.asc
Description: OpenPGP digital signature