Hey Sophie,

This was indeed the issue: an environment variable was passed through incorrectly. Thank you for the tip that made me check this.
Giselle

On 2020/07/29 17:41:43, Sophie Blee-Goldman <sop...@confluent.io> wrote:
> Hey Giselle,
>
> How many stream threads is each instance configured with? If the total
> number of threads across all instances exceeds the total number of tasks,
> then some threads won't get any assigned tasks. There's a known bug where
> tasks might not get evenly distributed over all instances in this
> scenario, as Streams would only attempt to balance the tasks over the
> threads. See KAFKA-9173 <https://issues.apache.org/jira/browse/KAFKA-9173>.
> Luckily, this should be fixed in 2.6 which is just about to be released.
>
> Instances that joined later, or restarted, would be more likely to have
> these threads with no assigned tasks due to the stickiness optimization,
> as you guessed.
>
> If the problem you've run into is due to running more stream threads than
> tasks, I would recommend just decreasing the number of threads per
> instance to get a balanced assignment. This won't hurt performance in any
> way since those extra threads would have just been sitting idle anyways.
> Or better yet, upgrade to 2.6.
>
> Regarding the colocation question: no, the assignment doesn't take that
> into account at the moment. Typically Streams applications won't be
> running on the same machine as the broker. Clearly it has been difficult
> enough to optimize for two things at the same time, stickiness and
> balance, without introducing a third :)
>
> On Wed, Jul 29, 2020 at 4:58 AM Giselle Van Dongen <
> giselle.vandon...@klarrio.com> wrote:
>
> > We have a Kafka Streams (2.4) app consisting of 5 instances. It reads
> > from a Kafka topic with 20 partitions (5 brokers).
> >
> > We notice that the partition assignment does not always lead to well
> > distributed load over the different threads. We notice this at startup
> > as well as after a recovery of a failed thread.
> >
> > 1. At startup, some instances get a significantly lower load and
> > sometimes even no load. It seems like instances that come up slightly
> > later get no partitions assigned (because of sticky assignment?).
> >
> > 2. When one thread (container) dies and comes back it often does not
> > receive any or very few partitions to work on. We assume this has to do
> > with the sticky assignment. Is there any way we can make this
> > distribution more equal?
> >
> > I was also wondering whether Kafka Streams takes into account
> > colocation of Kafka brokers with stream processing threads when
> > assigning partitions. Do partitions on brokers get assigned to the
> > streams thread that is colocated with it on the same machine?
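
For anyone who finds this thread later, here is a rough sketch of the thread-versus-task arithmetic Sophie describes. It is not Kafka Streams code. The function names are made up for illustration, and it assumes a single sub-topology, so that the 20-partition input topic yields 20 tasks. The result is the kind of value one would put in the `num.stream.threads` setting of each instance.

```python
def threads_per_instance(num_tasks: int, num_instances: int) -> int:
    """Largest per-instance thread count that keeps total threads <= tasks.

    Hypothetical helper: always returns at least 1, because an instance
    needs one stream thread even when there are more instances than tasks.
    """
    return max(1, num_tasks // num_instances)


def idle_threads(num_tasks: int, num_instances: int, threads_each: int) -> int:
    """Number of threads that cannot receive a task (0 if none)."""
    total_threads = num_instances * threads_each
    return max(0, total_threads - num_tasks)


if __name__ == "__main__":
    # Setup from this thread: 20 partitions (assumed 20 tasks), 5 instances.
    tasks, instances = 20, 5
    print("threads per instance:", threads_per_instance(tasks, instances))
    # A hypothetical misconfiguration of 8 threads per instance.
    print("idle threads at 8 each:", idle_threads(tasks, instances, 8))
```

With 20 tasks and 5 instances this suggests 4 threads per instance. Any setting above that leaves threads with nothing to do, which is the situation where the uneven distribution from KAFKA-9173 could show up on versions before 2.6.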