Hey Giselle,

How many stream threads is each instance configured with? If the total
number of threads across all instances exceeds the total number of tasks,
then some threads won't get any assigned tasks. There's a known bug where
tasks might not get evenly distributed over all instances in this
scenario, because Streams would only attempt to balance the tasks over the
threads, not over the instances. See KAFKA-9173
<https://issues.apache.org/jira/browse/KAFKA-9173>. Luckily, this should
be fixed in 2.6, which is just about to be released.
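
To make the thread-versus-task arithmetic concrete, here is a rough sketch
in plain Java. It assumes one task per input partition (20 partitions, a
single sub-topology), which may not match your actual topology, and the
thread count of 8 per instance is a made-up value since the original
message doesn't state it. The class and method names are hypothetical and
not part of Kafka Streams.

```java
// Illustrative sketch only; names here are hypothetical, not a Kafka Streams API.
final class ThreadTaskCheck {

    // Total stream threads across the whole application.
    static int totalThreads(int instances, int threadsPerInstance) {
        return instances * threadsPerInstance;
    }

    // Threads that cannot receive a task because threads outnumber tasks.
    static int idleThreads(int tasks, int instances, int threadsPerInstance) {
        return Math.max(0, totalThreads(instances, threadsPerInstance) - tasks);
    }

    public static void main(String[] args) {
        int tasks = 20;             // assumption: one task per input partition
        int instances = 5;          // from the original question
        int threadsPerInstance = 8; // hypothetical; not stated in the thread

        System.out.println("total threads: "
                + totalThreads(instances, threadsPerInstance));
        System.out.println("idle threads:  "
                + idleThreads(tasks, instances, threadsPerInstance));
    }
}
```

With these example numbers there are 40 threads for 20 tasks, so 20
threads are guaranteed to sit idle, and the bug above means they may be
concentrated on a few instances.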

Instances that joined later, or restarted, would be more likely to have
these threads with no assigned tasks due to the stickiness optimization,
as you guessed.

If the problem you've run into is due to running more stream threads than
tasks, I would recommend just decreasing the number of threads per
instance to get a balanced assignment. This won't hurt performance in any
way, since those extra threads would have just been sitting idle anyway.
Or better yet, upgrade to 2.6.
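
As a rough sketch of the "fewer threads" option: the snippet below picks a
per-instance thread count that never exceeds the available tasks and puts
it into the properties you would pass to your Streams app. It assumes 20
tasks (one per partition) and 5 instances, as in your setup. The class and
helper names are hypothetical; only the "num.stream.threads" config key is
the real Kafka Streams setting.

```java
import java.util.Properties;

// Illustrative sketch only; the class and helper names are hypothetical.
final class StreamThreadConfig {

    // Largest per-instance thread count such that the total number of
    // threads does not exceed the number of tasks (at least 1 thread).
    static int maxUsefulThreadsPerInstance(int tasks, int instances) {
        return Math.max(1, tasks / instances);
    }

    // Builds properties containing only the thread setting; a real app
    // also needs application.id, bootstrap.servers, and so on.
    static Properties streamsProperties(int tasks, int instances) {
        Properties props = new Properties();
        // "num.stream.threads" is the Kafka Streams threads-per-instance config.
        props.setProperty("num.stream.threads",
                Integer.toString(maxUsefulThreadsPerInstance(tasks, instances)));
        return props;
    }

    public static void main(String[] args) {
        // Assumption: 20 tasks (one per partition) across 5 instances.
        Properties props = streamsProperties(20, 5);
        System.out.println("num.stream.threads = "
                + props.getProperty("num.stream.threads"));
    }
}
```

For 20 tasks and 5 instances this yields 4 threads per instance, i.e. 20
threads in total, so every thread can own exactly one task.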

Regarding the colocation question: no, the assignment doesn't take that
into account at the moment. Typically Streams applications won't be
running on the same machine as the broker. Clearly it has been difficult
enough to optimize for two things at the same time, stickiness and
balance, without introducing a third :)

On Wed, Jul 29, 2020 at 4:58 AM Giselle Van Dongen <
giselle.vandon...@klarrio.com> wrote:

> We have a Kafka Streams (2.4) app consisting of 5 instances. It reads from
> a Kafka topic with 20 partitions (5 brokers).
>
> We notice that the partition assignment does not always lead to well
> distributed load over the different threads. We notice this at startup as
> well as after a recovery of a failed thread.
>
> 1. At startup, some instances get a significantly lower load and sometimes
> even no load. It seems like instances that come up slightly later get no
> partitions assigned (because of sticky assignment?).
>
> 2. When one thread (container) dies and comes back it often does not
> receive any or very few partitions to work on. We assume this has to do
> with the sticky assignment. Is there any way we can make this distribution
> more equal?
>
> I was also wondering whether Kafka Streams takes into account colocation
> of Kafka brokers with stream processing threads when assigning partitions.
> Do partitions on brokers get assigned to the streams thread that is
> colocated with it on the same machine?
>
