I have a question about partition assignment for a Kafka Streams app. As I
understand it, the more complex your topology is, the more internal topics
Kafka Streams will create. In my case the topology has 8 sub-topologies
(graphs). There are 6 partitions for each sub-topology (this matches the
number of partitions of the input topic), so there are 48 partitions in
total that the app needs to handle. These get balanced equally across the
3 servers where the app is running (each server also runs 2 stream
threads, so there are 6 available instances of the app).
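
For context, the 2 threads per server are just the standard stream-thread
setting, so the arithmetic is 8 sub-topologies x 6 partitions = 48
partitions (tasks), spread over 3 servers x 2 threads = 6 stream threads
in total, i.e. roughly 8 per thread. Something like this (application id
and bootstrap address are placeholders):

    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");  // placeholder
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");  // placeholder
    props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2);             // 2 threads per server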

The problem for me is that the partitions of the input topic carry the
heaviest workload, but these 6 partitions are not distributed evenly
amongst the instances; they are just treated as 6 of the 48 partitions the
app needs to balance. This means that if one server gets most or all of
these 6 partitions, it ends up exhausting all of the resources on that
server.

Is there a way of equally balancing these 6 specific partitions amongst the
available instances? I thought writing a custom partition grouper might
help here:

https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#partition-grouper

But the advice seems to be not to do this, as otherwise you risk breaking
the app.
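
In case it helps show what I mean, here is a minimal sketch of that
interface from the 1.0-era API linked above, just reproducing the default
one-task-per-partition grouping (the class name is a placeholder):

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.streams.processor.PartitionGrouper;
    import org.apache.kafka.streams.processor.TaskId;

    public class MyPartitionGrouper implements PartitionGrouper {
        @Override
        public Map<TaskId, Set<TopicPartition>> partitionGroups(
                Map<Integer, Set<String>> topicGroups, Cluster metadata) {
            Map<TaskId, Set<TopicPartition>> groups = new HashMap<>();
            for (Map.Entry<Integer, Set<String>> group : topicGroups.entrySet()) {
                int groupId = group.getKey();
                for (String topic : group.getValue()) {
                    Integer partitions = metadata.partitionCountForTopic(topic);
                    if (partitions == null) {
                        continue; // topic metadata not available yet
                    }
                    for (int p = 0; p < partitions; p++) {
                        // one task per (sub-topology, partition), same shape as the default grouper
                        groups.computeIfAbsent(new TaskId(groupId, p), id -> new HashSet<>())
                              .add(new TopicPartition(topic, p));
                    }
                }
            }
            return groups;
        }
    }

It would be registered via the partition.grouper config (which I believe
has been removed in newer Kafka versions):

    props.put(StreamsConfig.PARTITION_GROUPER_CLASS_CONFIG, MyPartitionGrouper.class);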

Thanks!
