I have a question about partition assignment for a Kafka Streams app. As I understand it, the more complex your topology is, the more internal topics Kafka Streams will create. In my case the app has 8 sub-topologies (graphs). Each sub-topology has 6 partitions, matching the number of partitions of the input topic, so there are 48 partitions in total that the app needs to handle. These get balanced equally across the 3 servers where the app is running (each server also runs 2 stream threads, so there are 6 available instances of the app).
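In case it helps, this is roughly how each of the 3 instances is configured (the application id, broker address, and class name are just placeholders):

```java
import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsApp {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");   // placeholder id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");   // placeholder broker
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2);              // 2 threads per server, 3 servers = 6 threads

        final StreamsBuilder builder = new StreamsBuilder();
        // ... topology with 8 sub-topologies, all ultimately fed by the 6-partition input topic ...

        final KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}
```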
The problem for me is that the partitions of the input topic carry the heaviest workload, but these 6 partitions are not distributed evenly amongst the instances; they are just treated as 6 of the 48 partitions the app needs to balance. This means that if one server ends up with most or all of these 6 partitions, it exhausts all of the resources on that server. Is there a way of balancing these 6 specific partitions equally across the available instances? I thought writing a custom partition grouper might help here: https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#partition-grouper but the advice seems to be not to do this, otherwise you risk breaking the app (a rough sketch of the kind of grouper override I mean is below). Thanks!
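This is a minimal sketch of what I was considering, assuming the Kafka 1.0 PartitionGrouper / DefaultPartitionGrouper API from the page linked above; the class name is a placeholder and the grouping itself is left at the default behaviour:

```java
import java.util.Map;
import java.util.Set;

import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.streams.processor.DefaultPartitionGrouper;
import org.apache.kafka.streams.processor.TaskId;

// Placeholder grouper: keeps the default one-task-per-(sub-topology, partition) grouping.
// Overriding partitionGroups(...) is where any custom grouping logic would have to go.
public class InputHeavyPartitionGrouper extends DefaultPartitionGrouper {
    @Override
    public Map<TaskId, Set<TopicPartition>> partitionGroups(final Map<Integer, Set<String>> topicGroups,
                                                            final Cluster metadata) {
        return super.partitionGroups(topicGroups, metadata);
    }
}
```

It would then be registered through the partition.grouper property (StreamsConfig.PARTITION_GROUPER_CLASS_CONFIG) in the streams config shown above, which is exactly the step the documentation warns against getting wrong.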