Guozhang Wang created KAFKA-6039:
------------------------------------

             Summary: Improve TaskAssignor to be more load balanced
                 Key: KAFKA-6039
                 URL: https://issues.apache.org/jira/browse/KAFKA-6039
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: Guozhang Wang


Today our task placement may still generate sub-optimal assignments with
respect to load balance. One reason is that it does not account for
sub-topologies. For example, if an aggregation follows a repartition topic,
you end up with two sub-topologies: the first is very light, while the second
is computationally heavy and has state stores. However, when assigning their
tasks we treat them equally, so in the worst case one client gets X tasks from
the first sub-topology and sits mostly idle, while another gets X tasks from
the second sub-topology and is overloaded.

One strawman approach to improve this is to try to achieve balance across
sub-topologies, i.e. each client tries to get a similar number of tasks within
each sub-topology. However, there are some more considerations to include (as
mentioned in the sub-tasks).
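
To make the strawman concrete, here is a minimal sketch of what "balance
within each sub-topology" could look like. It is not the actual TaskAssignor
code: the class name SubTopologyBalancedAssignor, the assign method, and the
use of plain strings for task ids and client names are all assumptions made
for illustration. It also ignores state store locality, standby tasks, and
client capacity, which are among the extra considerations a real
implementation would have to weigh.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Kafka Streams.
final class SubTopologyBalancedAssignor {

    /**
     * Walks the sub-topologies in id order and deals their tasks out to the
     * clients round-robin. Because the tasks of one sub-topology go to
     * consecutive clients, each client's task count within any single
     * sub-topology differs from any other client's by at most one.
     *
     * @param tasksBySubTopology sub-topology id mapped to its task ids
     * @param clients            client names, in a fixed order
     */
    static Map<String, List<String>> assign(
            Map<Integer, List<String>> tasksBySubTopology,
            List<String> clients) {
        if (clients.isEmpty()) {
            throw new IllegalArgumentException("at least one client is required");
        }

        Map<String, List<String>> assignment = new TreeMap<>();
        for (String client : clients) {
            assignment.put(client, new ArrayList<>());
        }

        // The cursor carries over between sub-topologies so that leftover
        // tasks do not always land on the first client.
        int next = 0;
        for (List<String> tasks : new TreeMap<>(tasksBySubTopology).values()) {
            for (String task : tasks) {
                String client = clients.get(next % clients.size());
                assignment.get(client).add(task);
                next++;
            }
        }
        return assignment;
    }
}
```

With two clients A and B, a light sub-topology 0 with tasks 0_0 to 0_3, and a
heavy sub-topology 1 with tasks 1_0 to 1_3, this sketch gives each client two
tasks from each sub-topology, rather than allowing one client to receive all
four heavy tasks.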

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
