[ 
https://issues.apache.org/jira/browse/KAFKA-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027969#comment-16027969
 ] 

ASF GitHub Bot commented on KAFKA-5337:
---------------------------------------

Github user grantneale closed the pull request at:

    https://github.com/apache/kafka/pull/3158


> Partition assignment strategy that distributes lag evenly across consumers in 
> each group
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5337
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5337
>             Project: Kafka
>          Issue Type: New Feature
>          Components: consumer
>    Affects Versions: 0.10.2.1
>            Reporter: Grant Neale
>            Priority: Minor
>
> Existing partition assignment strategies (RangeAssignor and 
> RoundRobinAssignor) do not account for the current consumer group lag on each 
> partition.  This can result in sub-optimal assignments when the distribution 
> of lags for a given topic and consumer group is skewed.
> The LagBasedAssignor operates on a per-topic basis, and attempts to assign 
> partitions such that lag is distributed as evenly as possible across the 
> consumer group.
> h4. Algorithm:
> For each topic, first obtain the lag on all partitions. Lag on a given 
> partition is the difference between the end offset and the last offset 
> committed by the consumer group. If no offsets have been committed for a 
> partition, we determine the lag based on the auto.offset.reset property. 
> If auto.offset.reset=latest, we assume a lag of 0. If 
> auto.offset.reset=earliest (or any other value), we assume a lag equal to the 
> total number of messages currently available in that partition.
> Next, create a map storing the current total lag of all partitions assigned 
> to each member of the consumer group. Partitions are assigned in decreasing 
> order of lag, with each partition assigned to the consumer with the fewest 
> assigned partitions, breaking ties by assigning to the consumer with the 
> least total currently assigned lag.
> Assigning partitions evenly across consumers (by partition count) ensures 
> that the assignment is reasonably balanced (by partition count) when all 
> partitions have a current lag of 0 or if the distribution of lags is heavily 
> skewed. It also gives the consumer group the best possible chance of 
> remaining balanced if the assignment is retained for a long period (assuming 
> throughput is consistent across members of the consumer group).
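
The algorithm described above can be sketched roughly as follows. This is a
minimal, self-contained illustration and not the code from the pull request:
the class name LagBasedAssignmentSketch and the methods computeLag and assign
are hypothetical, it uses no Kafka classes, and it assumes that "messages
currently available" can be approximated as end offset minus beginning offset.
Any ties left after comparing partition count and total lag are broken by
consumer list order, which is also an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the lag-based assignment for a single topic.
final class LagBasedAssignmentSketch {

    // Lag for one partition. committedOffset is null when the group has
    // never committed an offset for this partition.
    static long computeLag(long beginOffset, long endOffset,
                           Long committedOffset, String autoOffsetReset) {
        if (committedOffset != null) {
            return Math.max(0L, endOffset - committedOffset);
        }
        if ("latest".equals(autoOffsetReset)) {
            return 0L; // the consumer would start at the end of the log
        }
        // "earliest" or any other value: assume everything available is lag
        // (approximated here as endOffset - beginOffset).
        return Math.max(0L, endOffset - beginOffset);
    }

    // Assigns partitions (partition id -> lag) to consumers.
    static Map<String, List<Integer>> assign(Map<Integer, Long> partitionLags,
                                             List<String> consumers) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        Map<String, Long> totalLag = new HashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
            totalLag.put(consumer, 0L);
        }
        if (consumers.isEmpty()) {
            return assignment;
        }

        // Process partitions in decreasing order of lag (partition id breaks ties).
        List<Map.Entry<Integer, Long>> byLag = new ArrayList<>(partitionLags.entrySet());
        byLag.sort((a, b) -> {
            int cmp = Long.compare(b.getValue(), a.getValue());
            return cmp != 0 ? cmp : Integer.compare(a.getKey(), b.getKey());
        });

        for (Map.Entry<Integer, Long> partition : byLag) {
            // Pick the consumer with the fewest partitions, then the least total lag.
            String target = null;
            for (String consumer : consumers) {
                if (target == null) {
                    target = consumer;
                    continue;
                }
                int sizeCmp = Integer.compare(assignment.get(consumer).size(),
                                              assignment.get(target).size());
                if (sizeCmp < 0
                        || (sizeCmp == 0 && totalLag.get(consumer) < totalLag.get(target))) {
                    target = consumer;
                }
            }
            assignment.get(target).add(partition.getKey());
            totalLag.put(target, totalLag.get(target) + partition.getValue());
        }
        return assignment;
    }
}
```

With lags {0: 100, 1: 50, 2: 40, 3: 10} and consumers A and B, this sketch
assigns partitions 0 and 3 to A (total lag 110) and partitions 1 and 2 to B
(total lag 90), so both the partition counts and the lag totals stay close.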



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)