[ 
https://issues.apache.org/jira/browse/KAFKA-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530767#comment-14530767
 ] 

Bryan Baugher commented on KAFKA-2172:
--------------------------------------

I originally asked about this on the mailing list [1], but I'm also having 
trouble with the round-robin partitioning because of its requirements. As in 
the original report, it makes deployments difficult: when we change our topic 
subscriptions, the consumer group stops consuming messages. In our case, our 
consumers build their topic subscriptions from config they retrieve 
regularly from a REST service. Every consumer should have the same topic 
subscription, except when the config changes and there is some lag before all 
consumers retrieve the new config.

Would you be open to a patch that provides another assignor which takes a 
simpler approach: assign each partition to whichever consumer interested in 
that topic currently has the fewest partitions assigned? This would not give 
an optimal assignment when topic subscriptions are not equal, but it should 
generally do fine, and it should come up with the same answer as the 
round-robin assignor when they are.
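
A minimal sketch of the assignor idea described above, to make the proposal 
concrete. The class name LeastLoadedAssignor, the method signature, and the 
"topic-partition" string form are hypothetical and are not part of Kafka's 
API. It assumes ties are broken by consumer id order and that topics are 
visited in sorted order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch only; not an existing Kafka class. */
final class LeastLoadedAssignor {

    /**
     * @param subscriptions      consumer id -> topics that consumer subscribes to
     * @param partitionsPerTopic topic -> number of partitions
     * @return consumer id -> assigned partitions, written as "topic-partition"
     */
    static Map<String, List<String>> assign(Map<String, Set<String>> subscriptions,
                                            Map<String, Integer> partitionsPerTopic) {
        // Sorted map keeps consumer iteration order deterministic.
        Map<String, List<String>> assignment = new TreeMap<>();
        for (String consumer : subscriptions.keySet()) {
            assignment.put(consumer, new ArrayList<>());
        }

        // Visit topics in sorted order so every caller computes the same result.
        for (Map.Entry<String, Integer> topic : new TreeMap<>(partitionsPerTopic).entrySet()) {
            for (int partition = 0; partition < topic.getValue(); partition++) {
                String leastLoaded = null;
                // Strict '<' keeps the first consumer (in id order) on ties.
                for (String consumer : assignment.keySet()) {
                    if (!subscriptions.get(consumer).contains(topic.getKey())) {
                        continue; // this consumer is not interested in the topic
                    }
                    if (leastLoaded == null
                            || assignment.get(consumer).size() < assignment.get(leastLoaded).size()) {
                        leastLoaded = consumer;
                    }
                }
                // A partition that no consumer subscribes to stays unassigned.
                if (leastLoaded != null) {
                    assignment.get(leastLoaded).add(topic.getKey() + "-" + partition);
                }
            }
        }
        return assignment;
    }
}
```

With two consumers subscribed to the same single topic of four partitions, 
this sketch alternates between them. With c1 subscribed to topics a and b and 
c2 subscribed only to a (two partitions each), c1 ends up with three 
partitions and c2 with one, which illustrates the non-optimal case mentioned 
above.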

[1] - 
http://mail-archives.apache.org/mod_mbox/kafka-users/201505.mbox/%3CCANZ-JHE6TRf%2BHdT-%3DK9AKFVXasLjg445cmcRVEBi5tG93XTNqA%40mail.gmail.com%3E

> Round-robin partition assignment strategy too restrictive
> ---------------------------------------------------------
>
>                 Key: KAFKA-2172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2172
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Rosenberg
>
> The round-robin partition assignment strategy was introduced for the 
> high-level consumer, starting with 0.8.2.1.  This appears to be a very 
> attractive feature, but it has an unfortunate restriction which prevents it 
> from being easily utilized: it requires that all consumers in the 
> consumer group have identical topic regex selectors and the 
> same number of consumer threads.
> It turns out this is not always the case for our deployments.  It's not 
> unusual to run multiple consumers within a single process (with different 
> topic selectors), or we might have multiple processes dedicated to different 
> topic subsets.  Agreed, we could change these to have separate group ids for 
> each sub topic selector (but unfortunately, that's easier said than done).  
> In several cases, we do at least have separate client.ids set for each 
> sub-consumer, so it would be incrementally better if we could at least loosen 
> the requirement such that the set of topics selected by each groupid/clientid 
> pair is the same.
> But, if we want to do a rolling restart for a new version of a consumer 
> config, the cluster will likely be in a state where it's not possible to have 
> a single config until the full rolling restart completes across all nodes.  
> This results in a consumer outage while the rolling restart is happening.
> Finally, it's especially problematic if we want to canary a new version for a 
> period before rolling to the whole cluster.
> I'm not sure why this restriction should exist (as it obviously does not 
> exist for the 'range' assignment strategy).  It seems it could be made to 
> work reasonably well with heterogeneous topic selection and heterogeneous 
> thread counts.  The documentation states that "The round-robin partition 
> assignor lays out all the available partitions and all the available consumer 
> threads. It then proceeds to do a round-robin assignment from partition to 
> consumer thread."
> If the assignor can "lay out all the available partitions and all the 
> available consumer threads", it should be able to uniformly assign partitions 
> to the available threads.  In each case, if a thread belongs to a consumer 
> that doesn't have that partition selected, just move to the next available 
> thread that does have the selection, etc.
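
The skip-ahead variant suggested in the description above can be sketched as 
follows. This is an illustration under stated assumptions, not the existing 
Kafka assignor: the class name SkippingRoundRobin, its signature, and the 
"topic-partition" string form are hypothetical, and threads and topics are 
assumed to be laid out in sorted order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch only; not an existing Kafka class. */
final class SkippingRoundRobin {

    /**
     * @param threadTopics       consumer thread id -> topics selected by its consumer
     * @param partitionsPerTopic topic -> number of partitions
     * @return consumer thread id -> assigned partitions, written as "topic-partition"
     */
    static Map<String, List<String>> assign(Map<String, Set<String>> threadTopics,
                                            Map<String, Integer> partitionsPerTopic) {
        // Lay out all threads in a fixed (sorted) order.
        List<String> threads = new ArrayList<>(new TreeSet<>(threadTopics.keySet()));
        Map<String, List<String>> assignment = new TreeMap<>();
        for (String thread : threads) {
            assignment.put(thread, new ArrayList<>());
        }

        int cursor = 0; // index of the next thread in the round-robin rotation
        for (Map.Entry<String, Integer> topic : new TreeMap<>(partitionsPerTopic).entrySet()) {
            for (int partition = 0; partition < topic.getValue(); partition++) {
                // Try each thread at most once, starting at the cursor.
                for (int step = 0; step < threads.size(); step++) {
                    int index = (cursor + step) % threads.size();
                    String thread = threads.get(index);
                    if (threadTopics.get(thread).contains(topic.getKey())) {
                        assignment.get(thread).add(topic.getKey() + "-" + partition);
                        cursor = (index + 1) % threads.size(); // resume after this thread
                        break;
                    }
                    // Otherwise skip this thread: it has not selected the topic.
                }
            }
        }
        return assignment;
    }
}
```

A partition whose topic no thread has selected is left unassigned in this 
sketch, and a thread that is skipped keeps its place in the rotation for the 
next partition it is eligible for.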



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
