[ 
https://issues.apache.org/jira/browse/KAFKA-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533651#comment-14533651
 ] 

Joel Koshy commented on KAFKA-2172:
-----------------------------------

The restriction really comes down to keeping the assignment code simple; it is 
definitely possible to remove it. The main motivation for round-robin was 
heavy consumers such as the mirror-maker. For these consumers it is (usually, 
but not always) less of an issue to take down all instances, reconfigure, and 
bring them back up if you want to change subscriptions. I agree, though, that 
this is too restrictive in practice.

[~bbaugher] Sure, we can consider alternate assignment algorithms. We don't need 
an optimal solution; in fact, optimal can be very complicated and very 
subjective. There are some obvious nice-to-haves though. E.g., all partitions of a 
topic should ideally not go to the same consumer instance if there are other 
instances willing to read from that topic.

It may be useful to come up with one or more approaches and do some simulations 
(with different assignments, consumer counts, partition counts, etc.) and see 
how well those approaches perform.
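
As a starting point for such simulations, here is a minimal Python sketch of 
one candidate approach: the "skip to the next subscribed thread" variant of 
round-robin suggested in the issue description. It is not the actual Kafka 
assignor. The function names (round_robin_assign, load_spread), the thread 
ids and the toy topics are all hypothetical and exist only for illustration.

```python
def round_robin_assign(partitions, subscriptions):
    """Round-robin with skipping (illustrative sketch, not Kafka's code).

    partitions:    iterable of (topic, partition) tuples
    subscriptions: dict mapping thread id -> set of subscribed topics
    Returns a dict mapping thread id -> list of (topic, partition).
    """
    threads = sorted(subscriptions)          # deterministic thread order
    assignment = {t: [] for t in threads}
    if not threads:
        return assignment
    cursor = 0                               # next thread to try
    for topic, partition in sorted(partitions):
        # Try every thread at most once, starting at the cursor, and
        # give the partition to the first thread subscribed to the topic.
        for step in range(len(threads)):
            idx = (cursor + step) % len(threads)
            thread = threads[idx]
            if topic in subscriptions[thread]:
                assignment[thread].append((topic, partition))
                cursor = idx + 1
                break
        # If no thread subscribes to the topic, the partition is left
        # unassigned in this sketch.
    return assignment


def load_spread(assignment):
    """Difference between the heaviest and lightest thread load."""
    loads = [len(parts) for parts in assignment.values()]
    return max(loads) - min(loads) if loads else 0


# Toy input (assumed): two topics with three partitions each, and three
# single-threaded consumers with different subscriptions.
partitions = [(t, p) for t in ("a", "b") for p in range(3)]
subscriptions = {
    "c1-0": {"a", "b"},
    "c2-0": {"a"},
    "c3-0": {"b"},
}

assignment = round_robin_assign(partitions, subscriptions)
loads = {thread: len(parts) for thread, parts in assignment.items()}
print(loads, "spread:", load_spread(assignment))
```

On this toy input the loads come out as 3, 1 and 2 partitions, so simple 
skipping does not balance perfectly once subscriptions differ. That is the 
kind of behavior a broader simulation over varied consumer and partition 
counts could quantify.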


> Round-robin partition assignment strategy too restrictive
> ---------------------------------------------------------
>
>                 Key: KAFKA-2172
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2172
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Rosenberg
>
> The round-robin partition assignment strategy was introduced for the 
> high-level consumer, starting with 0.8.2.1.  This appears to be a very 
> attractive feature, but it has an unfortunate restriction, which prevents it 
> from being easily utilized.  Namely, it requires that all consumers in the 
> consumer group have identical topic regex selectors and the same number of 
> consumer threads.
> It turns out this is not always the case for our deployments.  It's not 
> unusual to run multiple consumers within a single process (with different 
> topic selectors), or we might have multiple processes dedicated for different 
> topic subsets.  Agreed, we could change these to have separate group ids for 
> each sub topic selector (but unfortunately, that's easier said than done).  
> In several cases, we do at least have separate client.ids set for each 
> sub-consumer, so it would be incrementally better if we could at least loosen 
> the requirement such that each set of topics selected by a groupid/clientid 
> pair are the same.
> But, if we want to do a rolling restart for a new version of a consumer 
> config, the cluster will likely be in a state where it's not possible to have 
> a single config until the full rolling restart completes across all nodes.  
> This results in a consumer outage while the rolling restart is happening.
> Finally, it's especially problematic if we want to canary a new version for a 
> period before rolling to the whole cluster.
> I'm not sure why this restriction should exist (as it obviously does not 
> exist for the 'range' assignment strategy).  It seems it could be made to 
> work reasonably well with heterogeneous topic selection and heterogeneous 
> thread counts.  The documentation states that "The round-robin partition 
> assignor lays out all the available partitions and all the available consumer 
> threads. It then proceeds to do a round-robin assignment from partition to 
> consumer thread."
> If the assignor can "lay out all the available partitions and all the 
> available consumer threads", it should be able to uniformly assign partitions 
> to the available threads.  In each case, if a thread belongs to a consumer 
> that doesn't have that partition selected, just move to the next available 
> thread that does have the selection, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
