Re: [DISCUSS] KIP-54 Sticky Partition Assignment Strategy

Vahid S Hashemian Thu, 23 Jun 2016 15:50:18 -0700

Hi Jason,

I appreciate your feedback.
Please see my comments below, and advise if you have further suggestions. 
Thanks.
 
Regards,
--Vahid

From:   Jason Gustafson <ja...@confluent.io>
To:     dev@kafka.apache.org
Date:   06/22/2016 04:41 PM
Subject:        Re: [DISCUSS] KIP-54 Sticky Partition Assignment Strategy

Hey Vahid,

Thanks for the updates. I think the lack of comments on this KIP suggests
that the motivation might need a little work. Here are the two main
benefits of this assignor as I see them:

1. It can give a more balanced assignment when subscriptions do not match
in a group (this is the same problem solved by KIP-49).
2. It potentially spares applications the need to clean up partition
state when rebalancing, since partitions are more likely to stay assigned
to the same consumer.

Does that seem right to you?

Yes, it does. You summarized it nicely. #1 is an advantage of this 
strategy compared to the existing round robin and fair strategies.

I think it's unclear how serious the first problem is. Providing better
balance when subscriptions differ is nice, but are rolling updates the only
scenario where this is encountered? Or are there more general use cases
where differing subscriptions could persist for a longer duration? I'm also
wondering if this assignor addresses the problem found in KAFKA-2019. It
would be useful to confirm whether this problem still exists with the new
consumer's round robin strategy and how (whether?) it is addressed by this
assignor.

I'm not very clear on the first part of this paragraph; could you clarify 
it for me? In general, though, balancing the partitions across the 
consumers in a group as evenly as possible would normally mean balancing 
the load within the cluster, and that is something a user would want, 
compared to cases where the assignments, and therefore the load, could be 
quite unbalanced depending on the subscriptions. Having an optimal balance 
is definitely more reassuring than knowing partition assignments could get 
quite unbalanced. There is an example in the KIP that walks through a 
simple use case that leads to an unbalanced assignment with the round 
robin assignor. This imbalance could become much more severe in real use 
cases with many more topics / partitions / consumers, and that's ideally 
something we would want to avoid, if possible.

Regarding KAFKA-2019, when I try the simple use case of 
https://issues.apache.org/jira/browse/KAFKA-2019?focusedCommentId=14360892&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14360892

each of my consumers gets 3 partitions, which is not the same as what is 
mentioned in the comment. I might be missing something in the 
configuration (other than setting the strategy to 'roundrobin' and fetcher 
threads to '2'), or the issue may have been resolved already by some other 
patch. In any case, based on what I read in the JIRA, the issue stems from 
the multiple threads that each consumer may have, and from how the threads 
of each consumer are assigned partitions first, before partitions are 
assigned to other consumers' threads.

Since the new consumer is single-threaded, there is no such problem in its 
round robin strategy. It simply considers consumers one by one for each 
partition assignment, and when one consumer is assigned a partition, the 
next assignment starts with considering the next consumer in the list (and 
not the same consumer that was just assigned). This removes the 
possibility of the issue reported in KAFKA-2019 surfacing in the new 
consumer. In the sticky strategy we do not have this issue either, since 
every time an assignment is about to happen we start with the consumer 
with the fewest assignments. So we will not have a scenario where a 
consumer is repeatedly assigned partitions as in KAFKA-2019 (unless that 
consumer is lagging behind other consumers in the number of partitions 
assigned).
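
To make the two behaviours above concrete, here is a minimal, 
self-contained sketch. It is not the code of the existing round robin 
assignor or of the KIP's implementation. The class and method names, the 
"topic-partition" strings, and the three-consumer group are all made up 
for illustration. The step in leastLoadedFirst that handles partitions 
with the fewest eligible consumers first is my own assumption, added so 
that the greedy choice has room to balance; the KIP describes the actual 
algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative only: hypothetical names, plain strings instead of Kafka types.
public class AssignmentSketch {

    // Topic name of a "topic-partition" string such as "t1-0".
    static String topicOf(String partition) {
        return partition.substring(0, partition.lastIndexOf('-'));
    }

    // One empty partition list per consumer, keyed in sorted consumer order.
    static Map<String, List<String>> emptyAssignment(Map<String, Set<String>> subscriptions) {
        Map<String, List<String>> assignment = new TreeMap<>();
        for (String consumer : subscriptions.keySet()) {
            assignment.put(consumer, new ArrayList<>());
        }
        return assignment;
    }

    // Round robin as described above: consider consumers one by one, and start
    // the next search at the consumer after the one that was just assigned.
    static Map<String, List<String>> roundRobin(Map<String, Set<String>> subscriptions,
                                                List<String> partitions) {
        Map<String, List<String>> assignment = emptyAssignment(subscriptions);
        List<String> consumers = new ArrayList<>(assignment.keySet());
        int next = 0;
        for (String partition : partitions) {
            for (int i = 0; i < consumers.size(); i++) {
                int index = (next + i) % consumers.size();
                String candidate = consumers.get(index);
                if (subscriptions.get(candidate).contains(topicOf(partition))) {
                    assignment.get(candidate).add(partition);
                    next = (index + 1) % consumers.size();
                    break;
                }
            }
        }
        return assignment;
    }

    // Each partition goes to the subscribed consumer with the fewest assignments.
    // Assumption (not from the KIP): most constrained partitions are placed first.
    static Map<String, List<String>> leastLoadedFirst(Map<String, Set<String>> subscriptions,
                                                      List<String> partitions) {
        Map<String, List<String>> assignment = emptyAssignment(subscriptions);
        List<String> ordered = new ArrayList<>(partitions);
        ordered.sort(Comparator.comparingLong((String p) -> subscriptions.values().stream()
                .filter(topics -> topics.contains(topicOf(p))).count()));
        for (String partition : ordered) {
            String best = null;
            for (String consumer : assignment.keySet()) {
                if (!subscriptions.get(consumer).contains(topicOf(partition))) {
                    continue;
                }
                if (best == null
                        || assignment.get(consumer).size() < assignment.get(best).size()) {
                    best = consumer;
                }
            }
            if (best != null) {
                assignment.get(best).add(partition);
            }
        }
        return assignment;
    }

    // Hypothetical group: C0 -> {t0}, C1 -> {t0, t1}, C2 -> {t0, t1, t2}.
    static Map<String, Set<String>> exampleSubscriptions() {
        Map<String, Set<String>> subscriptions = new HashMap<>();
        subscriptions.put("C0", new HashSet<>(Arrays.asList("t0")));
        subscriptions.put("C1", new HashSet<>(Arrays.asList("t0", "t1")));
        subscriptions.put("C2", new HashSet<>(Arrays.asList("t0", "t1", "t2")));
        return subscriptions;
    }

    // Hypothetical topics: t0 has 1 partition, t1 has 2, t2 has 3.
    static List<String> examplePartitions() {
        return Arrays.asList("t0-0", "t1-0", "t1-1", "t2-0", "t2-1", "t2-2");
    }

    public static void main(String[] args) {
        System.out.println(roundRobin(exampleSubscriptions(), examplePartitions()));
        System.out.println(leastLoadedFirst(exampleSubscriptions(), examplePartitions()));
    }
}
```

With these made-up subscriptions, the round robin sketch leaves C2 with 
four partitions and C0 and C1 with one each, while the least-loaded sketch 
ends up with one, two and three partitions respectively. In neither case 
is a consumer handed several partitions in a row while an eligible peer 
with fewer partitions is skipped.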

The major selling point seems to be the second point. This is definitely
nice to have, but would you expect a lot of value in practice since
consumer groups are usually assumed to be stable? It might help to describe
some specific use cases to help motivate the proposal. One of the downsides
is that it requires users to restructure their code to get any benefit from
it. In particular, they need to move partition cleanup out of the
onPartitionsRevoked() callback and into onPartitionsAssigned(). This is a
little awkward and will probably make explaining the consumer more
difficult. It's probably worth including a discussion of this point in the
proposal with an example.

Even though consumer groups are usually stable, it might be the case that 
consumers do not initially join the group at the same time. The sticky 
strategy in that situation lets those who joined earlier stick to their 
partitions to some extent (assuming fairness takes precedence over 
stickiness). In terms of specific use cases, Andrew touched on examples of 
how Kafka can benefit from a sticky assignor. I could add those to the KIP 
if you also think they help build the case in favor of the sticky 
assignor. I agree with you about the downside and I'll make sure I add 
that to the KIP as you suggested.
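
As a rough idea of what that restructuring could look like, here is a 
minimal sketch. To keep it self-contained it does not use the Kafka client 
library: the class below only mirrors the two callbacks of 
ConsumerRebalanceListener, with plain strings standing in for 
TopicPartition, and the owned / cleanedUp / initialized fields are 
hypothetical bookkeeping. It assumes onPartitionsAssigned() receives the 
consumer's full new assignment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative stand-in for a ConsumerRebalanceListener implementation.
public class StickyCleanupSketch {

    // Partitions for which this consumer currently holds local state.
    final Set<String> owned = new TreeSet<>();
    // Bookkeeping for the example: what was cleaned up and what was set up.
    final List<String> cleanedUp = new ArrayList<>();
    final List<String> initialized = new ArrayList<>();

    // Before the rebalance: commit offsets or flush if needed, but keep the
    // local state, because the same partition may be handed straight back.
    public void onPartitionsRevoked(Collection<String> revoked) {
        // Intentionally no state cleanup here.
    }

    // After the rebalance: clean up only what was actually lost, and
    // initialize only what is actually new.
    public void onPartitionsAssigned(Collection<String> assigned) {
        Set<String> lost = new TreeSet<>(owned);
        lost.removeAll(assigned);
        for (String partition : lost) {
            cleanedUp.add(partition);   // release state for a partition we no longer own
            owned.remove(partition);
        }
        for (String partition : new TreeSet<>(assigned)) {
            if (owned.add(partition)) { // true only for newly acquired partitions
                initialized.add(partition);
            }
        }
    }

    public static void main(String[] args) {
        StickyCleanupSketch listener = new StickyCleanupSketch();
        listener.onPartitionsAssigned(Arrays.asList("t0-0", "t0-1"));
        // A rebalance in which t0-1 sticks and t0-0 is swapped for t0-2.
        listener.onPartitionsRevoked(new ArrayList<>(listener.owned));
        listener.onPartitionsAssigned(Arrays.asList("t0-1", "t0-2"));
        System.out.println("cleaned up:  " + listener.cleanedUp);
        System.out.println("initialized: " + listener.initialized);
    }
}
```

In this run only t0-0 is cleaned up and t0-1 keeps its state across the 
rebalance, which is the benefit the sticky strategy is meant to enable. 
With cleanup left in onPartitionsRevoked(), the state for t0-1 would have 
been discarded and rebuilt even though the partition never moved.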

Thanks,
Jason

On Tue, Jun 7, 2016 at 4:05 PM, Vahid S Hashemian <vahidhashem...@us.ibm.com> wrote:

> Hi Jason,
>
> I updated the KIP and added some details about the user data, the
> assignment algorithm, and the alternative strategies to consider.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-54+-+Sticky+Partition+Assignment+Strategy
>
> Please let me know if I missed anything. Thank you.
>
> Regards,
> --Vahid
>
>
>
