Just adding a related reference here: Henry Cai is contributing an advanced feature to Kafka Streams regarding static assignment: https://github.com/apache/kafka/pull/1543
The main motivation is that when you do a rolling bounce to upgrade your Kafka Streams code, for example, you would prefer not to move the assigned partitions of the currently bouncing instance to other instances; today this is worked around by increasing the session.timeout. What is more tricky is that when the bouncing instance comes back, it will still trigger a rebalance. The idea is that as long as we can encode the previous generation's assignment map, we can check whether the list of partitions / members has changed with respect to their previously assigned partitions, and if not, keep the assignment as is.

Guozhang

On Thu, Jun 23, 2016 at 10:24 AM, Andrew Coates <big.andy.coa...@gmail.com> wrote:

> Hey Jason,
>
> Good to know on the round robin assignment. I'll look into that.
>
> The issue I have with the current rebalance listener is that it's not intuitive and unnecessarily exposes the inner workings of the rebalance logic. When the onPartitionsRevoked method is called it's not really saying the partitions were revoked. It's really saying a rebalance is happening and you need to deal with any in-flight partitions & commit offsets. So maybe the method name is wrong! Maybe it should be 'onRebalance' or 'commitOffsets'..? Then the interface could also have an onPartitionsRevoked method that is only called when partitions have been revoked and given to someone else to handle, rather than just kind of paused while we rebalance... maybe the new method could be onPausePartitions?
>
> Andy
>
> On Thu, 23 Jun 2016, 18:06 Jason Gustafson, <ja...@confluent.io> wrote:
>
> > Hey Andy,
> >
> > Thanks for jumping in. A couple comments:
> >
> > > In addition, I think it is important that during a rebalance consumers do not first have all partitions revoked, only to have a very similar, (or the same!), set reassigned. This is less than intuitive and complicates client code unnecessarily. Instead, the `ConsumerPartitionListener` should only be called for true changes in assignment, i.e. any new partitions assigned and any existing ones revoked, when comparing the new assignment to the previous one.
> >
> > The problem is that the revocation callback is called before you know what the assignment for the next generation will be. This is necessary for the consumer to be able to commit offsets for its assigned partitions. Once the consumer has a new assignment, it is no longer safe to commit offsets from the previous generation. Unless sticky assignment can give us some guarantee on which partitions will remain after the rebalance, all of them must be included in the revocation callback.
> >
> > > There is one last scenario I'd like to highlight that I think the KIP should describe: say you have a group consuming from two topics, each topic with two partitions. As of 0.9.0.1 the maximum number of consumers you can have is 2, not 4. With 2 consumers each will get one partition from each topic. A third consumer will not have any partitions assigned. This should be fixed by the 'fair' part of the strategy, but it would be good to see this covered explicitly in the KIP.
> >
> > This would be true for range assignment, but with 4 partitions total, round-robin assignment would give one partition to each of the 4 consumers (assuming subscriptions match).
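For reference, a minimal sketch of the commit-on-revoke pattern discussed above, assuming the Java consumer API (0.9+); the class name and key/value types are hypothetical:

    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public class CommitOnRevokeListener implements ConsumerRebalanceListener {

        private final KafkaConsumer<String, String> consumer;

        public CommitOnRevokeListener(KafkaConsumer<String, String> consumer) {
            this.consumer = consumer;
        }

        @Override
        public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
            // Called before the next generation's assignment is known, so it
            // receives the full current assignment; commit offsets here while
            // it is still safe to do so.
            Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
            for (TopicPartition tp : partitions) {
                offsets.put(tp, new OffsetAndMetadata(consumer.position(tp)));
            }
            consumer.commitSync(offsets);
        }

        @Override
        public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
            // The assignment for the new generation; it may overlap heavily
            // with the previous one under a sticky assignor.
        }
    }

Because onPartitionsRevoked() receives the full current assignment, committing there is safe regardless of which partitions survive the rebalance.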
> >
> > Thanks,
> > Jason
> >
> > On Thu, Jun 23, 2016 at 1:42 AM, Andrew Coates <big.andy.coa...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I think sticky assignment is immensely important / useful in many situations. Apps that use Kafka are many and varied. Any app that stores any state, either in the form of data from incoming messages, cached results from previous out-of-process calls or expensive operations, (and let's face it, that's most!), can see a big negative impact from partition movement.
> > >
> > > The main issue partition movement brings is that it makes building elastic services very hard. Consider: you've got an app consuming from Kafka that locally caches data to improve performance. You want the app to auto scale as the throughput to the topic(s) increases. Currently, when one or more new instances are added and the group rebalances, all existing instances have all partitions revoked, and then a new, potentially quite different, set assigned. An intuitive pattern is to evict partition state, i.e. the cached data, when a partition is revoked. So in this case all apps flush their entire cache, causing throughput to drop massively, right when you want to increase it!
> > >
> > > Even if the app is not flushing partition state when partitions are revoked, the lack of a 'sticky' strategy means that a proportion of the cached state is now useless, and instances have partitions assigned for which they have no cached state, again negatively impacting throughput.
> > >
> > > With a 'sticky' strategy throughput can be maintained and indeed increased, as intended.
> > >
> > > The same is also true in the presence of failure. An instance failing, (maybe due to high load), can invalidate the caching of existing instances, negatively impacting throughput of the remaining instances, (possibly at a time the system needs throughput the most!)
> > >
> > > My question would be 'why move partitions if you don't have to?'. I will certainly be setting the 'sticky' assignment strategy as the default once it's released, and I have a feeling it will become the default in the community's 'best-practice' guides.
> > >
> > > In addition, I think it is important that during a rebalance consumers do not first have all partitions revoked, only to have a very similar, (or the same!), set reassigned. This is less than intuitive and complicates client code unnecessarily. Instead, the `ConsumerPartitionListener` should only be called for true changes in assignment, i.e. any new partitions assigned and any existing ones revoked, when comparing the new assignment to the previous one.
> > >
> > > I think the change to how the client listener is called should be part of this work.
> > >
> > > There is one last scenario I'd like to highlight that I think the KIP should describe: say you have a group consuming from two topics, each topic with two partitions. As of 0.9.0.1 the maximum number of consumers you can have is 2, not 4. With 2 consumers each will get one partition from each topic. A third consumer will not have any partitions assigned. This should be fixed by the 'fair' part of the strategy, but it would be good to see this covered explicitly in the KIP.
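A minimal sketch of how the two existing assignors play out for the scenario above, expressed as consumer configuration; the broker address, group id, and topic names are hypothetical:

    import java.util.Arrays;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class RoundRobinConfigExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // hypothetical broker address
            props.put("group.id", "two-topic-group");           // hypothetical group id
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            // The default range assignor assigns each topic's partitions independently,
            // so with two topics of two partitions each only two consumers ever receive
            // work. The round-robin assignor spreads the four partitions across up to
            // four consumers with matching subscriptions.
            props.put("partition.assignment.strategy",
                      "org.apache.kafka.clients.consumer.RoundRobinAssignor");

            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
            consumer.subscribe(Arrays.asList("topic-a", "topic-b"));  // hypothetical topic names
        }
    }

Switching partition.assignment.strategy is enough to get the round-robin behaviour Jason mentions; the proposed sticky assignor would presumably be enabled the same way once released.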
> > >
> > > Thanks,
> > >
> > > Andy
> > >
> > > On Thu, 23 Jun 2016, 00:41 Jason Gustafson, <ja...@confluent.io> wrote:
> > >
> > > > Hey Vahid,
> > > >
> > > > Thanks for the updates. I think the lack of comments on this KIP suggests that the motivation might need a little work. Here are the two main benefits of this assignor as I see them:
> > > >
> > > > 1. It can give a more balanced assignment when subscriptions do not match in a group (this is the same problem solved by KIP-49).
> > > > 2. It potentially allows applications to save the need to clean up partition state when rebalancing, since partitions are more likely to stay assigned to the same consumer.
> > > >
> > > > Does that seem right to you?
> > > >
> > > > I think it's unclear how serious the first problem is. Providing better balance when subscriptions differ is nice, but are rolling updates the only scenario where this is encountered? Or are there more general use cases where differing subscriptions could persist for a longer duration? I'm also wondering if this assignor addresses the problem found in KAFKA-2019. It would be useful to confirm whether this problem still exists with the new consumer's round robin strategy and how (whether?) it is addressed by this assignor.
> > > >
> > > > The major selling point seems to be the second point. This is definitely nice to have, but would you expect a lot of value in practice, since consumer groups are usually assumed to be stable? It might help to describe some specific use cases to help motivate the proposal. One of the downsides is that it requires users to restructure their code to get any benefit from it. In particular, they need to move partition cleanup out of the onPartitionsRevoked() callback and into onPartitionsAssigned(). This is a little awkward and will probably make explaining the consumer more difficult. It's probably worth including a discussion of this point in the proposal with an example.
> > > >
> > > > Thanks,
> > > > Jason
> > > >
> > > > On Tue, Jun 7, 2016 at 4:05 PM, Vahid S Hashemian <vahidhashem...@us.ibm.com> wrote:
> > > >
> > > > > Hi Jason,
> > > > >
> > > > > I updated the KIP and added some details about the user data, the assignment algorithm, and the alternative strategies to consider.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-54+-+Sticky+Partition+Assignment+Strategy
> > > > >
> > > > > Please let me know if I missed adding something. Thank you.
> > > > >
> > > > > Regards,
> > > > > --Vahid

--
-- Guozhang
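A minimal sketch of the restructuring Jason describes, with partition cleanup moved out of onPartitionsRevoked() and into onPartitionsAssigned(); the class name and per-partition cache are hypothetical:

    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
    import org.apache.kafka.common.TopicPartition;

    public class StickyCacheListener implements ConsumerRebalanceListener {

        // Hypothetical per-partition cache; stands in for whatever expensive
        // state the application keeps per assigned partition.
        private final Map<TopicPartition, Object> partitionCache = new HashMap<>();

        @Override
        public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
            // With a sticky assignor, do NOT drop state here: most of these
            // partitions are likely to come back to this consumer.
            // (Offsets would still be committed here, as in the earlier sketch.)
        }

        @Override
        public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
            // Clean up only the state for partitions that were not reassigned
            // to this consumer in the new generation.
            partitionCache.keySet().retainAll(partitions);
        }
    }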