Hi Jun

The goal of static membership is to hold off the rebalance when a consumer
drops out of the group, as long as it comes back before the session timeout.
This is a good fit for K8s: when a pod breaks (or during an upgrade),
K8s kills the pod and brings up a new one to replace the old one.
In this case, we don't want the consumer group to rebalance twice (once
when the old pod goes down, and once after the new pod comes up).
We hold the rebalance, and once the new pod is up and we have checked that
everything is good, no rebalance is triggered.
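
To make that concrete, here is a minimal sketch of the consumer settings
involved, using plain string keys so it compiles without the Kafka client on
the classpath. group.instance.id and session.timeout.ms are the actual
consumer config names; the class name, the group id, the 5-minute timeout and
the use of the pod name as the instance id are only my assumptions for
illustration (the pod name has to be stable across restarts, e.g. a
StatefulSet pod):

```java
import java.util.Properties;

public class StaticMembershipConfig {
    // Builds consumer properties for a static member.
    // Assumption: podName stays the same when the pod is replaced.
    public static Properties staticMemberProps(String podName) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "my-group");                // hypothetical group name
        // A non-empty group.instance.id is what makes the consumer a static member.
        props.put("group.instance.id", podName);
        // The rebalance is held for at most this long after the consumer drops.
        // 5 minutes is an illustrative value, not a recommendation.
        props.put("session.timeout.ms", "300000");
        return props;
    }

    public static void main(String[] args) {
        // HOSTNAME is normally the pod name inside a container; fall back for local runs.
        String podName = System.getenv().getOrDefault("HOSTNAME", "consumer-0");
        System.out.println(staticMemberProps(podName));
    }
}
```

The session timeout should be longer than the time it takes to replace a pod,
and the broker-side group.max.session.timeout.ms limits how high it can go.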

So, answering your questions below:

1. The desired behavior is partitions should be balanced eventually (but
then it conflicts with "no rebalance" nature of static membership with the
healthy backbone). Could you point out what I am missing here?
--> Yes, and the root of this issue is the unbalanced starting point: there
should not be any "unbalanced" assignment after a rebalance has completed.

2. What is recommended under this unbalanced partition situation?
  2-a Leaving it unbalanced? (unlikely)
  2-b Do I have to adjust session timeout so I can artificially cause a
rebalance that eventually makes assignment even? (I sense somewhat
cumbersome)
  2-c Is there a special PartitionAssignor implementation we should use
under static membership and the assignor magically guarantees the
assignment even?

--> Again, we expect that there is no "unbalanced" situation after a
rebalance has completed.
If this really happens, please report it in JIRA here:
https://issues.apache.org/jira/projects/KAFKA/issues
Please also try to collect the logs and the current consumer status for us
(if possible, enable DEBUG logging for troubleshooting).
And please let us know which partition.assignment.strategy
<https://kafka.apache.org/documentation/#consumerconfigs_partition.assignment.strategy>
you're using. RoundRobin? CooperativeStickyAssignor?
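
In case it helps to check, this is roughly how the strategy is set on the
consumer, again as a sketch with plain string keys. The two class names are
the assignors shipped with kafka-clients; the wrapper class and method names
are made up for this example:

```java
import java.util.Properties;

public class AssignorConfig {
    // Fully qualified names of two assignors that ship with kafka-clients.
    static final String COOPERATIVE_STICKY =
            "org.apache.kafka.clients.consumer.CooperativeStickyAssignor";
    static final String ROUND_ROBIN =
            "org.apache.kafka.clients.consumer.RoundRobinAssignor";

    // Returns a copy of the given properties with the assignment strategy set.
    public static Properties withAssignor(Properties base, String assignorClass) {
        Properties props = new Properties();
        props.putAll(base);
        props.put("partition.assignment.strategy", assignorClass);
        return props;
    }

    public static void main(String[] args) {
        Properties props = withAssignor(new Properties(), COOPERATIVE_STICKY);
        System.out.println(props.getProperty("partition.assignment.strategy"));
    }
}
```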

I saw you mentioned that you've been suffering from rebalance storms from
time to time.
I think you should first figure out why the rebalances happen so frequently.
Then we will know how to fix it.
Static membership is one way to improve the situation, but as I mentioned
above, it only works for some cases.

If you're interested in reading more about static membership, here's a good
blog post:
https://www.confluent.io/blog/kafka-rebalance-protocol-static-membership/

Thank you.
Luke

On Wed, Jan 12, 2022 at 2:21 PM jun aoki <ja...@apache.org> wrote:

> Hi kafka experts,
>
> My understanding of static membership is that assuming kubernetes, for
> example, can provide a fixed number of healthy pods almost always, so that
> kafka doesn't have to do any rebalancing.
>
> It leads me to think, if the starting point is partition assignment being
> unbalanced (say there are total 10 partitions and 2 pods. 8 partitions are
> assigned one of them and 2 partitions go to the other), it will be
> unbalanced forever because pods are kept healthy by k8s and no rebalancing
> ever occurs. And I don't think it is the desired behavior.
>
> My questions are
> 1. The desired behavior is partitions should be balanced eventually (but
> then it conflicts with "no rebalance" nature of static membership with the
> healthy backbone). Could you point out what I am missing here?
> 2. What is recommended under this unbalanced partition situation?
>   2-a Leaving it unbalanced? (unlikely)
>   2-b Do I have to adjust session timeout so I can artificially cause a
> rebalance that eventually makes assignment even? (I sense somewhat
> cumbersome)
>   2-c Is there a special PartitionAssignor implementation we should use
> under static membership and the assignor magically guarantees the
> assignment even?
>
> We've been suffering from rebalance storms from time to time and static
> membership seems the way to resolve it, but I do want to make sure we know
> how to work around some edge cases like it.
>
> --
> -jun
>
