Sophie Blee-Goldman created KAFKA-8951:
------------------------------------------
Summary: Avoid unnecessary rebalances and downtime for "safe" partitions
Key: KAFKA-8951
URL: https://issues.apache.org/jira/browse/KAFKA-8951
Project: Kafka
Issue Type: Improvement
Components: clients, streams
Reporter: Sophie Blee-Goldman

With cooperative rebalancing, any partition that is encoded in one consumer's Subscription cannot be re-assigned to a different consumer during that rebalance. The partition must first be removed from the assignment and revoked by its old owner; only then is a second rebalance triggered, during which it can be assigned to its new owner. This enforces a synchronization barrier so that no two consumers can ever own the same partition at the same time.

This leads to downtime for that partition plus a second rebalance, which may not always be necessary. In Streams, for example, the consumer will pause all partitions of an active task until the task is running (i.e. has been initialized and restored). It should be safe to give these partitions away, provided they are not resumed between sending the joinGroup request and receiving the syncGroup response.

One proposal would be to modify two methods in the ConsumerPartitionAssignor interface.

1) ConsumerPartitionAssignor#subscriptionUserData would be passed the set of `ownedPartitions` that will be included in the subscription, allowing it to remove any that it knows are safe to give away.

2) ConsumerPartitionAssignor#onAssignment would be passed the set of revoked partitions, allowing it to remove any that it knows were already reassigned and should not trigger another rebalance.
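
The two proposed hooks can be pictured with a small, self-contained Java sketch. It does not use the real ConsumerPartitionAssignor signatures: the class name, the method names `ownedPartitionsToEncode` and `revokedNeedingRebalance`, and the use of plain strings for partitions are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch only: none of these names are part of the Kafka API.
// Partitions are modelled as plain strings such as "input-0".
final class SafePartitionSketch {

    // Stand-in for the proposed change to subscriptionUserData: given the
    // owned partitions about to be encoded in the Subscription, drop the ones
    // that are still paused (their task is not running yet) and are therefore
    // considered safe to give away. This is only safe if those partitions are
    // not resumed before the syncGroup response arrives.
    static Set<String> ownedPartitionsToEncode(Set<String> ownedPartitions,
                                               Set<String> pausedPartitions) {
        Set<String> encoded = new LinkedHashSet<>(ownedPartitions);
        encoded.removeAll(pausedPartitions);
        return encoded;
    }

    // Stand-in for the proposed change to onAssignment: given the revoked
    // partitions, drop the ones this member already gave away (so they could
    // be reassigned in the same rebalance). Whatever remains still needs the
    // follow-up rebalance.
    static Set<String> revokedNeedingRebalance(Set<String> revokedPartitions,
                                               Set<String> alreadyGivenAway) {
        Set<String> remaining = new LinkedHashSet<>(revokedPartitions);
        remaining.removeAll(alreadyGivenAway);
        return remaining;
    }

    public static void main(String[] args) {
        Set<String> owned = new LinkedHashSet<>(Arrays.asList("input-0", "input-1", "input-2"));
        Set<String> paused = new LinkedHashSet<>(Arrays.asList("input-2"));

        // Partitions that would still be claimed in the Subscription.
        System.out.println(ownedPartitionsToEncode(owned, paused));

        // Both partitions are revoked, but only "input-1" was never given away.
        Set<String> revoked = new LinkedHashSet<>(Arrays.asList("input-1", "input-2"));
        System.out.println(revokedNeedingRebalance(revoked, paused));
    }
}
```

In this sketch an empty result from `revokedNeedingRebalance` would correspond to the case where the second rebalance can be skipped altogether.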