Jason Gustafson created KAFKA-5586:
--------------------------------------

             Summary: Handle client disconnects during JoinGroup
                 Key: KAFKA-5586
                 URL: https://issues.apache.org/jira/browse/KAFKA-5586
             Project: Kafka
          Issue Type: Bug
            Reporter: Jason Gustafson


If a consumer disconnects while a JoinGroup request is in flight, we do not
remove it from the group until after the join phase completes. If the client
immediately re-sends the JoinGroup request and it already had a memberId, then
the pending callback is replaced and no harm is done. For the other cases:

1. If the client disconnected due to a failure and does not re-send the
JoinGroup, the consumer will still be included in the new group generation
after the rebalance completes, but will immediately time out and trigger a new
rebalance.
2. If the consumer was not a member of the group and re-sends JoinGroup, then a
new memberId will be created for that consumer and the old one will not be
removed. When the rebalance completes, the old memberId will time out and
another rebalance will be triggered.
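
To make case 2 concrete, here is a minimal toy model in Python. It is not the coordinator's actual code: the `ToyGroup` class, its method names, and the `member-N` id format are all hypothetical, and session expiration is triggered by hand rather than by a timer.

```python
import itertools


class ToyGroup:
    """Toy model of a group during the join phase (hypothetical, not Kafka code)."""

    def __init__(self):
        self.members = {}      # memberId -> connection holding the pending JoinGroup
        self.rebalances = 0
        self._ids = itertools.count(1)

    def join_group(self, member_id, connection):
        # An empty memberId means the consumer is new, so an id is generated.
        if not member_id:
            member_id = f"member-{next(self._ids)}"
        # For a known memberId this just replaces the pending callback.
        self.members[member_id] = connection
        return member_id

    def complete_join(self):
        # The new generation contains every memberId still registered.
        self.rebalances += 1
        return sorted(self.members)

    def expire(self, member_id):
        # Session timeout: dropping a member forces another rebalance.
        del self.members[member_id]
        return self.complete_join()


group = ToyGroup()
first = group.join_group("", connection="conn-a")    # new consumer, id generated
# conn-a drops while the JoinGroup is in flight; the group is never told.
second = group.join_group("", connection="conn-b")   # the retry never saw the first id

stale_generation = group.complete_join()    # still contains the orphaned first id
final_generation = group.expire(first)      # the orphan times out: second rebalance
```

In this model the single consumer costs the group two rebalances, because the orphaned id is only cleaned up by its session timeout.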

To address these issues, we should add some additional logic to handle client 
disconnections during the join phase. For newly generated memberIds, we should 
simply remove them. For existing members, we should probably leave them in the 
group and reset the heartbeat expiration task.
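
The proposed handling could be sketched roughly as follows. This is only an illustration of the two branches described above: the `Member` fields, the `on_disconnect_during_join` function, and the millisecond clock arguments are assumptions made for the example, not existing coordinator APIs.

```python
from dataclasses import dataclass


@dataclass
class Member:
    member_id: str
    newly_generated: bool        # id was created by the in-flight JoinGroup
    heartbeat_deadline_ms: int   # stands in for the heartbeat expiration task


def on_disconnect_during_join(members, member_id, now_ms, session_timeout_ms):
    """Apply the proposed rule for a client that disconnects mid-join."""
    member = members.get(member_id)
    if member is None:
        return "ignored"
    if member.newly_generated:
        # Nobody else knows this id yet, so it can simply be removed.
        del members[member_id]
        return "removed"
    # Existing member: keep it in the group and restart its session clock.
    member.heartbeat_deadline_ms = now_ms + session_timeout_ms
    return "heartbeat_reset"


example_members = {
    "fresh": Member("fresh", newly_generated=True, heartbeat_deadline_ms=0),
    "known": Member("known", newly_generated=False, heartbeat_deadline_ms=500),
}
fresh_result = on_disconnect_during_join(example_members, "fresh", 1_000, 30_000)
known_result = on_disconnect_during_join(example_members, "known", 1_000, 30_000)
```

Resetting the deadline for an existing member gives the client a full session timeout to reconnect and re-send its JoinGroup before it is expired.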

Note that we currently have no facility to expose disconnects from the network 
layer to the other layers, so we need to find a good approach for this.
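
One possible shape for such a facility is a listener that the network layer invokes when a connection closes. The sketch below is purely hypothetical: `DisconnectNotifier`, `PendingJoins`, and the connection id strings are invented for illustration and do not correspond to anything in the current code base.

```python
class DisconnectNotifier:
    """Hypothetical hook owned by the network layer."""

    def __init__(self):
        self._listeners = []

    def register(self, listener):
        self._listeners.append(listener)

    def connection_closed(self, connection_id):
        # Called by the network layer when it detects a closed connection.
        for listener in self._listeners:
            listener(connection_id)


class PendingJoins:
    """Hypothetical coordinator-side map from connection to waiting memberId."""

    def __init__(self):
        self._by_connection = {}
        self.disconnected_members = []

    def track(self, connection_id, member_id):
        self._by_connection[connection_id] = member_id

    def on_disconnect(self, connection_id):
        # Only connections with a JoinGroup in flight are of interest.
        member_id = self._by_connection.pop(connection_id, None)
        if member_id is not None:
            self.disconnected_members.append(member_id)


notifier = DisconnectNotifier()
pending = PendingJoins()
notifier.register(pending.on_disconnect)

pending.track("conn-1", "member-a")
notifier.connection_closed("conn-1")   # has a pending join, so it is recorded
notifier.connection_closed("conn-2")   # unknown connection, so it is ignored
```

A callback keeps the network layer unaware of group semantics; it only reports which connection closed, and the coordinator decides what that means.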



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
