[jira] [Commented] (KAFKA-2017) Persist Coordinator State for Coordinator Failover

Guozhang Wang (JIRA) Tue, 13 Oct 2015 17:19:26 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956001#comment-14956001
 ]


Guozhang Wang commented on KAFKA-2017:
--------------------------------------

[~hachikuji] [~onurkaraman] [~junrao] With the new protocol, the coordinator 
does not need to remember any member metadata except the member ids, since we 
now validate only on member-id and generation-id. So after KAFKA-2464 is 
merged, I propose storing the group metadata as:

{code}
/coordinator/consumers/[groupId]:
version: short
generationId: int
members: String   // <- member-ids joined by ","; "," is not allowed in a
                  //    member-id, and the first member is always the leader.
{code}
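
As a rough illustration of the layout above, here is a minimal Python sketch 
of encoding and decoding the value stored at /coordinator/consumers/[groupId]. 
The byte layout (big-endian short, then int, then the UTF-8 member string) and 
the names PersistedGroup, encode_group and decode_group are assumptions made 
for this example only; the proposal does not fix a serialization.

```python
import struct
from dataclasses import dataclass
from typing import List

# Assumed wire layout: version (short) + generationId (int), big-endian,
# followed by the comma-joined member ids as UTF-8. Not specified in the proposal.
_HEADER = struct.Struct(">hi")


@dataclass
class PersistedGroup:
    version: int
    generation_id: int
    member_ids: List[str]  # the first entry is always the leader

    @property
    def leader_id(self) -> str:
        return self.member_ids[0]


def encode_group(group: PersistedGroup) -> bytes:
    """Serialize the group; reject member ids that would break the "," split."""
    if not group.member_ids:
        raise ValueError("a persisted group needs at least one member (the leader)")
    for member_id in group.member_ids:
        if not member_id or "," in member_id:
            raise ValueError(f"invalid member id: {member_id!r}")
    members = ",".join(group.member_ids)
    return _HEADER.pack(group.version, group.generation_id) + members.encode("utf-8")


def decode_group(data: bytes) -> PersistedGroup:
    """Inverse of encode_group."""
    version, generation_id = _HEADER.unpack_from(data, 0)
    members = data[_HEADER.size:].decode("utf-8")
    return PersistedGroup(version, generation_id, members.split(","))
```

Rejecting "," at write time is what keeps the split on read unambiguous, and 
keeping the leader first avoids a separate leader field.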

The reading logic is: 

1. Upon handling a HB / OffsetCommit / OffsetFetch request, after validating 
that the group belongs to this coordinator and that coordinator.isActive holds, 
if the group does not exist in the group metadata cache, try reading it from 
ZK; leave the other, non-persistent fields in the GroupMetadata and 
MemberMetadata as null.

2. Upon handling JoinGroup, after validating that the group belongs to this 
coordinator and that coordinator.isActive holds, if the group does not exist 
in the group metadata cache, try reading it from ZK; if the consumer already 
exists, follow the normal path of handleJoinGroup. The only difference is that 
we will update the member metadata and always trigger a rebalance.

3. Upon handling SyncGroup, after validating that the group belongs to this 
coordinator and that coordinator.isActive holds, if the group does not exist 
in the group metadata cache, try reading it from ZK; then follow the normal 
path of handleSyncGroup.
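
The lazy-load pattern shared by the three steps above could look roughly like 
the following Python sketch. Everything here is hypothetical: a plain dict 
stands in for ZK, the Coordinator and GroupMetadata classes are simplified 
stand-ins rather than the real Kafka classes, the returned strings are 
illustrative rather than actual Kafka error codes, and the group ownership 
check is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class GroupMetadata:
    generation_id: int
    member_ids: List[str]  # first entry is the leader
    # Non-persistent per-member metadata; stays None after a load from ZK
    # until the member rejoins and supplies it again.
    member_metadata: Dict[str, Optional[bytes]] = field(default_factory=dict)
    rebalance_pending: bool = False


class Coordinator:
    def __init__(self, zk: Dict[str, Tuple[int, str]]):
        self.zk = zk  # stand-in for ZK: path -> (generationId, "id1,id2,...")
        self.cache: Dict[str, GroupMetadata] = {}
        self.is_active = True

    def get_or_load(self, group_id: str) -> Optional[GroupMetadata]:
        """Return the cached group, falling back to ZK on a cache miss."""
        group = self.cache.get(group_id)
        if group is None:
            stored = self.zk.get("/coordinator/consumers/" + group_id)
            if stored is None:
                return None
            generation_id, members = stored
            member_ids = members.split(",")
            group = GroupMetadata(generation_id, member_ids,
                                  {m: None for m in member_ids})
            self.cache[group_id] = group
        return group

    def handle_heartbeat(self, group_id: str, member_id: str,
                         generation_id: int) -> str:
        # Step 1: validate only on member-id and generation-id.
        if not self.is_active:
            return "NOT_COORDINATOR"
        group = self.get_or_load(group_id)
        if group is None or member_id not in group.member_ids:
            return "UNKNOWN_MEMBER"
        if generation_id != group.generation_id:
            return "ILLEGAL_GENERATION"
        return "OK"

    def handle_join_group(self, group_id: str, member_id: str,
                          metadata: bytes) -> str:
        # Step 2: a known member refreshes its metadata and always
        # triggers a rebalance.
        if not self.is_active:
            return "NOT_COORDINATOR"
        group = self.get_or_load(group_id)
        if group is None or member_id not in group.member_ids:
            return "NEW_GROUP_OR_MEMBER"  # normal join path, omitted here
        group.member_metadata[member_id] = metadata
        group.rebalance_pending = True
        return "REBALANCE"
```

In this sketch a new coordinator starts with an empty cache, and the first 
request for each group pays one ZK read; later requests are served from the 
cache.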

The write logic is as follows:

After the Join-group barrier, update ZK with the generation id / leader-id / 
members.
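
A minimal Python sketch of this write step, again using a dict as a stand-in 
for ZK. The function name complete_join_barrier is hypothetical, and bumping 
the generation by one at the barrier is an assumption of this sketch, not 
something stated above.

```python
from typing import Dict, List, Tuple

# Stand-in for ZK: path -> (generationId, comma-joined member ids).
ZkStore = Dict[str, Tuple[int, str]]


def complete_join_barrier(zk: ZkStore, group_id: str, previous_generation: int,
                          leader_id: str, joined_member_ids: List[str]) -> int:
    """Persist the new generation once all members have joined."""
    if leader_id not in joined_member_ids:
        raise ValueError("the leader must be one of the joined members")
    # Assumption: each completed barrier starts the next generation.
    generation_id = previous_generation + 1
    # The leader goes first, so no separate leader field is needed.
    ordered = [leader_id] + [m for m in joined_member_ids if m != leader_id]
    zk["/coordinator/consumers/" + group_id] = (generation_id, ",".join(ordered))
    return generation_id
```

For example, complete_join_barrier(store, "g1", 4, "c2", ["c1", "c2", "c3"]) 
would store (5, "c2,c1,c3") under /coordinator/consumers/g1.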

With this proposal, we do not need an "Initialize" state as in the original 
proposal 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal.

> Persist Coordinator State for Coordinator Failover
> --------------------------------------------------
>
>                 Key: KAFKA-2017
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2017
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: consumer
>    Affects Versions: 0.9.0.0
>            Reporter: Onur Karaman
>            Assignee: Guozhang Wang
>             Fix For: 0.9.0.0
>
>         Attachments: KAFKA-2017.patch, KAFKA-2017_2015-05-20_09:13:39.patch, 
> KAFKA-2017_2015-05-21_19:02:47.patch
>
>
> When a coordinator fails, the group membership protocol tries to fail over to 
> a new coordinator without forcing all the consumers to rejoin their groups. 
> This is possible if the coordinator persists its state so that the state can 
> be transferred during coordinator failover. This state consists of most of 
> the information in GroupRegistry and ConsumerRegistry.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
