[ 
https://issues.apache.org/jira/browse/KAFKA-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750956#comment-13750956
 ] 

Tejas Patil commented on KAFKA-1012:
------------------------------------

Thanks a lot [~junrao] for all the awesome comments !!!

20. ZookeeperConsumerConnector:
20.1 Good suggestion from a design point of view. Will include it in the next 
patch.
20.2 Will include it in the next patch.
20.3 With 20.1, this would move to OffsetClientManager. Will wait for 
KAFKA-989 to be merged into trunk so that I can use it in my patch.
20.4 A "loading offsets" error fails the rebalance, and the rebalance is then 
retried. It can be argued that the subsequent retries can fail too, as they 
happen without any delay. We could loop indefinitely, with an exponential 
delay, until the offset fetch stops returning the offsets-loading error code. 
However, the offsets topic log for a single partition is expected to be small, 
so the broker should not take long to load it, and we might not need a loop. 
Will keep this open for discussion.
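
To make the option in 20.4 concrete, here is a minimal sketch of a retry loop 
with exponential delay. It is only an illustration in Python, not code from the 
patch: the `fetch` callable, the `OFFSETS_LOADING` status value and all the 
delay parameters are hypothetical names chosen for this example.

```python
import time

# Hypothetical status value; the thread does not name the real error code.
OFFSETS_LOADING = "offsets_loading"


def fetch_with_backoff(fetch, initial_delay=0.1, max_delay=5.0,
                       max_attempts=None, sleep=time.sleep):
    """Call fetch() until it stops reporting OFFSETS_LOADING.

    fetch is assumed to return a (status, offset) tuple. The delay doubles
    after every failed attempt, capped at max_delay. With max_attempts=None
    the loop runs until the broker has finished loading.
    """
    delay = initial_delay
    attempt = 0
    while True:
        attempt += 1
        status, offset = fetch()
        if status != OFFSETS_LOADING:
            return offset
        if max_attempts is not None and attempt >= max_attempts:
            raise TimeoutError(
                "offsets still loading after %d attempts" % attempt)
        sleep(delay)
        delay = min(delay * 2, max_delay)
```

Capping the delay keeps a slow broker from pushing the wait out indefinitely, 
and `max_attempts` gives a way to bound the loop if an unbounded one turns out 
to be undesirable.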

21. KafkaApis:
21.1 This would only reduce the number of connections on the controller node, 
not on the rest of the brokers. With the thick client change, this would not 
be needed.
21.2 Clients written in different languages might choose any ack level. To be 
on the safe side, all cases are handled. Will keep it as it is.
21.3 This would have been possible if there were no "else" clause. Will keep 
it as it is.
21.4 +1 for the suggestion. Will include it in the next patch.

22. OffsetManager:
22.1 The timestamp will be used in the coming patch.
22.2 SUPERB CATCH :) How about "currOffset = m.nextOffset"? Will include it in 
the next patch.
22.3 This was done to unblock the offset fetch requests right after loading is 
done. In the coming patch, with timestamps being stored in the offset table, 
this will change.

23. +1. Will include it in the next patch.
24. The coming patch will include a standalone utility to do offset cleanup. 
Here is the section in the wiki page:
https://cwiki.apache.org/confluence/display/KAFKA/Inbuilt+Consumer+Offset+Management#InbuiltConsumerOffsetManagement-\5\Offsetscleanupfromoffsettable%2CZkandlogs

25. Migrating existing consumers:
The patch changes the structure of the offset records in Zk (it stores offset + 
delimiter + timestamp + delimiter + metadata). This could be reverted for the 
time being to use the old format (just the offset). The new binary can then be 
deployed by bouncing the brokers one by one. Consumers would still write 
directly to Zk; they need to be upgraded too, once the brokers are done. 
Brokers would continue to use Zk to save offsets, but in addition the logs 
would get populated from offset commits. We could also explicitly write the 
offsets from Zk to the logs for those entries that did not have any commits 
after the new binary was deployed. Once we are sure that all the offset info 
is in the logs, we can switch the config to use the inbuilt offset manager. As 
the data is in the logs, loading the offsets would bring up the offset table 
entries. There would be no need for a migration tool.
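
For reference, the two Zk record layouts mentioned in 25 can be sketched as 
below. This is an illustrative Python sketch only: the delimiter character is 
an assumption (the comment does not say which one the patch uses), and the 
function names are made up for this example. The decoder accepting both 
layouts is also an assumption, shown only to make the difference between the 
formats visible.

```python
# Assumed delimiter; the actual character used by the patch is not stated.
DELIMITER = "|"


def encode_offset_record(offset, timestamp, metadata=""):
    """New layout: offset + delimiter + timestamp + delimiter + metadata."""
    return DELIMITER.join([str(offset), str(timestamp), metadata])


def decode_offset_record(value):
    """Return (offset, timestamp, metadata) from either layout.

    The old layout holds just the offset, so timestamp is None and
    metadata is empty for such records.
    """
    # Split at most twice so that metadata may itself contain the delimiter.
    parts = value.split(DELIMITER, 2)
    if len(parts) == 1:
        return int(parts[0]), None, ""
    if len(parts) != 3:
        raise ValueError("malformed offset record: %r" % value)
    offset, timestamp, metadata = parts
    return int(offset), int(timestamp), metadata
```

A record written by an old consumer, such as "42", then decodes with no 
timestamp, while a record in the new layout round-trips through 
`encode_offset_record` and `decode_offset_record`.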
                
> Implement an Offset Manager and hook offset requests to it
> ----------------------------------------------------------
>
>                 Key: KAFKA-1012
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1012
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: consumer
>            Reporter: Tejas Patil
>            Assignee: Tejas Patil
>            Priority: Minor
>         Attachments: KAFKA-1012.patch, KAFKA-1012-v2.patch
>
>
> After KAFKA-657, we have a protocol for consumers to commit and fetch offsets 
> from brokers. Currently, consumers are not using this API and talk to 
> Zookeeper directly. 
> This Jira will involve the following:
> 1. Add a special topic in kafka for storing offsets
> 2. Add an OffsetManager interface which would handle storing, accessing, 
> loading and maintaining consumer offsets
> 3. Implement offset managers for both choices: the existing ZK based 
> storage and the inbuilt storage for offsets.
> 4. Leader brokers would now maintain an additional hash table of offsets for 
> the group-topic-partitions that they lead
> 5. Consumers should now use the OffsetCommit and OffsetFetch API

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
