[ https://issues.apache.org/jira/browse/KAFKA-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075473#comment-14075473 ]
nicu marasoiu commented on KAFKA-1510:
--------------------------------------

Forcing all offsets to ZooKeeper as well does indeed have the drawback that it will typically write the same offsets again, and not just once but potentially several times (if the Kafka commit is retried).

However, the alternative is to commit to both Kafka and ZooKeeper unconditionally in the normal flow (right now, the commit to ZooKeeper happens only after a successful commit to Kafka, if any).

Also, the code is written in a blocking manner, serializing the operations against the brokers and against Kafka/ZooKeeper, even though these could all be done in parallel on different TCP connections and different threads (or in the same thread with NIO). I think a non-blocking client architecture that can do things in parallel is underway with the new Java clients, isn't it?

> Force offset commits when migrating consumer offsets from zookeeper to kafka
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-1510
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1510
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.2
>            Reporter: Joel Koshy
>            Assignee: nicu marasoiu
>              Labels: newbie
>             Fix For: 0.8.2
>
>         Attachments: forceCommitOnShutdownWhenDualCommit.patch
>
>
> When migrating consumer offsets from ZooKeeper to kafka, we have to turn on
> dual-commit (i.e., the consumers will commit offsets to both zookeeper and
> kafka) in addition to setting offsets.storage to kafka. However, when we
> commit offsets we only commit offsets if they have changed (since the last
> commit). For low-volume topics or for topics that receive data in bursts
> offsets may not move for a long period of time. Therefore we may want to
> force the commit (even if offsets have not changed) when migrating (i.e.,
> when dual-commit is enabled) - we can add a minimum interval threshold (say
> force commit after every 10 auto-commits) as well as on rebalance and
> shutdown.
> Also, I think it is safe to switch the default for offsets.storage from
> zookeeper to kafka and set the default to dual-commit (for people who have
> not migrated yet). We have deployed this to the largest consumers at linkedin
> and have not seen any issues so far (except for the migration caveat that
> this jira will resolve).
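
To make the proposal in the description concrete, here is a minimal sketch of the commit decision it describes: commit whenever offsets have changed, and, only while dual-commit is enabled, also force a commit on rebalance, on shutdown, and after a threshold of auto-commit ticks with no change. This is not the attached patch or the actual consumer code. The class `CommitPolicy`, its fields, and the `reason` strings are hypothetical names chosen for illustration, and the sketch only models the decision, not the writes to Kafka or ZooKeeper.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CommitPolicy:
    """Hypothetical helper: decides whether an offset commit should be sent."""

    dual_commit: bool        # True while migrating offsets from ZooKeeper to Kafka
    force_every: int = 10    # force a commit after this many unchanged auto-commit ticks
    _last: Dict[str, int] = field(default_factory=dict)  # offsets as of the last commit
    _idle_ticks: int = 0     # ticks seen since the last commit with no offset change

    def tick(self, offsets: Dict[str, int], reason: str = "auto") -> bool:
        """Return True if a commit should be issued for this tick.

        `reason` is "auto" for a periodic auto-commit, or "rebalance" / "shutdown".
        """
        changed = offsets != self._last
        forced = False
        if not changed and self.dual_commit:
            # Only force unchanged offsets while dual-commit (migration) is on.
            self._idle_ticks += 1
            forced = (
                reason in ("rebalance", "shutdown")
                or self._idle_ticks >= self.force_every
            )
        if changed or forced:
            self._last = dict(offsets)
            self._idle_ticks = 0
            return True
        return False


if __name__ == "__main__":
    policy = CommitPolicy(dual_commit=True, force_every=10)
    offsets = {"low-volume-topic-0": 42}
    for i in range(12):
        print(i, policy.tick(offsets))
    print("shutdown", policy.tick(offsets, reason="shutdown"))
```

With dual-commit disabled the policy falls back to the existing behaviour described above (commit only when offsets have moved), so the extra writes are limited to the migration window.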