I had to write a simple offset migration tool and I wanted to get feedback on whether it would be a useful addition to Apache Kafka.

Currently the path to upgrade from ZooKeeper offsets to Kafka offsets (and often from the Scala client to the Java client) is via dual commit. The process is documented here:
http://kafka.apache.org/documentation.html#offsetmigration

That process wasn't sufficient in my case because:

- It needs to be done ahead of the upgrade
- It requires the old client to commit at least once in dual commit mode
- Some frameworks don't expose the dual commit functionality well
- Dual commit is not supported in 0.8.1.x

The tool I wrote takes the relevant connection information and a consumer group and simply copies the ZooKeeper offsets into the Kafka offsets for that group. A rough WIP PR can be seen here:
https://github.com/apache/kafka/pull/1715

Even though many users have already made the transition, I think this could still be useful in Kafka. Here are a few reasons:

- It simplifies the migration for users who have yet to migrate, especially as the old clients get deprecated and removed
- Though the tool is not available in the Kafka 0.8.x or 0.9.x series, downloading and using the jar from Maven would be fairly straightforward
- Alternatively this could be a separate repo or jar, though I hardly want to push this single tool to Maven as a standalone artifact

Do you think this is useful in Apache Kafka? Any thoughts on the approach?

Thanks,
Grant

--
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke