I had to write a simple offset migration tool and I wanted to get feedback
on whether or not this would be a useful addition to Apache Kafka.

Currently the path to upgrade from Zookeeper offsets to Kafka offsets
(and often from the Scala client to the Java client) is via dual commit.
The process is documented here:
http://kafka.apache.org/documentation.html#offsetmigration

That process wasn't sufficient in my case because:

   - It needs to be done ahead of the upgrade
   - It requires the old client to commit at least once in dual commit mode
   - Some frameworks don't expose the dual commit functionality well
   - Dual commit is not supported in 0.8.1.x

The tool I wrote takes the relevant connection information and a consumer
group and simply copies the Zookeeper offsets into the Kafka offsets for
that group.
A rough WIP PR can be seen here: https://github.com/apache/kafka/pull/1715
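
To make the idea concrete, here is a minimal sketch of the Zookeeper-reading
half of such a copy. It is not the code from the PR. It assumes the old
consumer's node layout, /consumers/<group>/offsets/<topic>/<partition>, and it
works on an in-memory snapshot of nodes (path -> data) rather than a live
Zookeeper connection. The class and method names are hypothetical. The
resulting map would then be committed for the same group through the new
consumer, for example with KafkaConsumer.commitSync; that step is left out so
the sketch needs no dependencies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names are illustrative and not taken from the PR.
class ZkOffsetCopySketch {

    // Path of the node where the old consumer stores a group's offset
    // (assumed layout: /consumers/<group>/offsets/<topic>/<partition>).
    static String zkOffsetPath(String group, String topic, int partition) {
        return "/consumers/" + group + "/offsets/" + topic + "/" + partition;
    }

    // Collects the offsets of one group from a snapshot of Zookeeper nodes
    // (path -> node data). Keys of the result are "<topic>-<partition>".
    static Map<String, Long> readGroupOffsets(Map<String, String> zkNodes, String group) {
        String prefix = "/consumers/" + group + "/offsets/";
        Map<String, Long> offsets = new TreeMap<>();
        for (Map.Entry<String, String> node : zkNodes.entrySet()) {
            String path = node.getKey();
            if (!path.startsWith(prefix)) {
                continue; // belongs to another group or another subtree
            }
            String rest = path.substring(prefix.length());
            int slash = rest.lastIndexOf('/');
            if (slash <= 0 || slash == rest.length() - 1) {
                continue; // a topic-level node, not a partition node
            }
            String topic = rest.substring(0, slash);
            int partition = Integer.parseInt(rest.substring(slash + 1));
            long offset = Long.parseLong(node.getValue().trim());
            offsets.put(topic + "-" + partition, offset);
        }
        return offsets;
    }

    public static void main(String[] args) {
        Map<String, String> snapshot = new HashMap<>();
        snapshot.put(zkOffsetPath("my-group", "events", 0), "42");
        snapshot.put(zkOffsetPath("my-group", "events", 1), "17");
        snapshot.put(zkOffsetPath("other-group", "events", 0), "5");
        // Only the offsets of "my-group" are selected for the copy.
        System.out.println(readGroupOffsets(snapshot, "my-group"));
    }
}
```

Keeping the read step separate from the commit step means the offsets can be
inspected before anything is written to Kafka.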

Even though many users have already made the transition, I think this could
still be useful in Kafka. Here are a few reasons:

   - It simplifies the migration for users who have yet to migrate,
   especially as the old clients get deprecated and removed
   - Though the tool would not ship with the Kafka 0.8.x or 0.9.x series,
   downloading and using the jar from Maven would be fairly straightforward
      - Alternatively this could be a separate repo or jar, though I would
      rather not push this single tool to Maven as a standalone artifact.

Do you think this is useful in Apache Kafka? Any thoughts on the approach?

Thanks,
Grant
-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
