Okay, I did some assessment of the use cases we have that aren't using the
default offset storage API and came up with one generalization. I would
like to propose adding a generic metadata field to the offset API on a
per-partition basis. That would leave us with the following:

OffsetCommitRequest => ConsumerGroup [TopicName [Partition Offset Metadata]]

OffsetFetchResponse => [TopicName [Partition Offset Metadata ErrorCode]]

  Metadata => string

If you want to store a reference to any associated state (say an HDFS file
name) so that, if the consumption fails over, the new consumer can start up
with the same state, this would be the place to store it. It is not
intended to support large values (we could enforce a 1k limit or something);
it is meant for something small, or a reference to where the state can be
found (say a file name).
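
To make the intent concrete, here is a rough sketch in Python of how a
consumer might use the field. It is not the real implementation: the
in-memory store, the class and method names, the group/topic names, the
HDFS path, and the exact 1024-byte limit are all assumptions made up for
illustration.

```python
from dataclasses import dataclass

# Assumed limit; the proposal only says "a 1k limit or something".
MAX_METADATA_BYTES = 1024


@dataclass(frozen=True)
class OffsetAndMetadata:
    """One per-partition entry: Partition is the key, this is Offset + Metadata."""
    offset: int
    metadata: str = ""


class InMemoryOffsetStore:
    """Hypothetical stand-in for whatever ends up storing committed offsets."""

    def __init__(self):
        self._entries = {}

    def commit(self, group, topic, partition, offset, metadata=""):
        # Reject anything large: the field is for a small value or a reference.
        if len(metadata.encode("utf-8")) > MAX_METADATA_BYTES:
            raise ValueError("metadata exceeds %d bytes" % MAX_METADATA_BYTES)
        self._entries[(group, topic, partition)] = OffsetAndMetadata(offset, metadata)

    def fetch(self, group, topic, partition):
        # Returns None when nothing has been committed for this partition.
        return self._entries.get((group, topic, partition))


store = InMemoryOffsetStore()

# The old consumer commits its position plus a pointer to its state.
store.commit("etl", "clicks", 0, 4200, metadata="hdfs:///etl/clicks/part-0.tmp")

# After failover, the new consumer fetches both and resumes from there.
resumed = store.fetch("etl", "clicks", 0)
```

The new consumer gets back the offset and the file name together, so it can
reopen the same state before it continues consuming.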

Objections?

-Jay


On Mon, Dec 17, 2012 at 10:45 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey Guys,
>
> David has made a bunch of progress on the offset commit api implementation.
>
> Since this is a public API it would be good to do as much thinking
> up-front as possible to minimize future iterations.
>
> It would be great if folks could do the following:
> 1. Read the wiki here:
> https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management
> 2. Check out the code David wrote here:
> https://issues.apache.org/jira/browse/KAFKA-657
>
> In particular our hope is that this API can act as the first step in
> scaling the way we store offsets (ZK is not really very appropriate for
> this). This of course requires having some plan in mind for offset storage.
> I have written (and then after getting some initial feedback, rewritten) a
> section in the above wiki on how this might work.
>
> If no one says anything I will be taking a slightly modified patch that
> adds this functionality on trunk as soon as David gets in a few minor
> tweaks.
>
> -Jay
>
