Thanks, Neha and Joel.

My understanding of offsets is:

1. The offset stored in ZK is only used when the consumer connects again.
2. Joel's suggestion, "in fact setting an autocommit interval and being
willing to deal with duplicates is almost equivalent.", makes sense.
But if a crash happens just after an offset is committed, then any messages
the consumer has received but not yet processed will be skipped after it
reconnects.

Please correct me if I am wrong.
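
To illustrate the concern in point 2, here is a toy simulation, not Kafka
code. The class CommitOrderDemo and its parameters are made up for this
example, and offsets are simplified to message indexes. It compares
committing an offset before the message is processed (what can happen with
autocommit) against committing only after processing, with a crash at one
message.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of a consumer that crashes once and then reconnects. */
final class CommitOrderDemo {

    /**
     * Returns the message indexes that get processed across both runs.
     * The consumer crashes when it reaches crashAt, before processing it.
     */
    static List<Integer> run(int total, int crashAt, boolean commitBeforeProcessing) {
        List<Integer> processed = new ArrayList<>();
        int committed = 0; // position the consumer resumes from after reconnecting

        // First run: ends with a crash at crashAt.
        for (int offset = committed; offset < total; offset++) {
            if (commitBeforeProcessing) {
                committed = offset + 1; // offset is committed while the message is still unprocessed
            }
            if (offset == crashAt) {
                break; // crash: this message was never processed
            }
            processed.add(offset);
            if (!commitBeforeProcessing) {
                committed = offset + 1; // commit only once processing is done
            }
        }

        // Second run: resume from the last committed position.
        for (int offset = committed; offset < total; offset++) {
            processed.add(offset);
        }
        return processed;
    }
}
```

With total = 5 and crashAt = 2, committing first skips message 2 after the
reconnect, while committing after processing picks it up again. The model
does not cover a crash between processing and commit, which is where the
duplicates Joel mentions would come from.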


In ConsumerConnector, if ConsumerIterator could return the partition offset
together with each message, then we could save the offset on the client side
and, with autoCommit turned off, commit it only after all the messages before
that offset are done.
I went through the code roughly; if I use this option I would need to change
some code.
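
As a rough sketch of the client-side bookkeeping this would need (not actual
Kafka code): the OffsetTracker class and its method names below are
hypothetical, and it assumes the iterator could hand back, for each message,
its partition, its offset, and the offset to resume from after it, which it
does not do today. The tracker only reports a position as safe to commit once
every earlier message in that partition is done.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeSet;

/** Hypothetical helper: tracks unfinished messages per partition. */
final class OffsetTracker {

    private static final class PartitionState {
        final TreeSet<Long> inFlight = new TreeSet<>(); // received but not yet done
        long resumeAfterLast;                           // resume position after the newest received message
    }

    private final Map<String, PartitionState> partitions = new HashMap<>();

    /** Record a message handed to the application, in the order the partition delivers it. */
    synchronized void onReceived(String partition, long offset, long nextOffset) {
        PartitionState state = partitions.computeIfAbsent(partition, k -> new PartitionState());
        state.inFlight.add(offset);
        state.resumeAfterLast = nextOffset;
    }

    /** Record that the application has finished processing a message. */
    synchronized void onDone(String partition, long offset) {
        PartitionState state = partitions.get(partition);
        if (state != null) {
            state.inFlight.remove(offset);
        }
    }

    /**
     * Position that is safe to commit: the oldest unfinished message if there
     * is one, otherwise the position after the newest received message.
     * Empty if nothing has been received for the partition.
     */
    synchronized OptionalLong committableOffset(String partition) {
        PartitionState state = partitions.get(partition);
        if (state == null) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(state.inFlight.isEmpty()
                ? state.resumeAfterLast
                : state.inFlight.first());
    }
}
```

A periodic task could then write committableOffset(partition) to ZK in place
of autocommit. After a crash, messages at or after that position are read
again, so the processing still has to tolerate duplicates.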

Another option is to use SimpleConsumer, as we discussed before, but this
option requires more coding work on the client side, since one consumer may
have more than one SimpleConsumer.
We would need to manage these SimpleConsumers with ZK and merge the results
from each of them.

I lean toward option 1.
  
Thanks,
Yonghui 


From:  Neha Narkhede <neha.narkh...@gmail.com>
Date:  Friday, December 21, 2012, 2:13 AM
To:  <users@kafka.apache.org>
Cc:  永辉 赵 <zhaoyong...@gmail.com>
Subject:  Re: Proper use of ConsumerConnector


> An alternative to using simpleconsumer in this use case is to use the
> zookeeper consumer connector and turn off auto commit.

Keep in mind that this works only if you don't care about controlling
per-partition rewind capability. The high-level consumer will not give you
control over which partitions your consumer consumes and which partitions it
commits the offsets for. If you need to rewind consumption for a subset of
those partitions, then ZookeeperConsumerConnector will not work for you.

Thanks,
Neha

