On 5/9/13 8:27 AM, Chris Curtin wrote:
On Thu, May 9, 2013 at 12:36 AM, Rob Withers <reefed...@gmail.com> wrote:
-----Original Message-----
From: Chris Curtin [mailto:curtin.ch...@gmail.com]
1. When you say the iterator may block, do you mean hasNext() may block?
Yes.
Is this due to a potential non-blocking fetch (the broker/ZooKeeper returns an
empty block if the offset is current)? And yet this blocks the network call of
the consumer iterator; do I have that right? Are there other reasons it could
block, such as the call failing and a backup call being made?
I'll let the Kafka team answer this. I don't know the low-level details.
The iterator will block if there is no more data to consume. The
iterator is actually reading messages from a BlockingQueue, which is fed
by the fetcher threads. The reason for this is to allow you to
configure blocking with or without a timeout in the ConsumerIterator.
This is reflected in the consumer timeout property (consumer.timeout.ms).
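For illustration, a minimal sketch assuming the 0.8-era high-level consumer API
(the ZooKeeper address, group id, and topic name below are placeholders): with
consumer.timeout.ms set, hasNext() throws ConsumerTimeoutException instead of
blocking indefinitely when no data arrives; with it left at the default, hasNext()
blocks until the fetcher threads put something on the queue.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.ConsumerTimeoutException;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class TimeoutIteratorSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "example-group");           // placeholder
        // Without this property hasNext() blocks until data arrives; with it,
        // hasNext() throws ConsumerTimeoutException after 5 seconds of no data.
        props.put("consumer.timeout.ms", "5000");

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("example-topic", 1));
        ConsumerIterator<byte[], byte[]> it =
                streams.get("example-topic").get(0).iterator();

        try {
            while (it.hasNext()) {                        // blocks, bounded by the timeout
                byte[] payload = it.next().message();
                System.out.println("got " + payload.length + " bytes");
            }
        } catch (ConsumerTimeoutException e) {
            System.out.println("no data within consumer.timeout.ms, stopping");
        } finally {
            connector.shutdown();
        }
    }
}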
b. In the case of a client crash, what can the client do to avoid duplicate
messages when it restarts? What I can think of is to read the last message
from the log file and ignore the first few duplicate messages received until
the last-read message comes up again. But is it possible for the client to
read the log file directly?
If you can't tolerate the possibility of duplicates, you need to look at the
Simple Consumer example. There you control the offset storage.
Do you have example code that manages only-once delivery, even when a consumer
for a given partition goes away?
No, but if you look at the Simple Consumer example where the read occurs
(and the write to System.out), at that point you know the offset you just
read, so you need to put it somewhere. With the Simple Consumer, Kafka
leaves all the offset management to you.
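To make "put it somewhere" concrete, a minimal sketch of a file-backed offset
store you could call right after the System.out write, one file per
topic/partition (the class and file names here are made up, not from the
Simple Consumer example). Saving the next offset only after the message has
been fully processed gives at-least-once on restart; storing the offset and
the processing result in the same transactional store is what gets you to
only-once.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical offset store: one file per topic/partition holding the next offset to read.
public class FileOffsetStore {
    private final Path file;
    private final Path tmp;

    public FileOffsetStore(String topic, int partition) {
        this.file = Paths.get("offset-" + topic + "-" + partition);
        this.tmp = Paths.get(file + ".tmp");
    }

    /** Offset to resume from after a restart; 0 if nothing has been read yet. */
    public long load() throws IOException {
        if (!Files.exists(file)) {
            return 0L;
        }
        return Long.parseLong(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
    }

    /** Record the offset of the NEXT message to read, after the current one is fully processed. */
    public void save(long nextOffset) throws IOException {
        // Write-then-rename so a crash mid-write cannot leave a corrupt offset behind.
        Files.write(tmp, Long.toString(nextOffset).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, file, StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
    }
}

On restart the consumer would call load() and start its fetch requests from
that offset, so the only messages it can see twice are ones processed after
the last successful save().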
What happens with rebalancing when a consumer goes away?
Hmm, I can't find the link to the algorithm right now. Jun or Neha, can you?
It's down at the bottom of the 0.7 design page:
http://kafka.apache.org/07/design.html
Is this the behavior of the high-level consumer group?
Yes.
Is there a way to supply one's own simple consumer with only-once semantics,
within a consumer group that rebalances?
No. Simple Consumers don't have rebalancing steps. Basically, you take
control of what is requested from which topics and partitions. So you could
ask for a specific offset in a topic/partition 100 times in a row and Kafka
will happily return it to you. Nothing is written to ZooKeeper either; you
control everything.
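As a sketch of what that looks like, assuming the 0.8 SimpleConsumer API (the
host, port, client id, topic, partition, and offset below are all placeholders
you choose yourself): the fetch request names an explicit topic, partition,
and starting offset, and since nothing is recorded in ZooKeeper, re-issuing
the same request simply returns the same messages again.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class FetchAtOffsetSketch {
    public static void main(String[] args) {
        String topic = "example-topic"; // placeholder
        int partition = 0;
        long offset = 42L;              // you pick the starting offset; nothing is read from ZooKeeper

        SimpleConsumer consumer =
                new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "example-client");
        try {
            // The same request can be issued any number of times; the broker
            // returns the same messages because the consumer owns the position.
            FetchRequest req = new FetchRequestBuilder()
                    .clientId("example-client")
                    .addFetch(topic, partition, offset, 100000)
                    .build();
            FetchResponse response = consumer.fetch(req);

            for (MessageAndOffset messageAndOffset : response.messageSet(topic, partition)) {
                ByteBuffer payload = messageAndOffset.message().payload();
                byte[] bytes = new byte[payload.limit()];
                payload.get(bytes);
                System.out.println(messageAndOffset.offset() + ": "
                        + new String(bytes, StandardCharsets.UTF_8));
            }
        } finally {
            consumer.close();
        }
    }
}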
What happens if a producer goes away?
Shouldn't matter to the consumers. The brokers are what the consumers talk
to, so if nothing is writing, the broker won't have anything to send.
thanks much,
rob