Yeah this is a good one to discuss.

Currently send() will block under two conditions:
1. You are beginning a new data batch, have run out of buffer memory, and
block.on.buffer.full=true.
2. Regardless of block.on.buffer.full, the first request for each topic will
block on fetching the metadata that contains partition info for that topic.

The blocking for (2) is bounded by metadata.fetch.timeout.ms, so it won't
actually block forever (but it may block for a while).
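
As a rough illustration of the two settings above, here is a minimal sketch
that only builds the configuration; it does not create a producer. The class
and method names (ProducerSettings, boundedBlocking) and the timeout value are
made up for the example; the property names are the ones mentioned in this
thread.

```java
import java.util.Properties;

// Hypothetical helper, not part of the Kafka client.
final class ProducerSettings {
    // Builds settings under which condition (1) does not apply and the
    // metadata wait in condition (2) is capped at metadataTimeoutMs.
    static Properties boundedBlocking(String brokerList, long metadataTimeoutMs) {
        Properties props = new Properties();
        props.setProperty("metadata.broker.list", brokerList);
        // Do not wait for buffer memory when starting a new batch.
        props.setProperty("block.on.buffer.full", "false");
        // Upper bound on how long the first send() per topic waits for metadata.
        props.setProperty("metadata.fetch.timeout.ms", Long.toString(metadataTimeoutMs));
        return props;
    }
}
```

The resulting Properties would then be handed to new KafkaProducer; picking the
timeout is a trade-off between tolerating slow topic creation and how long a
caller is willing to wait.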

Let me describe the rationale for (2). Basically we want to avoid having the
client fetch and maintain the full set of partition info because it may be
biggish for very large clusters. We also want to avoid pre-configuring the
set of topics a client will use. This means we have to fetch dynamically.

We could introduce a separate queue to hold these requests while we fetch
metadata, but that would mess up our memory bounds and would be fairly
complex.

A user who wants truly non-blocking behavior at send time can avoid this by
calling partitionsFor(topic) at producer initialization time to fetch the
metadata for the topics it wants. This call will block, but it will ensure
that each subsequent send to those topics won't block on metadata.
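
The warm-up idea can be sketched roughly as follows. To keep the example
runnable without the Kafka jar, it uses a hypothetical PartitionLookup
interface as a stand-in for the producer's partitionsFor(topic) method; the
MetadataWarmup class and the Integer return type are likewise assumptions for
illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the one producer method this sketch needs.
interface PartitionLookup {
    List<Integer> partitionsFor(String topic);
}

// Hypothetical helper, not part of the Kafka client.
final class MetadataWarmup {
    // Calls partitionsFor once per topic so that any blocking happens here,
    // at initialization, rather than inside the first send() for that topic.
    // Returns the topics whose lookup failed so the caller can decide what to do.
    static List<String> warm(PartitionLookup producer, List<String> topics) {
        List<String> failed = new ArrayList<>();
        for (String topic : topics) {
            try {
                producer.partitionsFor(topic);
            } catch (RuntimeException e) {
                // For example, the metadata fetch timed out for this topic.
                failed.add(topic);
            }
        }
        return failed;
    }
}
```

With a real producer this might be called as
MetadataWarmup.warm(producer::partitionsFor, topics), adapting the return type
as needed. It only helps for topics known up front.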

For the case where all entries in metadata.broker.list are wrong, one option
we had discussed was sanity checking these when you call new KafkaProducer
and forcing the establishment of one connection to something in that list
so we could fail fast. The only downside of that is that in a test
environment you might bring up the client and server simultaneously, and
this would enforce an ordering between the two.
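
To make the fail-fast option concrete, here is a minimal sketch of such a
check done outside the producer with plain sockets. It is not what the client
does today; BrokerListCheck and anyReachable are hypothetical names, and the
"host:port,host:port" list format is an assumption.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the Kafka client.
final class BrokerListCheck {
    // Returns true as soon as one "host:port" entry accepts a TCP connection,
    // false if every entry is malformed or unreachable within the timeout.
    static boolean anyReachable(String brokerList, int connectTimeoutMs) {
        for (String entry : brokerList.split(",")) {
            String hostPort = entry.trim();
            int colon = hostPort.lastIndexOf(':');
            if (colon <= 0) {
                continue; // no host or no port: skip this entry
            }
            try (Socket socket = new Socket()) {
                String host = hostPort.substring(0, colon);
                int port = Integer.parseInt(hostPort.substring(colon + 1));
                socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
                return true;
            } catch (IOException | IllegalArgumentException e) {
                // Unreachable, unresolvable, or bad port: try the next entry.
            }
        }
        return false;
    }
}
```

Returning a boolean rather than throwing leaves the policy to the caller, so a
test harness that starts the client and server together could retry or skip
the check instead of being forced into a startup order.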

-Jay



On Thu, Feb 13, 2014 at 11:00 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> I think we have discussed this issue before but I just want to bring it up
> again to get confirmed.
>
> Currently when the send() call is triggered, first of all the cluster
> information will be fetched for the specified topic, and if such
> information is not available yet the send call will block on refreshing the
> cluster metadata. The hope is that when auto.create.topics is turned on,
> eventually the topic will be created and metadata be propagated back to the
> producer, so it may just block for a little while.
>
> But for some cases the send() call may block indefinitely, for example:
>
> 1. Topic is not created and auto.create.topics is set to false.
> 2. Broker list is wrong so that no metadata can ever be fetched.
>
> Is this what we expect in such scenarios?
>
> --
> -- Guozhang
>
