Even if you have metadata cached, if the broker isn't available then messages can get stuck in the producer indefinitely. Currently the new producer doesn't have any client-side timeouts, which is a bug. See https://issues.apache.org/jira/browse/KAFKA-1788 for more details.
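Until that is fixed, one possible workaround is to put your own bound on the wait, using the `Future` that `send()` returns. Below is a minimal sketch of the idea, not the producer's own behaviour. The helper name `awaitAck` and the 200 ms limit are made up for illustration, and a never-completing `CompletableFuture` stands in for the future a real `producer.send(record)` would return, so the example runs without a broker.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedSend {
    /**
     * Hypothetical helper: waits at most timeoutMs for the send to be acknowledged.
     * Returns true if the future completed normally, false on timeout, failure,
     * or interruption.
     */
    static boolean awaitAck(Future<?> ack, long timeoutMs) {
        try {
            ack.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // No answer in time: stop waiting instead of blocking forever.
            return false;
        } catch (ExecutionException e) {
            // The send completed, but with an error.
            return false;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for the caller.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a send whose broker is unreachable: it never completes.
        Future<String> stuck = new CompletableFuture<>();
        System.out.println("stuck send acked: " + awaitAck(stuck, 200));

        // Stand-in for a send that was acknowledged.
        Future<String> done = CompletableFuture.completedFuture("ok");
        System.out.println("completed send acked: " + awaitAck(done, 200));
    }
}
```

Note that giving up on the future only unblocks the caller. As far as I know it does not remove the record from the producer's buffer, so treat this as a way to detect the problem rather than a fix for it.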
On Fri, Mar 20, 2015 at 8:09 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> This is correct when you send to a topic for the first time. After that
> the metadata will be cached, the metadata cache has an age and after it
> expires, metadata will be refreshed.
> So the time a producer found a broker is not reachable is the minimum
> value of the following times:
> 1. linger.ms + retries * retry.backoff.ms
> 2. metadata.max.age.ms
> 3. metadata.fetch.timeout.ms (only when sending to a topic for the first
> time)
>
> Typically you will hit the first one. The default value is linger.ms=0,
> retries=0. But you need to send records with callback to detect the
> failure.
>
> Jiangjie (Becket) Qin
>
> On 3/20/15, 3:46 AM, "Samuel Chase" <samebch...@gmail.com> wrote:
>
> >@Sunil
> >
> >The else branch will be executed if
> >`metadata.fetch().partitionsForTopic(topic)` returns non NULL value. I
> >assume that when Kafka is unreachable, it will return NULL.
> >`waitOnMetadata()` then returns; we never enter the else branch when
> >Kafka is unreachable.
> >
> >@Everyone: Is this explanation correct?
> >
> >On Fri, Mar 20, 2015 at 3:56 PM, sunil kalva <sambarc...@gmail.com> wrote:
> >> @Samuel
> >> My point was
> >> The else branch of the code will be executed when metadata is not
> >> available, and metadata is not available when kafka cluster is not
> >> reachable.
> >>
> >> please correct me if i am wrong..
> >>
> >> On Fri, Mar 20, 2015 at 3:43 PM, Samuel Chase <samebch...@gmail.com> wrote:
> >>
> >>> @Sunil
> >>>
> >>> On Fri, Mar 20, 2015 at 3:36 PM, sunil kalva <sambarc...@gmail.com> wrote:
> >>> > I think KafkaProducer.send method blocks until it fetches partition
> >>> > metadata for configured time using "metadata.fetch.timeout.ms", once
> >>> > time out it throws TimeoutException. You might be experiencing
> >>> > TimeoutException ?
> >>>
> >>> My co-worker pointed out that over here:
> >>>
> >>> https://github.com/apache/kafka/blob/0.8.2/clients/src/main/java/org/apache/kafka/clients/producer/KafkaProducer.java#L368
> >>>
> >>> waitOnMetadata just returns. The else branch of the code is not
> >>> executed when Kafka is unreachable.
> >>>
> >>> Trying to investigate what else must be causing the wait.
> >>>
> >>
> >> --
> >> SunilKalva

--
Thanks,
Ewen
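P.S. For anyone who wants to plug numbers into the three bounds from Jiangjie's reply quoted above, here is a rough sketch of that arithmetic. It only mirrors the formula as written in the thread and is not a guarantee of what the producer does (see the caveat about missing client-side timeouts). The class and method names are made up, and every value in `main` other than `linger.ms=0` and `retries=0` is illustrative.

```java
class FailureDetectionBound {
    /**
     * Hypothetical helper: minimum of the bounds listed in the thread, in milliseconds.
     *   1. linger.ms + retries * retry.backoff.ms
     *   2. metadata.max.age.ms
     *   3. metadata.fetch.timeout.ms (only for the first send to a topic)
     */
    static long detectionBoundMs(long lingerMs, int retries, long retryBackoffMs,
                                 long metadataMaxAgeMs, long metadataFetchTimeoutMs,
                                 boolean firstSendToTopic) {
        long sendBound = lingerMs + (long) retries * retryBackoffMs;
        long bound = Math.min(sendBound, metadataMaxAgeMs);
        if (firstSendToTopic) {
            bound = Math.min(bound, metadataFetchTimeoutMs);
        }
        return bound;
    }

    public static void main(String[] args) {
        // linger.ms=0 and retries=0 as mentioned in the thread; other values are illustrative.
        System.out.println(detectionBoundMs(0, 0, 100, 300_000, 60_000, false)); // prints 0
        // With some lingering and retries configured: 5 + 3 * 100 = 305.
        System.out.println(detectionBoundMs(5, 3, 100, 300_000, 60_000, true));  // prints 305
    }
}
```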