Hi Ismael,

The request does succeed; the reason it keeps requesting is a check in the
Sender.run(long now) method.

    public void run(long now) {
        Cluster cluster = metadata.fetch();
        // get the list of partitions with data ready to send
        RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);

        // if there are any partitions whose leaders are not known yet, force metadata update
        if (result.unknownLeadersExist)
            this.metadata.requestUpdate();

It looks like the this.accumulator.ready(cluster, now) method looks up the
leader for each partition it is tracking in the current cluster view. In this
case the original metadata request had an empty topic list, so it returned
information for all partitions, but after using the producer the cluster view
only has the one topic in it, which means this check sets
unknownLeadersExist = true.
            Node leader = cluster.leaderFor(part);
            if (leader == null) {
                unknownLeadersExist = true;

As you can see above, the Sender.run method checks for this in the result and
then calls this.metadata.requestUpdate(), which triggers the metadata to be
requested again. Of course the same thing happens when the next response is
checked, and we're in the loop forever.
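
To make the cycle concrete, here is a small standalone sketch of the
interaction (plain Java, not the real client classes; the names and the
stand-in partition are illustrative):

    import java.util.Map;
    import java.util.Set;

    // Toy model of the loop: each refreshed cluster view is missing a
    // leader for a partition the accumulator is tracking, so every check
    // schedules yet another metadata update.
    public class MetadataLoopSketch {
        public static void main(String[] args) {
            // Partitions with accumulated data; "other-topic-0" stands in
            // for a partition absent from the narrowed metadata response.
            Set<String> accumulated =
                    Set.of("phil-pa-1-device-update-0", "other-topic-0");
            // Leaders in the refreshed view, which now covers only the
            // produced topic (body={topics=[phil-pa-1-device-update]}).
            Map<String, Integer> leaders =
                    Map.of("phil-pa-1-device-update-0", 0);

            boolean needUpdate = true;
            for (int attempt = 0; attempt < 5 && needUpdate; attempt++) {
                needUpdate = false;
                for (String partition : accumulated) {
                    if (leaders.get(partition) == null) {  // cluster.leaderFor(part)
                        needUpdate = true;                 // metadata.requestUpdate()
                        System.out.println("attempt " + attempt
                                + ": unknown leader for " + partition);
                    }
                }
                // The real client waits retry.backoff.ms (100ms) here, gets
                // the same narrowed response back, and repeats forever; this
                // sketch just caps the loop at 5 iterations.
            }
        }
    }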

Hope this helps,

Phil


-----Original Message-----
From: isma...@gmail.com [mailto:isma...@gmail.com] On Behalf Of Ismael Juma
Sent: 12 April 2016 16:24
To: users@kafka.apache.org
Subject: Re: KafkaProducer 0.9.0.1 continually sends metadata requests

Hi Jonathan,

That makes sense, thanks.

Phil, do you know if your metadata requests are succeeding? Retrying every 
retry.backoff.ms is expected if they fail, but not if they succeed. In terms of 
posting a comment to JIRA, you can create an account if you don't have one:

https://issues.apache.org/jira/secure/Signup!default.jspa

Ismael

On Tue, Apr 12, 2016 at 3:33 PM, Jonathan Bond <jb...@netflix.com.invalid>
wrote:

> Ismael, In our case KAFKA-3358 caused frequent metadata requests and we
> ended up swamping the broker. Getting all metadata is a very expensive
> request; once all request threads spent the majority of their time
> handling metadata requests, they'd start timing out. Timing out would
> cause a new request to be made, which would increase the required
> metadata version - so when a response finally did arrive it was never
> recent enough. To resolve it we had to shut down the 7000 clients and
> let the brokers catch up so that other messages could start flowing
> again. I can't remember if clients were sending at retry.backoff.ms or
> at the timeout rate (it wasn't a fun day, and I don't want to recreate
> it to find out). We now make sure we don't start up producers until a
> message is ready to be sent.
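>
> A minimal sketch of that lazy-creation pattern (illustrative only, not
> our actual code):
>
>     import java.util.Properties;
>     import org.apache.kafka.clients.producer.KafkaProducer;
>     import org.apache.kafka.clients.producer.ProducerRecord;
>     import org.apache.kafka.common.serialization.StringSerializer;
>
>     public class LazyProducer {
>         private final Properties config; // bootstrap.servers etc.
>         private volatile KafkaProducer<String, String> producer;
>
>         public LazyProducer(Properties config) { this.config = config; }
>
>         // Create the producer on first use so it never sits idle past
>         // metadata.max.age.ms before its first send.
>         private KafkaProducer<String, String> producer() {
>             KafkaProducer<String, String> p = producer;
>             if (p == null) {
>                 synchronized (this) {
>                     if (producer == null)
>                         producer = new KafkaProducer<>(config,
>                                 new StringSerializer(), new StringSerializer());
>                     p = producer;
>                 }
>             }
>             return p;
>         }
>
>         public void send(String topic, String message) {
>             producer().send(new ProducerRecord<>(topic, message));
>         }
>     }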
>
> On Tue, Apr 12, 2016 at 6:59 AM Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Sorry, I should say that KAFKA-3306 will fix the issue where we
> > request data for all topics instead of no topics. However, it seems
> > like there is an additional issue where we are performing metadata
> > requests too frequently. Phil, can you please add this information to
> > KAFKA-3358 so that we can figure out if more needs to be done?
> >
> > Ismael
> >
> > On Tue, Apr 12, 2016 at 2:52 PM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Note that this should be fixed as part of
> > > https://issues.apache.org/jira/browse/KAFKA-3306
> > >
> > > Ismael
> > >
> > > On Tue, Apr 12, 2016 at 2:17 PM, Phil Luckhurst < 
> > > phil.luckhu...@encycle.com> wrote:
> > >
> > >> Thanks Jonathan, I didn't spot that JIRA.
> > >>
> > >> Phil
> > >>
> > >> -----Original Message-----
> > >> From: Jonathan Bond [mailto:jb...@netflix.com.INVALID]
> > >> Sent: 12 April 2016 14:08
> > >> To: users@kafka.apache.org
> > >> Subject: Re: KafkaProducer 0.9.0.1 continually sends metadata 
> > >> requests
> > >>
> > >> Phil,
> > >> In our case this bug placed significant load on our brokers. We
> > >> raised a bug https://issues.apache.org/jira/browse/KAFKA-3358 to get
> > >> this resolved.
> > >>
> > >> On Tue, Apr 12, 2016 at 5:39 AM Phil Luckhurst < 
> > >> phil.luckhu...@encycle.com>
> > >> wrote:
> > >>
> > >> > With debug logging turned on we've sometimes seen our logs filling
> > >> > up with the kafka producer sending metadata requests every 100ms e.g.
> > >> >
> > >> > 2016-04-08 10:39:33,592 DEBUG [kafka-producer-network-thread |
> > >> > phil-pa-1] org.apache.kafka.clients.NetworkClient: Sending metadata
> > >> > request ClientRequest(expectResponse=true, callback=null,
> > >> > request=RequestSend(header={api_key=3,api_version=0,correlation_id=249,client_id=phil-pa-1},
> > >> > body={topics=[phil-pa-1-device-update]}), isInitiatedByNetworkClient,
> > >> > createdTimeMs=1460108373592, sendTimeMs=0) to node 0
> > >> > 2016-04-08 10:39:33,592 DEBUG [kafka-producer-network-thread |
> > >> > phil-pa-1] org.apache.kafka.clients.Metadata: Updated cluster metadata
> > >> > version 248 to Cluster(nodes = [Node(0, ta-eng-kafka2, 9092)],
> > >> > partitions = [Partition(topic = phil-pa-1-device-update, partition = 0,
> > >> > leader = 0, replicas = [0,], isr = [0,]])
> > >> > 2016-04-08 10:39:33,698 DEBUG [kafka-producer-network-thread |
> > >> > phil-pa-1] org.apache.kafka.clients.NetworkClient: Sending metadata
> > >> > request ClientRequest(expectResponse=true, callback=null,
> > >> > request=RequestSend(header={api_key=3,api_version=0,correlation_id=250,client_id=phil-pa-1},
> > >> > body={topics=[phil-pa-1-device-update]}), isInitiatedByNetworkClient,
> > >> > createdTimeMs=1460108373698, sendTimeMs=0) to node 0
> > >> > 2016-04-08 10:39:33,698 DEBUG [kafka-producer-network-thread |
> > >> > phil-pa-1] org.apache.kafka.clients.Metadata: Updated cluster metadata
> > >> > version 249 to Cluster(nodes = [Node(0, ta-eng-kafka2, 9092)],
> > >> > partitions = [Partition(topic = phil-pa-1-device-update, partition = 0,
> > >> > leader = 0, replicas = [0,], isr = [0,]])
> > >> >
> > >> > These metadata requests continue to be sent every 100ms
> > >> > (retry.backoff.ms) until we stop the process.
> > >> >
> > >> > This only seems to happen if the KafkaProducer instance is created
> > >> > but not used to publish a message for 5 minutes. After 5 minutes
> > >> > (metadata.max.age.ms) the producer thread sends a metadata request
> > >> > with an empty topics list to the server, and the server responds
> > >> > with the partition information for *all* topics hosted on the server.
> > >> >
> > >> > 2016-04-11 14:16:39,320 DEBUG [kafka-producer-network-thread |
> > >> > phil-pa-1] org.apache.kafka.clients.NetworkClient: Sending metadata
> > >> > request ClientRequest(expectResponse=true, callback=null,
> > >> > request=RequestSend(header={api_key=3,api_version=0,correlation_id=0,client_id=phil-pa-1},
> > >> > body={topics=[]}), isInitiatedByNetworkClient,
> > >> > createdTimeMs=1460380599289, sendTimeMs=0) to node -1
> > >> >
> > >> > If we then use that KafkaProducer instance to send a message, the
> > >> > next 'Sending metadata request' will just be for the topic we have
> > >> > sent the message to, and this then triggers the flood of retry
> > >> > requests as noted above.
> > >> >
> > >> > If we ensure we send the first message within the time set by 
> > >> > metadata.max.age.ms (default 5 minutes) then everything works 
> > >> > as expected and the metadata requests do not continually get retried.
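> > >> >
> > >> > For reference, a minimal producer setup showing the two timings
> > >> > involved (a sketch; the values shown are just the defaults, and it
> > >> > assumes the usual producer and StringSerializer imports):
> > >> >
> > >> >     Properties props = new Properties();
> > >> >     props.put("bootstrap.servers", "ta-eng-kafka2:9092");
> > >> >     // interval between the retried metadata requests seen above
> > >> >     props.put("retry.backoff.ms", "100");
> > >> >     // idle window (5 minutes) after which the empty-topic-list
> > >> >     // metadata request is sent
> > >> >     props.put("metadata.max.age.ms", "300000");
> > >> >     KafkaProducer<String, String> producer = new KafkaProducer<>(
> > >> >             props, new StringSerializer(), new StringSerializer());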
> > >> >
> > >> > In many cases I can understand that creating a KafkaProducer and
> > >> > then not using it within 5 minutes is unusual, but in our case we're
> > >> > creating it when our REST based application starts up and we can't
> > >> > guarantee that a message will be published within that time. To get
> > >> > around this we currently post a test message to the topic right
> > >> > after creating the KafkaProducer, which prevents it happening.
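> > >> >
> > >> > The warm-up amounts to something like this right after the producer
> > >> > is constructed (a sketch; the topic name is just the one from our
> > >> > logs above):
> > >> >
> > >> >     // Send a throwaway message so the first topic-specific metadata
> > >> >     // request happens well inside metadata.max.age.ms.
> > >> >     producer.send(new ProducerRecord<>("phil-pa-1-device-update",
> > >> >             "warm-up"));
> > >> >     producer.flush();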
> > >> >
> > >> > I'm not sure if this is worthy of a fix but I thought I'd post it
> > >> > here in case someone else hits the same problem.
> > >> >
> > >> > Regards,
> > >> > Phil Luckhurst
> > >> >
> > >>
> > >
> > >
> >
>
