With debug logging turned on we've sometimes seen our logs filling up with the
kafka producer sending metadata requests every 100ms e.g.
2016-04-08 10:39:33,592 DEBUG [kafka-producer-network-thread | phil-pa-1]
org.apache.kafka.clients.NetworkClient: Sending metadata request
ClientRequest(expectResponse=true, callback=null,
request=RequestSend(header={api_key=3,api_version=0,correlation_id=249,client_id=phil-pa-1},
body={topics=[phil-pa-1-device-update]}), isInitiatedByNetworkClient,
createdTimeMs=1460108373592, sendTimeMs=0) to node 0
2016-04-08 10:39:33,592 DEBUG [kafka-producer-network-thread | phil-pa-1]
org.apache.kafka.clients.Metadata: Updated cluster metadata version 248 to
Cluster(nodes = [Node(0, ta-eng-kafka2, 9092)], partitions = [Partition(topic =
phil-pa-1-device-update, partition = 0, leader = 0, replicas = [0,], isr =
[0,]])
2016-04-08 10:39:33,698 DEBUG [kafka-producer-network-thread | phil-pa-1]
org.apache.kafka.clients.NetworkClient: Sending metadata request
ClientRequest(expectResponse=true, callback=null,
request=RequestSend(header={api_key=3,api_version=0,correlation_id=250,client_id=phil-pa-1},
body={topics=[phil-pa-1-device-update]}), isInitiatedByNetworkClient,
createdTimeMs=1460108373698, sendTimeMs=0) to node 0
2016-04-08 10:39:33,698 DEBUG [kafka-producer-network-thread | phil-pa-1]
org.apache.kafka.clients.Metadata: Updated cluster metadata version 249 to
Cluster(nodes = [Node(0, ta-eng-kafka2, 9092)], partitions = [Partition(topic =
phil-pa-1-device-update, partition = 0, leader = 0, replicas = [0,], isr =
[0,]])
These metadata requests continue to be sent every 100ms (retry.backoff.ms)
until we stop the process.
This only seems to happen if the KafkaProducer instance is created but not used
to publish a message for 5 minutes. After 5 minutes (metadata.max.age.ms) the
producer thread sends a metadata request to the server that has an empty topics
list and the server responds with the partition information for *all* topics
hosted on the server.
2016-04-11 14:16:39,320 DEBUG [kafka-producer-network-thread | phil-pa-1]
org.apache.kafka.clients.NetworkClient: Sending metadata request
ClientRequest(expectResponse=true, callback=null,
request=RequestSend(header={api_key=3,api_version=0,correlation_id=0,client_id=phil-pa-1},
body={topics=[]}), isInitiatedByNetworkClient, createdTimeMs=1460380599289,
sendTimeMs=0) to node -1
If we then use that KafkaProducer instance to send a message the next 'Sending
meta request' will just be for the topic we have sent the message to and this
then triggers the flood of retry requests as noted above.
If we ensure we send the first message within the time set by
metadata.max.age.ms (default 5 minutes) then everything works as expected and
the metadata requests do not continually get retried.
In many cases I can understand that creating a KafkaProducer and then not using
it within 5 minutes is not usual but in our case we're creating it when our
REST based application starts up and we can't guarantee that a message will be
published within that time. To get around this we are currently posting a test
message to the topic right after creating the KafkaProducer prevents it
happening.
I'm not sure if this is worthy of a fix but I thought I'd post it here in case
someone else hits the same problem.
Regards,
Phil Luckhurst