Hi community,

My team is facing a very odd issue and we still can't find its cause. About once every two weeks, our producers get timeouts while sending data to, or requesting metadata from, the Kafka cluster. In most cases, the affected partitions had their leaders on the same Kafka broker. During these incidents we found nothing unusual in our monitoring systems for any of the Kafka brokers. The logs contained no errors or unusual records.
The producer errors look like this:

18-05-11 00:03:47:099 [ERROR] [kafka-producer-network-thread | producer-1] KafkaSenderLog Error occurred at sending an event to the bus: error='Batch containing 6 record(s) expired due to timeout while requesting metadata from brokers for events-1' , key='1' , time=40 , event={***}
18-04-29 02:18:17,112 ApplicationMsg=Error occurred at sending an event to the bus: error='Expiring 1 record(s) for events-8: 5007 ms has passed since batch creation plus linger time' , key='14337/aba54b157688238988054301ba15372b' , time=5018 , event={***}
18-05-09 15:43:56.031 KafkaSenderLog kafka-producer-network-thread | producer-52 [ERROR] Error occurred at sending an event to the bus: error='Batch containing 1 record(s) expired due to timeout while requesting metadata from brokers for tmm_events-6' , key='97031345b5f3c9f464eb2313a8febd148b21c354-46d74f8c9cee' , time=10017 , event={***}

I have removed the event contents from the listing above.

We are using Apache Kafka 0.10.2.1 with the official Java Kafka client library. I tried to reproduce the problem with the console producer script that comes with the Kafka package. I copied our production producer settings and used a dedicated topic with special data rotation settings. I was able to reproduce it: I got timeouts only while sending data, and again they happened rarely, with no visible issues on the Kafka brokers or their instances.
[2018-05-12 06:50:42,133] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-8: 5023 ms has passed since last append
[2018-05-12 06:50:42,133] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-9: 5036 ms has passed since last append
[2018-05-12 06:50:42,134] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-9: 5006 ms has passed since last append
[2018-05-12 06:50:42,134] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-2: 5022 ms has passed since last append
[2018-05-12 06:50:42,134] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-3: 5035 ms has passed since last append
[2018-05-12 06:50:42,134] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-3: 5006 ms has passed since last append
[2018-05-12 06:50:42,134] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-4: 5015 ms has passed since last append
[2018-05-12 06:50:42,134] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-5: 5016 ms has passed since last append
[2018-05-14 23:45:07,973] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-0: 5010 ms has passed since last append
[2018-05-14 23:45:07,998] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-1: 5039 ms has passed since last append
[2018-05-14 23:45:07,998] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-10: 5037 ms has passed since last append
[2018-05-14 23:45:08,008] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-11: 5056 ms has passed since last append
[2018-05-14 23:45:08,008] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-11: 5006 ms has passed since last append
[2018-05-14 23:45:08,011] ERROR Error when sending message to topic itops-test with key: null, value: 10240 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback) org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for itops-test-6: 5012 ms has passed since last append

My test script contains:

head -c1000000 /dev/urandom | tr -dc 'a-zA-Z0-9~!@#$%^&*_-' | fold -w10240 | /usr/local/kafka/bin/kafka-console-producer.sh --broker-list broker1.aws:9093,broker2.aws:9093,broker3.aws:9093 --max-block-ms 3000 --metadata-expiry-ms 1000 --request-timeout-ms 5000 --message-send-max-retries 10 --max-memory-bytes 104857600 --batch-size 65536 --topic itops-test 2>>/var/log/kafka-timeouts-test/stderr.log >>/var/log/kafka-timeouts-test/stdout.log

It runs every 30 seconds.

Could you please help me with this investigation, or give some advice that could help with it?

Thank you
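To check whether the timeouts really cluster on partitions led by one broker, the stderr log could be summarized per partition and then grouped by leader. Below is a minimal sketch of that idea, not a finished tool. The function names `summarize` and `by_leader` and the `leaders` mapping are hypothetical. The broker assignments in the example are made up for illustration and would have to be replaced with the real ones, for example from the output of `kafka-topics.sh --describe`. The regular expression only covers the "Expiring N record(s) for ..." message quoted above.

```python
import re
from collections import Counter

# Matches the "Expiring N record(s) for <topic>-<partition>: <ms> ms has
# passed since ..." message from the logs above. It uses finditer, so it
# also works when several log entries are flattened onto one line.
EXPIRY = re.compile(
    r"Expiring (?P<n>\d+) record\(s\) for (?P<topic>[\w.-]+)-(?P<part>\d+):\s*"
    r"(?P<ms>\d+) ms has passed since (?P<since>[a-z ]+)"
)


def summarize(log_text):
    """Return (expired records per (topic, partition), list of elapsed ms)."""
    per_partition = Counter()
    elapsed = []
    for m in EXPIRY.finditer(log_text):
        per_partition[(m["topic"], int(m["part"]))] += int(m["n"])
        elapsed.append(int(m["ms"]))
    return per_partition, elapsed


def by_leader(per_partition, leaders):
    """Group expired-record counts by leader broker.

    `leaders` maps (topic, partition) -> broker name. It is an assumption
    of this sketch and must be filled in from the real cluster state.
    """
    totals = Counter()
    for tp, count in per_partition.items():
        totals[leaders.get(tp, "unknown")] += count
    return totals


# Three entries copied from the test log above (shortened to the message).
sample = (
    "TimeoutException: Expiring 1 record(s) for itops-test-8: 5023 ms has passed since last append\n"
    "TimeoutException: Expiring 1 record(s) for itops-test-9: 5036 ms has passed since last append\n"
    "TimeoutException: Expiring 1 record(s) for itops-test-9: 5006 ms has passed since last append\n"
)

# Hypothetical leader assignment, for illustration only.
example_leaders = {("itops-test", 8): "broker1", ("itops-test", 9): "broker2"}

per_partition, elapsed = summarize(sample)
print(dict(per_partition))
print(dict(by_leader(per_partition, example_leaders)))
print("shortest wait before expiry (ms):", min(elapsed))
```

One thing such a summary makes visible: every elapsed time in the test log is just above 5000 ms, which matches the `--request-timeout-ms 5000` setting in the test script. As far as I understand, the 0.10.x producer expires batches that have waited in its buffer longer than `request.timeout.ms`, so the batches seem to be stuck on the client side rather than being rejected by a broker.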