Kafka client troubles when restarting a broker

Andrey Dyachkov Thu, 27 Dec 2018 07:14:54 -0800

Hi guys,

When a broker goes down for a restart, both running and newly started
consumers begin to fail with:
[Consumer clientId=consumer-42169, groupId=] Connection to node
67108872 could not be established. Broker may not be available.
[Consumer clientId=consumer-46213, groupId=] Connection to node -4
could not be established. Broker may not be available.
This continues until the broker is up and running again.


After the broker comes back up, the producer is unable to publish
messages:
Caused by: org.apache.kafka.common.errors.TimeoutException: Failed to
update metadata after 5000 ms.
Recreating the producer helps in that case.

Moreover, if I restart only one broker, I see everything described
above and the cluster then returns to a normal state. If I do a rolling
restart, the Kafka consumer is not able to recover from it ('...Failed
to update metadata...') and I have to restart the machines running the
consumers completely.

Along the way, I continuously see:
INFO Created a new error FetchContext for session id 2070234568: no
such session ID found.
INFO [ReplicaFetcher replicaId=67108868, leaderId=67108871,
fetcherId=2] Node 67108871 was unable to process the fetch request
with (sessionId=1942914170, epoch=8716): FETCH_SESSION_ID_NOT_FOUND.
(org.apache.kafka.clients.FetchSessionHandler)

Could you please clarify what is wrong here, and how I can improve the
cluster's behaviour when restarting one broker? I have noticed this
behaviour with version 1.1.1; the previously running 0.10.1.1 switched
leadership just fine and metadata was updated in time.

The cluster has 6 nodes with a replication factor of 3,
min.insync.replicas of 2, and acks set to all.
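
For illustration, the settings above correspond roughly to the sketch
below. It uses plain string keys so it compiles without the Kafka
client on the classpath. The class name, method names, and broker
addresses are placeholders, and max.block.ms=5000 is only an assumption
inferred from the "after 5000 ms" in the timeout message. It is not the
actual application code.

```java
import java.util.Properties;

// Sketch only: hypothetical class, not taken from the real application.
final class ClusterSettingsSketch {

    // Producer-side settings matching the setup described above.
    static Properties producerProps() {
        Properties props = new Properties();
        // Placeholder addresses, not the real cluster.
        props.setProperty("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        // From the description: wait for all in-sync replicas to acknowledge.
        props.setProperty("acks", "all");
        // Assumption: inferred from "Failed to update metadata after 5000 ms".
        props.setProperty("max.block.ms", "5000");
        return props;
    }

    // With acks=all, writes to a partition keep succeeding as long as at
    // least minInsyncReplicas replicas are in sync, so this many replicas
    // of a partition may be offline at the same time.
    static int toleratedReplicaLosses(int replicationFactor, int minInsyncReplicas) {
        return Math.max(0, replicationFactor - minInsyncReplicas);
    }
}
```

With a replication factor of 3 and min.insync.replicas of 2,
toleratedReplicaLosses(3, 2) returns 1, so restarting a single broker
at a time should not by itself block acks=all writes.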

-- 
Thanks,
Andrey
