Hi,

During a round of Kafka data discrepancy investigation I came across a bunch of recurring errors, shown below:
producer.log:

  2015-06-14 13:06:25,591 WARN [task-thread-9] (k.p.a.DefaultEventHandler:83) - Produce request with correlation id 624 failed due to [mytopic,21]: kafka.common.NotLeaderForPartitionException

kafka.log:

  [2015-06-14 13:05:13,025] 418953499 [request-expiration-task] WARN kafka.server.ReplicaManager - [Replica Manager on Broker 61]: Fetch request with correlation id 1 from client fetchReq on partition [mytopic,21] failed due to Leader not local for partition [mytopic,21] on broker 61

state-change.log:

  [2015-06-14 13:05:11,495] WARN Broker 29 ignoring LeaderAndIsr request from controller 45 with correlation id 41799 epoch 27 for partition [mytopic,21] since its associated leader epoch 191 is old. Current leader epoch is 191 (state.change.logger)

These warnings repeat several times a day, and sometimes they coincide with the timestamps of presumably missing records. As far as I understand, an occasional NotLeaderForPartitionException is fine, but does the same apply to the "old leader epoch" warning? Could it be caused by a ZooKeeper issue? However, I don't see anything particularly interesting in the ZK logs, except "likely client has closed socket" or "Unexpected Exception: java.nio.channels.CancelledKeyException". The former is all over the log (must be a client issue), and the latter are rare and not correlated with the original warnings.

Thanks,

Here are some bits of configuration:

  Kafka 0.8.1.2, 3 brokers + 3 zk, 2x replication
  request.required.acks=1
  retry.backoff.ms=1000
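In case it helps, the producer is set up roughly along these lines (a minimal sketch using the 0.8 producer API; the broker list, topic name, and serializer are placeholders, and anything not listed in the configuration above is left at its default):

  import java.util.Properties;

  import kafka.javaapi.producer.Producer;
  import kafka.producer.KeyedMessage;
  import kafka.producer.ProducerConfig;

  public class ProducerSketch {
      public static void main(String[] args) {
          Properties props = new Properties();
          // Placeholder broker list; the real cluster has 3 brokers.
          props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
          props.put("serializer.class", "kafka.serializer.StringEncoder");
          // acks=1: only the partition leader acknowledges, so a message can be
          // lost if the leader fails before followers have copied it.
          props.put("request.required.acks", "1");
          // Wait 1s before retrying after a failure such as
          // NotLeaderForPartitionException.
          props.put("retry.backoff.ms", "1000");
          // Left at the 0.8.x default number of resends.
          props.put("message.send.max.retries", "3");

          Producer<String, String> producer =
                  new Producer<String, String>(new ProducerConfig(props));
          producer.send(new KeyedMessage<String, String>("mytopic", "key", "value"));
          producer.close();
      }
  }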