Hi All,
For one of my Kafka producers, I see that the Producer Record Error rate
metric is non-zero: out of the roughly 3000 messages per second that I expect
to produce to the topic, this metric shows a rate of about 200.
Does this mean that those records failed to be sent to the Kafka topic, or is
this metric also incremented for each retry within the Producer.send operation?
Notes:
1) I have distributed 8 brokers equally across 2 sites. Using rack-awareness,
I make Kafka place replicas equally across both sites. I use
min.insync.replicas=2 and a replication factor of 4, so 2 replicas are located
in each site.
2) The scenario I am testing is shutting down the set of 4 brokers in one site
(out of 8) for an extended period and then bringing them back up after, say,
2 hours. This causes the follower replicas on those brokers to try to catch up
with the leader replicas on the other brokers. The error rate that I am
referring to shows up in this scenario, when the brokers are restarted. It
does not show up while only the other set of 4 brokers is running.
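As a quick sanity check of the numbers in the notes above, here is a minimal sketch of the replica arithmetic. It assumes the producer uses acks=all and that every replica on the surviving site stays in sync; the function names are hypothetical and not part of any Kafka API.

```python
def replicas_per_site(replication_factor: int, sites: int) -> int:
    """Replicas per site when rack-awareness spreads them evenly."""
    if replication_factor % sites != 0:
        raise ValueError("replication factor does not divide evenly across sites")
    return replication_factor // sites


def writable_with_sites_down(replication_factor: int, sites: int,
                             min_isr: int, sites_down: int) -> bool:
    """True if the surviving replicas still satisfy min.insync.replicas.

    Assumption: acks=all, and all replicas on the surviving sites are in sync.
    """
    surviving = replicas_per_site(replication_factor, sites) * (sites - sites_down)
    return surviving >= min_isr


# Setup from the notes: replication factor 4, 2 sites, min.insync.replicas=2.
print(replicas_per_site(4, 2))                                  # 2
print(writable_with_sites_down(4, 2, min_isr=2, sites_down=1))  # True
```

Under these assumptions, the 2 replicas left on the surviving site exactly meet min.insync.replicas=2, which matches the observation that no errors appear while only that site is running.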
To be specific, here are the errors that I see in the Kafka producer log file:
2022-04-03 15:56:39.613 WARN --- [-thread | producer-1] o.a.k.c.p.i.Sender
: [Producer clientId=producer-1] Got error produce response
with correlation id 16512434 on topic-partition input-topic-114, retrying
(2147483646 attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER
2022-04-03 15:56:39.613 WARN --- [-thread | producer-1] o.a.k.c.p.i.Sender
: [Producer clientId=producer-1] Got error produce response
with correlation id 16512434 on topic-partition input-topic-58, retrying
(2147483646 attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER
2022-04-03 15:56:39.613 INFO --- [-thread | producer-1]
o.a.k.c.p.i.TransactionManager : [Producer clientId=producer-1]
ProducerId set to 2040 with epoch 159
2022-04-03 15:56:39.613 INFO --- [-thread | producer-1]
o.a.k.c.p.i.ProducerBatch : Resetting sequence number of batch
with current sequence 3 for partition input-topic-114 to 0
2022-04-03 15:56:39.613 INFO --- [-thread | producer-1]
o.a.k.c.p.i.ProducerBatch : Resetting sequence number of batch
with current sequence 5 for partition input-topic-114 to 2
2022-04-03 15:56:39.613 INFO --- [-thread | producer-1]
o.a.k.c.p.i.ProducerBatch : Resetting sequence number of batch
with current sequence 6 for partition input-topic-114 to 3
2022-04-03 15:56:39.613 INFO --- [-thread | producer-1]
o.a.k.c.p.i.ProducerBatch : Resetting sequence number of batch
with current sequence 1 for partition input-topic-58 to 0
2022-04-03 15:56:39.739 WARN --- [-thread | producer-1] o.a.k.c.p.i.Sender
: [Producer clientId=producer-1] Got error produce response
with correlation id 16512436 on topic-partition input-topic-82, retrying
(2147483646 attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER
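To compare the number of logged retries with the error-rate metric, the warnings above could be tallied per partition with something like the following sketch. It assumes the log format shown above (including the hard line wrapping) and only counts the "Got error produce response ... retrying" warnings; the names RETRY_PATTERN and count_retries are hypothetical, and the sample text is copied from the excerpt above.

```python
import re
from collections import Counter

# Matches the Sender retry warning once wrapped lines are joined by spaces.
RETRY_PATTERN = re.compile(
    r"Got error produce response with correlation id (\d+) "
    r"on topic-partition (\S+), retrying \((\d+) attempts left\)\. "
    r"Error: (\w+)"
)


def count_retries(log_text: str) -> Counter:
    """Count retry warnings per (topic-partition, error code)."""
    # Collapse all whitespace so entries wrapped over several lines still match.
    normalized = " ".join(log_text.split())
    counts = Counter()
    for _corr_id, partition, _attempts_left, error in RETRY_PATTERN.findall(normalized):
        counts[(partition, error)] += 1
    return counts


sample = """
2022-04-03 15:56:39.613 WARN --- [-thread | producer-1] o.a.k.c.p.i.Sender
: [Producer clientId=producer-1] Got error produce response
with correlation id 16512434 on topic-partition input-topic-114, retrying
(2147483646 attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER
2022-04-03 15:56:39.613 WARN --- [-thread | producer-1] o.a.k.c.p.i.Sender
: [Producer clientId=producer-1] Got error produce response
with correlation id 16512434 on topic-partition input-topic-58, retrying
(2147483646 attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER
2022-04-03 15:56:39.739 WARN --- [-thread | producer-1] o.a.k.c.p.i.Sender
: [Producer clientId=producer-1] Got error produce response
with correlation id 16512436 on topic-partition input-topic-82, retrying
(2147483646 attempts left). Error: OUT_OF_ORDER_SEQUENCE_NUMBER
"""

for (partition, error), n in sorted(count_retries(sample).items()):
    print(partition, error, n)
```

Running this over the full producer log for the restart window would give a per-partition retry count that can be set against the reported rate of about 200 per second.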
Regards,
Neeraj