On `kafka_2.11-1.0.1-d04daf570` we are upgrading the log message format from 0.9.0.1 to 0.11.0.1, and after the upgrade we have set `inter.broker.protocol.version=1.0` and `log.message.format.version=0.11.0.1`.
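For completeness, this is roughly what the relevant part of each broker's `server.properties` looks like after the upgrade; only the two version settings are our real values, the comments are just annotation:

```properties
# Broker settings after the rolling upgrade (all other entries omitted).
# Inter-broker protocol is already on 1.0; the on-disk message format
# is held at 0.11.0.1, having been upgraded from 0.9.0.1.
inter.broker.protocol.version=1.0
log.message.format.version=0.11.0.1
```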
We have applied this upgrade to 5 clusters by upgrading broker 1, leaving it for a day, then coming back, when happy, to upgrade the remaining brokers. 4 of those upgrades went without issue. However, in one cluster, after we upgraded the remaining brokers we started seeing these errors on broker 1, for 4 consumer offset partitions, all of which happen to be led by broker 1:

kafka-request-handler-3 72 ERROR kafka.server.ReplicaManager 2017-12-15T07:39:40.380+0000 [ReplicaManager broker=1] Error processing fetch operation on partition __consumer_offsets-21 offset 200349244
kafka-request-handler-3 72 ERROR kafka.server.ReplicaManager 2017-12-15T07:39:40.381+0000 [ReplicaManager broker=1] Error processing fetch operation on partition __consumer_offsets-11 offset 188709568
kafka-request-handler-3 72 ERROR kafka.server.ReplicaManager 2017-12-15T07:39:40.381+0000 [ReplicaManager broker=1] Error processing fetch operation on partition __consumer_offsets-1 offset 2045483676
kafka-request-handler-5 74 ERROR kafka.server.ReplicaManager 2017-12-15T07:39:41.672+0000 [ReplicaManager broker=1] Error processing fetch operation on partition __consumer_offsets-31 offset 235294887

These appear every second or so. If we stop that broker, the errors simply shift to the next leader for those 4 partitions, and moving the partitions to completely new brokers just moves the errors with them. We only see this on kafka1, not on the other 9 brokers, which had the log message format upgraded a day or two later.

Any suggestion on how to proceed? I'm not even sure yet whether this is isolated to the cluster, or whether it's related to a consumer misbehaving. Since our multiple clusters /should/ have the same set of producers/consumers working on them, I'm doubtful that it's a misbehaving client.
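For what it's worth, the leadership observation above can be double-checked with the stock tooling; something along these lines (the ZooKeeper address is a placeholder, not our actual host) lists the leader, replicas and ISR for every `__consumer_offsets` partition:

```sh
# Show leader/replicas/ISR per partition of the offsets topic;
# in our case partitions 1, 11, 21 and 31 all report Leader: 1.
bin/kafka-topics.sh --zookeeper zk1:2181 \
  --describe --topic __consumer_offsets
```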
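And for reference, a partition move of the kind mentioned above is typically driven by `kafka-reassign-partitions.sh`; a sketch of what moving one of the affected partitions looks like (the broker IDs, ZooKeeper address and file name below are illustrative, not our actual values):

```sh
# reassign.json (illustrative): move __consumer_offsets-21 onto brokers 7, 8, 9.
cat > reassign.json <<'EOF'
{
  "version": 1,
  "partitions": [
    { "topic": "__consumer_offsets", "partition": 21, "replicas": [7, 8, 9] }
  ]
}
EOF

# Start the reassignment.
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --execute

# Re-run with --verify until it reports the reassignment as completed.
bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassign.json --verify
```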