Zane Zhang created KAFKA-4071:
---------------------------------

             Summary: Corruptted replication-offset-checkpoint leads to kafka server disfunctional
                 Key: KAFKA-4071
                 URL: https://issues.apache.org/jira/browse/KAFKA-4071
             Project: Kafka
          Issue Type: Bug
          Components: clients, offset manager
    Affects Versions: 0.9.0.1
         Environment: Red Hat Enterprise 6.7
            Reporter: Zane Zhang

For an unknown reason, [kafka data root]/replication-offset-checkpoint was corrupted. Kafka first reported a NumberFormatException in server.out, and then repeatedly reported "error when handling request Name: FetchRequest; ... " ERRORs (details below). As a result, clients could not read from or write to Kafka on several partitions until replication-offset-checkpoint was manually deleted.

ERROR [KafkaApi-7] error when handling request
java.lang.NumberFormatException: For input string: " N?-; O"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:77)
        at java.lang.Integer.parseInt(Integer.java:493)
        at java.lang.Integer.parseInt(Integer.java:539)
        at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
        at scala.collection.immutable.StringOps.toInt(StringOps.scala:30)
        at kafka.server.OffsetCheckpoint.read(OffsetCheckpoint.scala:78)
        at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:93)
        at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:173)
        at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:173)
        at scala.collection.immutable.Set$Set2.foreach(Set.scala:111)
        at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:173)

ERROR [KafkaApi-7] error when handling request Name: FetchRequest; Version: 1; CorrelationId: 0; ClientId: ReplicaFetcherThread-1-7; ReplicaId: 6; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [prodTopicDal09E,166] -> PartitionFetchInfo(7123666,20971520),[prodTopicDal09E,118] -> PartitionFetchInfo(7128188,20971520),[prodTopicDal09E,238] ->
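
To tell whether a checkpoint file is in this state before deleting it, a small offline check along the following lines might help. This is only a sketch and not part of Kafka: the function names are hypothetical, and the file layout it checks (a version line, an entry-count line, then one "topic partition offset" line per entry) is an assumption about the 0.9.x checkpoint format that should be verified against a healthy broker's file.

```python
import re
from typing import List

_DIGITS = re.compile(r"[0-9]+")


def _is_int(text: str) -> bool:
    """True if text is a plain non-negative decimal integer."""
    return _DIGITS.fullmatch(text) is not None


def find_checkpoint_problems(lines: List[str]) -> List[str]:
    """Return human-readable problems found in a checkpoint file's lines.

    Assumed layout (not taken from the report): line 1 is a format version,
    line 2 is the number of entries, and every following non-blank line is
    "<topic> <partition> <offset>".
    """
    if len(lines) < 2:
        return ["fewer than 2 lines (version and entry count expected)"]

    problems = []
    # Header lines must be integers; a non-numeric value here is the kind
    # of input that makes an integer parse throw NumberFormatException.
    for label, raw in (("version", lines[0]), ("entry count", lines[1])):
        if not _is_int(raw):
            problems.append(f"{label} is not an integer: {raw!r}")

    seen = 0
    for number, entry in enumerate(lines[2:], start=3):
        if not entry.strip():
            continue  # ignore blank lines
        seen += 1
        parts = entry.split()
        if len(parts) != 3 or not _is_int(parts[1]) or not _is_int(parts[2]):
            problems.append(f"line {number} is malformed: {entry!r}")

    # Only compare counts when the declared count itself was readable.
    if _is_int(lines[1]) and int(lines[1]) != seen:
        problems.append(f"declared {lines[1]} entries but found {seen}")
    return problems


def check_file(path: str) -> List[str]:
    """Read a checkpoint file leniently so corrupted bytes do not raise."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_checkpoint_problems(handle.read().splitlines())
```

Calling check_file on a copy of [kafka data root]/replication-offset-checkpoint returns an empty list when nothing looks wrong under the assumed layout, and one message per suspicious line otherwise.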