[ https://issues.apache.org/jira/browse/KAFKA-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431056#comment-15431056 ]
Alexey Ozeritskiy commented on KAFKA-4071:
------------------------------------------

Thanks [~zanezhang]. Could you attach replication-offset-checkpoint? It would be interesting to see its original binary content.

> Corruptted replication-offset-checkpoint leads to kafka server disfunctional
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-4071
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4071
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, offset manager
>    Affects Versions: 0.9.0.1
>         Environment: Red Hat Enterprise 6.7
>            Reporter: Zane Zhang
>            Priority: Critical
>
> For an unknown reason, [kafka data root]/replication-offset-checkpoint was
> corrupted. First, Kafka reported a NumberFormatException in the Kafka
> server.out. It then reported "error when handling request Name: FetchRequest; ... "
> ERRORs repeatedly (ERROR details below). As a result, clients could not read
> from or write to Kafka on several partitions until
> replication-offset-checkpoint was manually deleted.
> Can the Kafka broker handle this error and recover from it?
> And why was this file corrupted? Only one file was corrupted
> and no noticeable disk failure was detected.
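On the question of whether the broker could survive this error: below is a minimal sketch of defensive parsing, not the actual kafka.server.OffsetCheckpoint code. It assumes the checkpoint layout is a version line, an entry-count line, and then one "topic partition offset" line per entry. The class name CheckpointReader and the "topic-partition" map key are hypothetical. Any malformed content yields an empty result instead of a NumberFormatException reaching the request handler.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; this is not Kafka's OffsetCheckpoint.
final class CheckpointReader {

    // Assumed layout: line 1 = version (0), line 2 = number of entries,
    // then one "topic partition offset" line per entry.
    // Returns Optional.empty() when the content does not match that layout.
    static Optional<Map<String, Long>> parse(List<String> lines) {
        try {
            if (lines.size() < 2) {
                return Optional.empty();
            }
            int version = Integer.parseInt(lines.get(0).trim());
            int expected = Integer.parseInt(lines.get(1).trim());
            if (version != 0 || expected < 0 || lines.size() - 2 != expected) {
                return Optional.empty();
            }
            Map<String, Long> offsets = new HashMap<>();
            for (int i = 0; i < expected; i++) {
                String[] parts = lines.get(i + 2).trim().split("\\s+");
                if (parts.length != 3) {
                    return Optional.empty();
                }
                int partition = Integer.parseInt(parts[1]);
                long offset = Long.parseLong(parts[2]);
                if (partition < 0 || offset < 0) {
                    return Optional.empty();
                }
                // Key format "topic-partition" is an arbitrary choice for this sketch.
                offsets.put(parts[0] + "-" + partition, offset);
            }
            return Optional.of(offsets);
        } catch (NumberFormatException e) {
            // Corrupted bytes in a numeric field end up here instead of propagating.
            return Optional.empty();
        }
    }
}
```

A caller could feed it the result of java.nio.file.Files.readAllLines on the checkpoint file and, on an empty result, log the problem and fall back to some recovery policy. What the right fallback is (for example, treating the high watermarks as unknown) is a separate design question that this sketch does not answer.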
> ERROR [KafkaApi-7] error when handling request
> java.lang.NumberFormatException: For input string: " N?-; O"
>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:77)
>         at java.lang.Integer.parseInt(Integer.java:493)
>         at java.lang.Integer.parseInt(Integer.java:539)
>         at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)
>         at scala.collection.immutable.StringOps.toInt(StringOps.scala:30)
>         at kafka.server.OffsetCheckpoint.read(OffsetCheckpoint.scala:78)
>         at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:93)
>         at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:173)
>         at kafka.cluster.Partition$$anonfun$4$$anonfun$apply$2.apply(Partition.scala:173)
>         at scala.collection.immutable.Set$Set2.foreach(Set.scala:111)
>         at kafka.cluster.Partition$$anonfun$4.apply(Partition.scala:173)
> ERROR [KafkaApi-7] error when handling request Name: FetchRequest; Version:
> 1; CorrelationId: 0; ClientId: ReplicaFetcherThread-1-7; ReplicaId: 6;
> MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [prodTopicDal09E,166] ->
> PartitionFetchInfo(7123666,20971520),[prodTopicDal09E,118] ->
> PartitionFetchInfo(7128188,20971520),[prodTopicDal09E,238] ->

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)