[
https://issues.apache.org/jira/browse/KAFKA-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Gustafson resolved KAFKA-8722.
------------------------------------
Resolution: Fixed
Fix Version/s: (was: 0.10.2.2)
0.10.2.3
> Data crc check repair
> ---------------------
>
> Key: KAFKA-8722
> URL: https://issues.apache.org/jira/browse/KAFKA-8722
> Project: Kafka
> Issue Type: Improvement
> Components: log
> Affects Versions: 0.10.2.2
> Reporter: ChenLin
> Priority: Major
> Fix For: 0.10.2.3
>
> Attachments: image-2019-07-27-14-50-08-128.png,
> image-2019-07-27-14-50-58-300.png, image-2019-07-27-14-56-25-610.png,
> image-2019-07-27-14-57-06-687.png, image-2019-07-27-15-05-12-565.png,
> image-2019-07-27-15-06-07-123.png, image-2019-07-27-15-10-21-709.png,
> image-2019-07-27-15-18-22-716.png, image-2019-07-30-11-39-01-605.png
>
>
> In our production environment, while consuming data from a Kafka topic in a
> running application, we encountered the following error:
> org.apache.kafka.common.KafkaException: Record for partition
> rl_dqn_debug_example-49 at offset 2911287689 is invalid, cause: Record is
> corrupt (stored crc = 3580880396, computed crc = 1701403171)
> at
> org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:869)
> at
> org.apache.kafka.clients.consumer.internals.Fetcher.parseCompletedFetch(Fetcher.java:788)
> at
> org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:480)
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1188)
> at
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1046)
> at kafka.consumer.NewShinyConsumer.receive(BaseConsumer.scala:88)
> at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:120)
> at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:75)
> at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
> at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
> At this point we used the kafka.tools.DumpLogSegments tool to parse the
> on-disk log file and confirmed that it did contain dirty data:
> !image-2019-07-27-14-57-06-687.png!
> By examining the code, we found that in some cases Kafka writes data to disk
> without verifying it, so we fixed this.
> We found that when record.offset is not equal to the expected offset, Kafka
> sets the variable inPlaceAssignment to false, and when inPlaceAssignment is
> false the data is not verified:
> !image-2019-07-27-14-50-58-300.png!
> !image-2019-07-27-14-50-08-128.png!
> Our repairs are as follows:
> !image-2019-07-30-11-39-01-605.png!
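The spirit of the fix, checking each record's CRC before it is accepted for the log even when offsets must be reassigned, can be sketched as follows. This is a minimal illustration using java.util.zip.CRC32; the class and method names are hypothetical stand-ins, not Kafka's actual LogValidator code:

```java
import java.util.zip.CRC32;

// Hypothetical sketch of per-record CRC validation; not Kafka's real API.
public class RecordCrcCheck {

    // Compute a CRC32 over the record payload (a stand-in for the checksum
    // Kafka stores in each record's header).
    static long computeCrc(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Validate the stored CRC against a freshly computed one. In the spirit
    // of the fix, this check would run regardless of whether offsets are
    // assigned in place or reassigned (inPlaceAssignment true or false).
    static boolean isValid(byte[] payload, long storedCrc) {
        return computeCrc(payload) == storedCrc;
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes();
        long stored = computeCrc(payload);
        System.out.println(isValid(payload, stored)); // true: intact record

        payload[0] ^= 0x01; // simulate on-the-wire corruption of one bit
        System.out.println(isValid(payload, stored)); // false: write rejected
    }
}
```

With a check like this on every write path, a corrupt record is rejected at produce time instead of being discovered later by a consumer.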
> We ran a comparative test of this fix. By modifying the client-side producer
> code, we generated some dirty data. The original Kafka version wrote it to
> disk normally, and the corruption was only reported on consumption; our
> repaired version validates the data at write time, so the producer write
> failed:
> !image-2019-07-27-15-05-12-565.png!
> When a client then consumes this data, an error is reported:
> !image-2019-07-27-15-06-07-123.png!
> When the Kafka broker is replaced with the repaired version, it verifies the
> dirty data as it is written, and the producer write fails:
> !image-2019-07-27-15-10-21-709.png!
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)