[ https://issues.apache.org/jira/browse/KAFKA-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17898039#comment-17898039 ]

Teddy Yan commented on KAFKA-9613:
----------------------------------

The replica got the following errors, and yes, it becomes an out-of-sync replica. 
We have `min.insync.replicas` set on the topic, and the retention is around 2 hours, 
but we must wait those 2 hours before we can move on. Can Kafka skip the corrupt 
records if we are willing to lose some data?

```
[2024-11-13 20:09:26,028] ERROR [ReplicaFetcher replicaId=334945090, leaderId=191572054, fetcherId=0] Error for partition df-flow-3 at offset 185972321 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.CorruptRecordException: This message has failed its CRC checksum, exceeds the valid size, has a null key for a compacted topic, or is otherwise corrupt.
```
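
If losing the data is acceptable, one workaround that might help is shortening the 
topic retention so the corrupt segment ages out sooner than the 2 hours. A rough 
sketch (the bootstrap address and the temporary 10-minute value are placeholders; 
`df-flow` is the topic from the error above):
```
# Temporarily shorten retention so the corrupt segment becomes eligible for
# deletion sooner.
bin/kafka-configs.sh --bootstrap-server broker1:9092 --alter \
  --entity-type topics --entity-name df-flow \
  --add-config retention.ms=600000

# Restore the original ~2 hour retention once the corrupt segment is gone.
bin/kafka-configs.sh --bootstrap-server broker1:9092 --alter \
  --entity-type topics --entity-name df-flow \
  --add-config retention.ms=7200000
```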

I tried to use delete-records to skip past the corrupt part of the log on the bad 
disk, but the request timed out. I don't know why it times out. *Might we fix it?* 
With a slightly different offset it does not time out. So it looks like we can't 
delete the corrupt records, but retention can.
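
For reference, this is roughly the delete-records call I mean (the bootstrap 
address and file path are placeholders; the topic, partition, and offset come from 
the fetcher error above, with the offset set one past the corrupt record):
```
cat > delete-offsets.json <<'EOF'
{
  "version": 1,
  "partitions": [
    { "topic": "df-flow", "partition": 3, "offset": 185972322 }
  ]
}
EOF

# Advance the log start offset past the corrupt record; this is the request
# that times out for me.
bin/kafka-delete-records.sh --bootstrap-server broker1:9092 \
  --offset-json-file delete-offsets.json
```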

 

It's easy to reproduce. Break the log using the following command.
```
# Prepend a stray zero byte to the segment (write to a temp file first;
# redirecting back into the same file would truncate it before tail reads it).
root@a1:/home/support# { printf '\x00'; tail -c +1 00000000000185492119.log; } > 00000000000185492119.log.bad
root@a1:/home/support# mv 00000000000185492119.log.bad 00000000000185492119.log
```
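
You can then confirm the segment is unreadable with kafka-dump-log (the log 
directory path is just an example):
```
# Dumping the corrupted segment should fail on the broken batch.
bin/kafka-dump-log.sh --files /data/kafka-logs/df-flow-3/00000000000185492119.log
```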

!image-2024-11-13-14-02-45-768.png!

> CorruptRecordException: Found record size 0 smaller than minimum record 
> overhead
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-9613
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9613
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.6.2
>            Reporter: Amit Khandelwal
>            Assignee: hudeqi
>            Priority: Major
>         Attachments: image-2024-11-13-14-02-45-768.png
>
>
> 20200224;21:01:38: [2020-02-24 21:01:38,615] ERROR [ReplicaManager broker=0] 
> Error processing fetch with max size 1048576 from consumer on partition 
> SANDBOX.BROKER.NEWORDER-0: (fetchOffset=211886, logStartOffset=-1, 
> maxBytes=1048576, currentLeaderEpoch=Optional.empty) 
> (kafka.server.ReplicaManager)
> 20200224;21:01:38: org.apache.kafka.common.errors.CorruptRecordException: 
> Found record size 0 smaller than minimum record overhead (14) in file 
> /data/tmp/kafka-topic-logs/SANDBOX.BROKER.NEWORDER-0/00000000000000000000.log.
> 20200224;21:05:48: [2020-02-24 21:05:48,711] INFO [GroupMetadataManager 
> brokerId=0] Removed 0 expired offsets in 1 milliseconds. 
> (kafka.coordinator.group.GroupMetadataManager)
> 20200224;21:10:22: [2020-02-24 21:10:22,204] INFO [GroupCoordinator 0]: 
> Member 
> xxxxxxxx_011-9e61d2c9-ce5a-4231-bda1-f04e6c260dc0-StreamThread-1-consumer-27768816-ee87-498f-8896-191912282d4f
>  in group yyyyyyyyy_011 has failed, removing it from the group 
> (kafka.coordinator.group.GroupCoordinator)
>  
> [https://stackoverflow.com/questions/60404510/kafka-broker-issue-replica-manager-with-max-size#]
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
