[ 
https://issues.apache.org/jira/browse/KAFKA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402740#comment-15402740
 ] 

Jun Rao commented on KAFKA-4009:
--------------------------------

[~aganesan], thanks for reporting this. In your test, did the producer use 
acks=all? Also, did N1 detect the corruption during log recovery when restarting 
the broker, or while appending to the log? 
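For reference, requiring acknowledgment from all in-sync replicas is controlled by the standard producer setting below (a minimal producer-config sketch, not taken from the reporter's setup):

```properties
# Producer configuration sketch: the write is acknowledged only after
# all in-sync replicas have appended it.
acks=all
```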

> Data corruption or EIO leads to data loss
> -----------------------------------------
>
>                 Key: KAFKA-4009
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4009
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.9.0.0
>            Reporter: Aishwarya Ganesan
>
> I have a 3 node Kafka cluster (N1, N2 and N3) with 
> log.flush.interval.messages=1, min.insync.replicas=3 and 
> unclean.leader.election.enable=false, and a single ZooKeeper node. My workload 
> inserts a few messages, and on completion of the workload the 
> recovery-point-offset-checkpoint file reflects the latest offset of the 
> committed messages. 
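The broker settings above correspond to the following server.properties fragment (a sketch restating the reported configuration):

```properties
# Flush (fsync) the log after every message.
log.flush.interval.messages=1
# A write is committed only when all three replicas are in sync
# (with a replication factor of 3).
min.insync.replicas=3
# Only replicas in the ISR may be elected leader.
unclean.leader.election.enable=false
```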
> I have a small testing tool that drives distributed applications into corner 
> cases by simulating error conditions like EIO, ENOSPC and EDQUOT that can be 
> encountered in modern file systems such as ext4. The tool also simulates 
> on-disk silent data corruption. 
> When I introduce silent data corruption on a node in the ISR (say N1), Kafka 
> detects the corruption using the record checksum and ignores the log from 
> that point onwards. Even though N1 has lost log entries, and its 
> recovery-point-offset-checkpoint file still indicates the latest offsets, N1 
> is allowed to become the leader because it is in the ISR. Also, the other 
> nodes N2 and N3 crash with the following log message:
> FATAL [ReplicaFetcherThread-0-1], Halting because log truncation is not 
> allowed for topic my-topic1, Current leader 1's latest offset 0 is less than 
> replica 3's latest offset 1 (kafka.server.ReplicaFetcherThread)
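The checksum-based detection described above can be illustrated with a minimal sketch (this models the idea only, not Kafka's actual record format): each record carries a CRC32 of its payload, and log recovery keeps the valid prefix up to the first mismatch.

```python
import zlib

def make_record(payload: bytes) -> tuple[int, bytes]:
    """Store a CRC32 of the payload alongside the payload itself."""
    return (zlib.crc32(payload), payload)

def recover(log: list[tuple[int, bytes]]) -> list[tuple[int, bytes]]:
    """Return the valid prefix of the log, dropping everything from the
    first corrupted record onwards."""
    valid = []
    for crc, payload in log:
        if zlib.crc32(payload) != crc:
            break  # corruption detected: ignore this entry and the rest
        valid.append((crc, payload))
    return valid

log = [make_record(b"m0"), make_record(b"m1"), make_record(b"m2")]
crc, _ = log[1]
log[1] = (crc, b"mX")  # simulate silent on-disk corruption of record 1
recovered = recover(log)
# Only record 0 survives; records 1 and 2 are lost even though
# record 2 itself was intact.
```

This is why N1's log can silently shrink: everything from the corrupted entry onwards is discarded, while the checkpoint file still claims the longer log.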
> The end result is that silent data corruption leads to data loss, because 
> querying the cluster returns only the messages before the corrupted entry. 
> Note that the cluster at this point consists only of N1. This situation could 
> have been avoided if N1, the node that had to discard the log entry, had not 
> been allowed to become the leader. This scenario would not happen under 
> majority-based leader election, as the other nodes (N2 or N3) would have 
> denied their votes to N1 because N1's log is incomplete compared to theirs.
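The majority-vote argument can be sketched as follows (a Raft-style vote check with hypothetical names, not Kafka code): a voter denies its vote to any candidate whose log end offset is behind its own, so N1 at offset 0 cannot gather a majority against N2 and N3 at offset 1.

```python
def grants_vote(voter_last_offset: int, candidate_last_offset: int) -> bool:
    """A voter grants its vote only if the candidate's log is at least
    as up to date as the voter's own log."""
    return candidate_last_offset >= voter_last_offset

# N1 lost the corrupted entry (latest offset 0);
# N2 and N3 still have it (latest offset 1).
offsets = {"N1": 0, "N2": 1, "N3": 1}
candidate = "N1"

# Count votes for N1 (a candidate votes for itself).
votes = sum(
    grants_vote(offsets[voter], offsets[candidate]) for voter in offsets
)
majority = len(offsets) // 2 + 1
# votes == 1 and majority == 2, so N1 cannot become leader.
```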
> If the same corruption occurs on a follower instead, the follower ignores the 
> corrupted entry, re-fetches the data from the leader, and there is no data 
> loss.
> Encountering an EIO thrown by the file system for a particular block has the 
> same consequence: data loss when querying the cluster, while the remaining 
> two nodes crash. An EIO on read can be thrown for a variety of reasons, 
> including a latent sector error affecting one or more sectors on disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
