Hui Fei created HDFS-16060:
------------------------------

             Summary: There is an inconsistency between replicas of datanodes when hardware is abnormal
                 Key: HDFS-16060
                 URL: https://issues.apache.org/jira/browse/HDFS-16060
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 3.4.0
            Reporter: Hui Fei


We found the following case in a production environment.
 * Replicas of the same block are stored on dn1 and dn2.
 * The replicas on dn1 and dn2 are different.
 * Verifying meta & data for the replica succeeds on dn1, and likewise on dn2.
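
The combination above (each replica passes its own check, yet the two differ) can be illustrated with a small sketch. This is not HDFS code: the class and method names are hypothetical, the 512-byte chunk size and CRC32C are assumed to match a typical HDFS checksum configuration, and the single flipped bit is only one possible way for data to go wrong before its checksum is computed. It needs Java 9+ for java.util.zip.CRC32C.

```java
import java.util.Arrays;
import java.util.zip.CRC32C;

public class ReplicaCheckSketch {
    // Assumption: one checksum per 512 bytes of data.
    static final int BYTES_PER_CHECKSUM = 512;

    /** Computes one CRC32C per chunk, standing in for a replica's meta file. */
    static int[] computeChunkedSums(byte[] data) {
        int chunks = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        int[] sums = new int[chunks];
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            CRC32C crc = new CRC32C();
            crc.update(data, off, len);
            sums[i] = (int) crc.getValue();
        }
        return sums;
    }

    /** Returns true when every chunk of data matches its stored checksum. */
    static boolean verify(byte[] data, int[] storedSums) {
        return Arrays.equals(computeChunkedSums(data), storedSums);
    }

    public static void main(String[] args) {
        byte[] dn1Data = new byte[1024];
        Arrays.fill(dn1Data, (byte) 'a');

        // Hypothetical fault: one bit differs on dn2 before its checksums are computed.
        byte[] dn2Data = dn1Data.clone();
        dn2Data[0] ^= 0x01;

        int[] dn1Meta = computeChunkedSums(dn1Data);
        int[] dn2Meta = computeChunkedSums(dn2Data);

        System.out.println("dn1 self-check: " + verify(dn1Data, dn1Meta));       // true
        System.out.println("dn2 self-check: " + verify(dn2Data, dn2Meta));       // true
        System.out.println("replicas equal: " + Arrays.equals(dn1Data, dn2Data)); // false
        System.out.println("dn2 data vs dn1 meta: " + verify(dn2Data, dn1Meta)); // false
    }
}
```

In this sketch a per-replica check cannot tell which copy is correct, because each meta was derived from the data it sits next to; only a cross-replica comparison exposes the difference.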

The user code is just a copyFromLocal.

At first, we found the following error log on a datanode:

{quote}

2021-05-27 04:54:20,471 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Checksum error in block BP-1453431581-x.x.x.x-1531302155027:blk_13892199285_12902824176 from /y.y.y.y:47960
org.apache.hadoop.fs.ChecksumException: Checksum error: DFSClient_NONMAPREDUCE_-1760730985_129 at 0 exp: 37939694 got: -1180138774
 at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
 at org.apache.hadoop.util.NativeCrc32.verifyChunkedSumsByteArray(NativeCrc32.java:69)
 at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:347)
 at org.apache.hadoop.util.DataChecksum.verifyChunkedSums(DataChecksum.java:294)
 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.verifyChunks(BlockReceiver.java:438)
 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:582)
 at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:885)
 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
 at java.lang.Thread.run(Thread.java:748)

{quote}

After this, a new pipeline is created, and then wrong data and meta are written to the disk file.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
