Todd Lipcon created HDFS-3875:
---------------------------------

             Summary: Issue handling checksum errors in write pipeline
                 Key: HDFS-3875
                 URL: https://issues.apache.org/jira/browse/HDFS-3875
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node, hdfs client
    Affects Versions: 2.2.0-alpha
            Reporter: Todd Lipcon


We saw this issue with one block in a large test cluster. The client was 
storing the data with replication level 2, and we saw the following:
- The second node in the pipeline detected a checksum error on the data it 
received from the first node. We don't know whether the client sent a bad 
checksum or the data was corrupted between node 1 and node 2 in the pipeline.
- This caused the second node to be kicked out of the pipeline, since it threw 
an exception. The pipeline started up again with only one replica (the first 
node in the pipeline).
- This replica was later determined to be corrupt by the block scanner, and it 
is unrecoverable since it is the only replica.
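
The sequence above can be illustrated with a small toy model. This is only a 
sketch under stated assumptions, not HDFS code: the function names 
(run_pipeline, verify) are hypothetical, CRC32 from Python's zlib stands in for 
whatever checksum the cluster uses, and the model assumes that node 1 stores 
and forwards the data without verifying it while node 2 verifies what it 
receives. It models only the case where the bytes held by node 1 are already 
bad, which matches the block scanner's later finding.

```python
import zlib


def verify(data: bytes, expected_crc: int) -> bool:
    """Return True if the CRC32 of data matches the checksum sent by the client."""
    return zlib.crc32(data) == expected_crc


def run_pipeline(payload: bytes, corrupt_at_node1: bool):
    """Toy two-node write pipeline (hypothetical model, not the HDFS implementation).

    Returns (surviving_nodes, node1_replica_is_valid).
    """
    # The client computes the checksum over the data it intends to write.
    expected_crc = zlib.crc32(payload)

    # Node 1 receives the data. Optionally flip one bit to simulate corruption
    # that happens at or before node 1 (payload must be non-empty).
    node1_data = payload
    if corrupt_at_node1:
        node1_data = bytes([payload[0] ^ 0x01]) + payload[1:]

    # Assumption: node 1 stores and forwards without verifying.
    # Node 2 verifies what it received from node 1.
    if verify(node1_data, expected_crc):
        survivors = ["node1", "node2"]
    else:
        # The node that detected the error is the one that gets dropped,
        # so the pipeline continues with node 1 only.
        survivors = ["node1"]

    # A later scan of node 1's replica (the block scanner's role in the report).
    node1_replica_is_valid = verify(node1_data, expected_crc)
    return survivors, node1_replica_is_valid


if __name__ == "__main__":
    print(run_pipeline(b"block data", corrupt_at_node1=True))
    print(run_pipeline(b"block data", corrupt_at_node1=False))
```

In the corrupted case the model ends with node 1 as the only survivor while its 
replica fails verification, which is the outcome described in the report: the 
node that noticed the problem is removed, and the only remaining copy is the 
bad one.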

