Li Junjun created HDFS-4318:
-------------------------------

             Summary: validateBlockMetadata reduces the success rate of block 
recovery
                 Key: HDFS-4318
                 URL: https://issues.apache.org/jira/browse/HDFS-4318
             Project: Hadoop HDFS
          Issue Type: Wish
          Components: datanode
    Affects Versions: 1.0.1
            Reporter: Li Junjun
            Priority: Minor


When recovering a block, we see logs like:
"java.io.IOException: Block blk_3272028001529756059_11883841 length is 20480 
does not match block file length 21376"

When the datanode performs startBlockRecovery, it calls validateBlockMetadata 
(in FSDataset.startBlockRecovery), which checks that the block file's length 
exactly matches the block's numBytes.
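
A minimal sketch of that strict check, with hypothetical names (this is an 
illustration of the behavior described above, not the actual FSDataset 
source):

    import java.io.IOException;

    // Illustrative sketch of the strict length check; names are
    // hypothetical, not the real FSDataset API.
    class StrictLengthCheck {
      static void validate(String blockName, long numBytes,
                           long blockFileLength) throws IOException {
        // Strict equality: recovery fails even when the on-disk file is
        // longer than the block's recorded numBytes.
        if (blockFileLength != numBytes) {
          throw new IOException("Block " + blockName + " length is "
              + numBytes + " does not match block file length "
              + blockFileLength);
        }
      }
    }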

So let us see how the block's numBytes is updated in the datanode. When 
writing a block in BlockReceiver.receivePacket, the order is 
write -> flush -> setVisibleLength. That means it is normal and reasonable 
for the file length to be greater than the block's numBytes when write or 
flush throws an exception: the bytes reached disk, but setVisibleLength was 
never called. In startBlockRecovery (and possibly other situations, to be 
checked) we only need to guarantee that file length < the block's numBytes 
never happens. A relaxed check along those lines is sketched below.
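
A hedged sketch of the suggested relaxed check (again illustrative only, not 
a patch against FSDataset):

    import java.io.IOException;

    // Illustrative sketch of the relaxed check suggested in this issue;
    // names are hypothetical, not the actual FSDataset API.
    class RelaxedLengthCheck {
      static void validate(String blockName, long numBytes,
                           long blockFileLength) throws IOException {
        // Only a file *shorter* than numBytes indicates real trouble.
        if (blockFileLength < numBytes) {
          throw new IOException("Block " + blockName + " file length "
              + blockFileLength + " is shorter than numBytes " + numBytes);
        }
        // blockFileLength >= numBytes is acceptable: write/flush may have
        // succeeded while setVisibleLength was skipped by an exception.
      }
    }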


I suggest changing validateBlockMetadata accordingly, because the strict 
check reduces the success rate of block recovery.

With a pipeline a -> b -> c, if a hits a network error and b fails in 
write -> flush, we can only count on c! If c's replica is then rejected just 
because its block file is longer than numBytes, the recovery fails outright.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
