Todd Lipcon resolved HDFS-1227.
-------------------------------
    Resolution: Duplicate

Going to resolve this as invalid. If you can reproduce after HDFS-1186 is committed, or provide a unit test, we can reopen.

> UpdateBlock fails due to unmatched file length
> ----------------------------------------------
>
>                 Key: HDFS-1227
>                 URL: https://issues.apache.org/jira/browse/HDFS-1227
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20-append
>            Reporter: Thanh Do
>
> - Summary: client append is not atomic; hence it is possible that,
> when retrying during append, updateBlock throws an exception
> indicating an unmatched file length, causing the append to fail.
>
> - Setup:
> + # available datanodes = 3
> + # disks / datanode = 1
> + # failures = 2
> + failure type = bad disk
> + when/where failure happens = (see below)
> + This bug is non-deterministic. To reproduce it, add a sufficient sleep
> before out.write() in BlockReceiver.receivePacket() in dn1 and dn2, but not in dn3.
>
> - Details:
> Suppose the client appends 16 bytes to block X, which has length 16 bytes at dn1,
> dn2, and dn3.
> dn1 is the primary. The pipeline is dn3-dn2-dn1. recoverBlock succeeds.
> The client starts sending data to dn3, the first datanode in the pipeline.
> dn3 forwards the packet to the downstream datanodes and starts writing the
> data to its disk. Suppose there is an exception in dn3 while writing to disk.
> The client gets the exception and starts the recovery code by calling
> dn1.recoverBlock() again.
> dn1 in turn calls dn2.getMetaDataInfo() and dn1.getMetaDataInfo() to build
> the syncList.
> Suppose that at the time getMetaDataInfo() is called at both datanodes (dn1 and dn2),
> the previous packet (sent from dn3) has not reached the disk yet.
> Hence, the block info returned by getMetaDataInfo() reports a length of 16 bytes.
> After that, the packet reaches the disk, and the block file length becomes 32 bytes.
> Using the syncList (which contains block info with a length of 16 bytes), dn1 calls
> updateBlock at dn2 and dn1. These calls fail because the length in the block info
> passed to updateBlock (16 bytes) does not match the actual length on disk (32 bytes).
>
> Note that this bug is non-deterministic. It depends on the thread interleaving
> at the datanodes.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
> Haryadi Gunawi (hary...@eecs.berkeley.edu)
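Below is a minimal, self-contained Java sketch of the interleaving described above. None of the names in it are real HDFS classes or methods; getMetaDataInfo(), updateBlock(), and the "late packet" write are hypothetical stand-ins used only to show the ordering that produces the "unmatched file length" failure: the recovery path snapshots a 16-byte length, the delayed packet then grows the on-disk replica to 32 bytes, and the later length check rejects the stale value.

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical model of the race in this report (not HDFS code).
 * Ordering: snapshot length (16) -> late packet lands (32) -> length check fails.
 */
public class UpdateBlockRaceSketch {

    // On-disk length of the replica; starts at 16 bytes as in the report.
    private final AtomicLong onDiskLength = new AtomicLong(16);

    // Stand-in for getMetaDataInfo(): snapshot the current on-disk length.
    long getMetaDataInfo() {
        return onDiskLength.get();
    }

    // Stand-in for the delayed out.write() in BlockReceiver.receivePacket().
    void latePacketHitsDisk(long packetBytes) {
        onDiskLength.addAndGet(packetBytes);
    }

    // Stand-in for updateBlock(): reject the update if the caller's length is stale.
    void updateBlock(long expectedLength) {
        long actual = onDiskLength.get();
        if (actual != expectedLength) {
            throw new IllegalStateException("unmatched file length: expected "
                + expectedLength + " bytes but replica on disk has " + actual + " bytes");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        UpdateBlockRaceSketch dn = new UpdateBlockRaceSketch();
        CountDownLatch snapshotTaken = new CountDownLatch(1);

        // Recovery thread: builds its "syncList" from a 16-byte snapshot,
        // then calls updateBlock() after the late packet has landed.
        Thread recovery = new Thread(() -> {
            long syncListLength = dn.getMetaDataInfo();   // snapshots 16
            snapshotTaken.countDown();
            try {
                Thread.sleep(100);                        // give the delayed write time to land
            } catch (InterruptedException ignored) { }
            dn.updateBlock(syncListLength);               // throws: 16 != 32
        });

        // Writer thread: the delayed packet reaches the disk after the snapshot.
        Thread writer = new Thread(() -> {
            try {
                snapshotTaken.await();
            } catch (InterruptedException ignored) { }
            dn.latePacketHitsDisk(16);                    // replica grows to 32
        });

        recovery.start();
        writer.start();
        writer.join();
        recovery.join();
    }
}
{code}

Running main() ends with the recovery thread failing on the IllegalStateException above, which is the same shape of failure the report describes; in the real system the stale length comes from the syncList built via getMetaDataInfo() while the delayed packet from the pipeline has not yet reached the disk.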