[ https://issues.apache.org/jira/browse/HDFS-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HDFS-1384.
------------------------------------
    Resolution: Incomplete

Closing as stale.

> NameNode should give client the first node in the pipeline from different
> rack other than that of excludedNodes list in the same rack.
> -------------------------------------------------------------------------
>
>                 Key: HDFS-1384
>                 URL: https://issues.apache.org/jira/browse/HDFS-1384
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.20-append
>            Reporter: Thanh Do
>
> We saw a case where the NN keeps giving the client nodes from the same rack,
> which causes an exception on the client when it tries to set up the pipeline.
> The client retries 5 times and fails.
>
> Here are more details. Suppose we have 2 racks:
> - Rack 0: from dn1 to dn7
> - Rack 1: from dn8 to dn14
> The client asks for 3 datanodes and the NN replies with dn1, dn8 and dn9, for example.
> Because of a network partition, the client cannot see any node in Rack 0.
> Hence, the client adds dn1 to the excludedNodes list and asks the NN again.
> Interestingly, the NN picks a different node (different from those in
> excludedNodes) in Rack 0 and gives it back to the client, and so on.
> The client keeps retrying, and after 5 retries the write fails.
>
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
> Haryadi Gunawi (hary...@eecs.berkeley.edu)

--
This message was sent by Atlassian JIRA
(v6.2#6252)
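
For readers of the archive, the retry loop described in the report can be modelled in a few lines. This is only a toy sketch, not Hadoop code: the names RACK_OF, choose_first_node and client_write are hypothetical, the NameNode is assumed to pick nodes in a fixed order (the real placement policy does not), and "retries 5 times" is read as 5 attempts in total. It contrasts excluding single nodes with also avoiding the racks of excluded nodes, which is what the issue title asks for.

```python
# Toy model of the HDFS-1384 scenario. None of these names are Hadoop APIs.
RACK_OF = {f"dn{i}": ("rack0" if i <= 7 else "rack1") for i in range(1, 15)}
UNREACHABLE_RACK = "rack0"  # the network partition hides rack 0 from the client
MAX_ATTEMPTS = 5            # assumption: "retries 5 times" means 5 attempts total


def choose_first_node(excluded, avoid_excluded_racks):
    """Pick the first pipeline node, skipping nodes in `excluded`.

    With avoid_excluded_racks=True, also skip every node that shares a rack
    with an excluded node. Nodes are scanned in numeric order (an assumption
    made to keep the example deterministic).
    """
    bad_racks = {RACK_OF[n] for n in excluded} if avoid_excluded_racks else set()
    for node in sorted(RACK_OF, key=lambda n: int(n[2:])):
        if node not in excluded and RACK_OF[node] not in bad_racks:
            return node
    return None  # no eligible node left


def client_write(avoid_excluded_racks):
    """Return (first_node, attempts) on success or (None, attempts) on failure."""
    excluded = set()
    for attempt in range(1, MAX_ATTEMPTS + 1):
        first = choose_first_node(excluded, avoid_excluded_racks)
        if first is None:
            return None, attempt
        if RACK_OF[first] != UNREACHABLE_RACK:
            return first, attempt  # pipeline setup succeeds
        excluded.add(first)        # unreachable: exclude it and ask again
    return None, MAX_ATTEMPTS


print(client_write(avoid_excluded_racks=False))  # reported behaviour
print(client_write(avoid_excluded_racks=True))   # behaviour the title asks for
```

Under these assumptions, the first call keeps receiving dn1 through dn5 from Rack 0 and gives up, while the second call moves to Rack 1 as soon as dn1 is excluded.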