On Mon, 2008-12-29 at 03:35 -0800, Sandeep Dhawan wrote:
> Hi,
>
> I have a setup of 2-node Hadoop cluster running on Windows using cygwin.
> When I open up the web gui to view the number of Live Nodes, it shows 2.
> But when I kill the slave node and refreshes the gui, it still shows the
> number of Live Nodes as 2.
>
> Its only after some 20-30 mins, that the master node is able to detect the
> failure which is then reflected in the gui. It then shows up :
>
> Live Node : 1
> Dead Node : 1
>
> Also, after killing the slave datanode if I try to copy a file from the
> local file system, it fails.
>
> 1. Is there a way by which we can configure the time interval after which
> master node can declare a datanode as dead.
Ans: I think this can be done on the basis of the heartbeat. If the master node does not receive a heartbeat from a datanode within the expected time interval, it treats that datanode as a problematic node. See the parameter "dfs.heartbeat.interval" in hadoop-default.xml.

> 2. Why does the file transfer fail when one of the slave node is dead and
> masternode is alive.
>
Ans: If one of the slave nodes is dead, the data should still be stored as long as another slave node is alive. It would be better if you pasted the error message you got while copying the data.
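
On question 1, a rough sketch of how the dead-node timeout relates to the heartbeat settings. As far as I know, "dfs.heartbeat.interval" only sets how often a datanode sends its heartbeat (in seconds, default 3). The sketch assumes the namenode declares a datanode dead after 2 * recheck interval + 10 * heartbeat interval, with the recheck interval read from a property named "heartbeat.recheck.interval" (in milliseconds, default 5 minutes). Both the formula and that property name are assumptions taken from the namenode source of releases from around this time, so please check them against your version. The function name below is hypothetical and only illustrates the arithmetic.

```python
def dead_node_timeout_ms(heartbeat_interval_s=3, recheck_interval_ms=5 * 60 * 1000):
    """Estimate how long the namenode waits before marking a datanode dead.

    Assumed formula (verify against your Hadoop version):
        2 * recheck interval + 10 * heartbeat interval
    heartbeat_interval_s mirrors dfs.heartbeat.interval (seconds);
    recheck_interval_ms mirrors the assumed heartbeat.recheck.interval (ms).
    """
    return 2 * recheck_interval_ms + 10 * heartbeat_interval_s * 1000


if __name__ == "__main__":
    # Timeout with the assumed default settings, in minutes.
    print(dead_node_timeout_ms() / 60000.0)
    # Timeout with the recheck interval lowered to 15 seconds, in minutes.
    print(dead_node_timeout_ms(recheck_interval_ms=15000) / 60000.0)
```

With the assumed defaults this comes to 630000 ms (10.5 minutes), and lowering the recheck interval to 15000 ms would bring it down to 60000 ms (1 minute). If the formula holds for your release, the recheck interval is the setting that dominates the delay, not the heartbeat interval itself.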
