Hello Hadoop Users list:
We are running Hadoop version 0.18.2. My team lead has asked me
to investigate a question about Hadoop's handling of offline DataNodes.
Specifically, we would like to know how long a node can be offline before
it is completely rebuilt once it has been re-added to the cluster.
From what I've been able to determine from the documentation, the
NameNode simply begins scheduling re-replication of the missing blocks
on the remaining cluster members. If the offline node comes back online and
reports all its blocks as uncorrupted, the NameNode just cleans up
the "extra" replicas.
In other words, there is no explicit threshold based on the
length of the outage - how much re-replication and cleanup the cluster
performs simply depends on how long the node was offline.
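To make my understanding concrete, here is a toy Python model of the
behavior I described above. It is not Hadoop code: the node names, block
names, replication target, and the one-copy-per-tick rate are all values I
made up, and it ignores however long the NameNode waits before it treats a
node as dead. It only shows the amount of copying and cleanup growing with
the length of the outage, with no cutoff.

```python
# Toy model of my understanding only; NOT Hadoop code.
# All names and numbers below are hypothetical.
REPLICATION = 3  # assumed target number of replicas per block


def simulate(outage_ticks, copies_per_tick=1):
    """Return (replicas_copied, replicas_removed) for one outage of dn1."""
    nodes = {"dn1", "dn2", "dn3", "dn4"}
    # Every block starts with replicas on dn1..dn3; dn4 holds nothing.
    blocks = {b: {"dn1", "dn2", "dn3"} for b in ("blk_a", "blk_b", "blk_c")}

    # dn1 goes offline: its replicas no longer count.
    returning = [b for b, holders in blocks.items() if "dn1" in holders]
    for holders in blocks.values():
        holders.discard("dn1")
    live = nodes - {"dn1"}

    # While dn1 is away, under-replicated blocks are copied at a limited rate.
    copied = 0
    for _ in range(outage_ticks):
        budget = copies_per_tick
        for b in sorted(blocks):
            spare = sorted(live - blocks[b])
            if budget and len(blocks[b]) < REPLICATION and spare:
                blocks[b].add(spare[0])
                copied += 1
                budget -= 1

    # dn1 comes back and reports all of its blocks as intact.
    for b in returning:
        blocks[b].add("dn1")

    # Any block now above the target loses its "extra" replicas.
    removed = 0
    for b in sorted(blocks):
        while len(blocks[b]) > REPLICATION:
            blocks[b].remove(max(blocks[b]))  # arbitrary pick, not real policy
            removed += 1
    return copied, removed


if __name__ == "__main__":
    for ticks in (0, 2, 10):
        print(ticks, simulate(ticks))
```

In this model a very short outage causes no copying at all, a longer one
causes some, and a long enough one re-replicates everything the node held,
which is the behavior I would like someone to confirm or correct.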
Anyone care to shed some light on this?
Thanks!
Regards,
Joseph Hammerman