Mukul Kumar Singh created HDFS-13132:
----------------------------------------

             Summary: Ozone: Handle datanode failures in Storage Container 
Manager
                 Key: HDFS-13132
                 URL: https://issues.apache.org/jira/browse/HDFS-13132
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: ozone
    Affects Versions: HDFS-7240
            Reporter: Mukul Kumar Singh
            Assignee: Mukul Kumar Singh
             Fix For: HDFS-7240


Currently, SCM receives heartbeats from the datanodes in the cluster along with 
container reports. Apart from this, the Ratis leader also receives heartbeats 
from the nodes in its Raft ring. The Ratis heartbeats are sent at a much smaller 
interval (500 ms), whereas SCM heartbeats are sent every 30s; it is therefore 
considered safe to assume that a datanode is really lost when SCM misses 
heartbeats from it.
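A minimal sketch of the missed-heartbeat bookkeeping described above, assuming the 30s heartbeat and 1.5m stale intervals from this description; the class and method names (`HeartbeatTracker`, `onHeartbeat`, `isStale`) are hypothetical and not the actual SCM implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of missed-heartbeat detection in SCM.
// Names and intervals are illustrative, not the real SCM API.
public class HeartbeatTracker {
    static final long HEARTBEAT_INTERVAL_MS = 30_000; // SCM heartbeat period (30s)
    static final long STALE_NODE_INTERVAL_MS = 90_000; // stale after 1.5 minutes

    private final Map<String, Long> lastSeen = new HashMap<>();

    // Record a heartbeat from a datanode at the given timestamp.
    void onHeartbeat(String datanodeId, long nowMs) {
        lastSeen.put(datanodeId, nowMs);
    }

    // A node is stale once SCM has not seen a heartbeat for the stale interval.
    boolean isStale(String datanodeId, long nowMs) {
        Long seen = lastSeen.get(datanodeId);
        return seen != null && nowMs - seen > STALE_NODE_INTERVAL_MS;
    }

    public static void main(String[] args) {
        HeartbeatTracker tracker = new HeartbeatTracker();
        tracker.onHeartbeat("dn-1", 0);
        System.out.println(tracker.isStale("dn-1", 30_000));  // prints false: within interval
        System.out.println(tracker.isStale("dn-1", 120_000)); // prints true: ~3 missed heartbeats
    }
}
```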

Pipeline recovery will follow these steps:

1) As noted earlier, SCM will identify a dead DN via the missed heartbeats. The 
current stale interval is 1.5 minutes. Once a stale node has been identified, 
SCM will find the list of containers for the pipelines the datanode was part of.

2) SCM sends a close-container command to the remaining datanodes. Note that at 
this time the Ratis ring still has 2 nodes, so consistency can still be 
guaranteed by Ratis.

3) If another node dies before the close-container command succeeds, then Ratis 
can no longer guarantee consistency of the data being written or of the 
container close. The pipeline will then be marked as being in an inconsistent 
state.

4) Closed containers will be replicated via the closed-container replication 
protocol. If the dead datanode comes back, then as part of the re-register 
command, SCM will ask the datanode to format all of its open containers.

5) Return the healthy nodes to the free node pool for the next pipeline 
allocation.

6) Read operations on closed containers will succeed; however, a read operation 
on an open container in a single-node pipeline will be disallowed. It will only 
be allowed under a special flag, the ReadInconsistentData flag.
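The steps above can be sketched as a small pipeline state machine. This is a hypothetical illustration of the protocol as described in this issue; the class, enum, and method names (`PipelineRecovery`, `onNodeDead`, `canRead`, and so on) are assumptions, not the actual SCM code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the recovery steps as a pipeline state machine.
// States and transitions follow the numbered steps in the description;
// all names are illustrative, not the real SCM implementation.
public class PipelineRecovery {
    enum PipelineState { OPEN, CLOSING, INCONSISTENT, CLOSED }

    PipelineState state = PipelineState.OPEN;
    int liveNodes = 3;                 // assume a 3-node Ratis ring
    final List<String> log = new ArrayList<>();

    // Steps 1-3: a node is declared dead via missed heartbeats.
    void onNodeDead() {
        liveNodes--;
        if (state == PipelineState.OPEN && liveNodes == 2) {
            // Step 2: with 2 nodes left, Ratis is still consistent.
            state = PipelineState.CLOSING;
            log.add("send close-container to remaining nodes");
        } else if (state == PipelineState.CLOSING && liveNodes < 2) {
            // Step 3: a second failure before close completes loses consistency.
            state = PipelineState.INCONSISTENT;
            log.add("mark pipeline inconsistent");
        }
    }

    // Steps 4-5: close succeeded; replicate closed containers and free the nodes.
    void onCloseSucceeded() {
        if (state == PipelineState.CLOSING) {
            state = PipelineState.CLOSED;
            log.add("replicate closed containers; return nodes to free pool");
        }
    }

    // Step 6: reads of open containers on an inconsistent pipeline
    // are only allowed under the ReadInconsistentData flag.
    boolean canRead(boolean containerClosed, boolean readInconsistentData) {
        if (containerClosed) {
            return true;
        }
        return state != PipelineState.INCONSISTENT || readInconsistentData;
    }

    public static void main(String[] args) {
        PipelineRecovery p = new PipelineRecovery();
        p.onNodeDead();        // first failure: close-container sent
        p.onCloseSucceeded();  // close completes before a second failure
        System.out.println(p.log);
    }
}
```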


This jira will introduce the mechanism to identify and handle datanode failure. 
However, the handling of a) two simultaneous node failures, b) returning the 
nodes to the healthy state, c) allowing inconsistent data reads, and d) purging 
of open containers on a zombie node will be done as part of separate bugs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
