Yonghwan Kim created HDFS-4945:
----------------------------------

             Summary: A Distributed and Cooperative NameNode Cluster for a 
Highly-Available HDFS
                 Key: HDFS-4945
                 URL: https://issues.apache.org/jira/browse/HDFS-4945
             Project: Hadoop HDFS
          Issue Type: New Feature
          Components: auto-failover
    Affects Versions: HA branch (HDFS-1623)
            Reporter: Yonghwan Kim


Hadoop has recently attracted considerable attention from engineers and
researchers as an effective framework for Big Data.
HDFS (Hadoop Distributed File System) can manage huge amounts of data while
providing high performance and reliability on commodity hardware alone.

However, HDFS requires a single master node, called the NameNode, to manage
the entire namespace (that is, all the i-nodes) of a file system. This creates
a SPOF (Single Point Of Failure), because the file system becomes inaccessible
when the NameNode fails. (HDFS-2064)

The single NameNode is also a performance bottleneck, since every access
request to the file system has to contact it. Hadoop 2.0 resolves the SPOF
problem by introducing manual failover between two NameNodes, Active and
Standby. However, the performance bottleneck remains, since in normal
operation all access requests still have to contact the Active NameNode.
This design may also lose the advantage of using commodity hardware, since
the two NameNodes have to share highly reliable, sophisticated storage.

We propose a new HDFS architecture that addresses all of the problems
mentioned above.
The proposed architecture has the following features and advantages.

1. Multiple NameNodes (not restricted to two) can be utilized to improve
availability.
The entire namespace of a file system is partitioned into several fragments,
and replicas of each fragment are dispersed among the NameNodes. When each
fragment has k replicas, the file system can tolerate up to floor(k/2 - 1)
faulty NameNodes.

2. Multiple NameNodes can be utilized to improve performance. The performance
bottleneck caused by a single NameNode can be avoided by designating
different NameNodes as the primaries (or entry points) of different
fragments.

3. The highly reliable storage shared by the NameNodes is no longer needed;
a message-based consistency mechanism among the NameNodes replaces it. The
architecture requires only commodity hardware.
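
The three features above can be illustrated with a minimal sketch. Nothing in
it comes from the proposal's actual design: the hash-based partitioning by
top-level directory, the round-robin replica placement, and the names
fragment_of, place_replicas and tolerated_faults are all assumptions made for
illustration. Only the bound floor(k/2 - 1) is taken from the description.

```python
import hashlib
import math


def fragment_of(path, num_fragments):
    """Map a path to a namespace fragment.

    Assumption: fragments are chosen by hashing the top-level directory,
    so all files under the same top-level directory share a fragment.
    """
    top = path.strip("/").split("/")[0]
    digest = hashlib.md5(top.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_fragments


def place_replicas(num_fragments, namenodes, k):
    """Assign k replicas of each fragment to NameNodes, round-robin.

    Assumption: the first NameNode in each list acts as the primary
    (entry point) for that fragment, which spreads primaries across nodes.
    """
    if k > len(namenodes):
        raise ValueError("k cannot exceed the number of NameNodes")
    n = len(namenodes)
    return {
        f: [namenodes[(f + i) % n] for i in range(k)]
        for f in range(num_fragments)
    }


def tolerated_faults(k):
    """Bound stated in the proposal: floor(k/2 - 1) faulty NameNodes.

    Clamping at zero for very small k is an assumption added here.
    """
    return max(0, math.floor(k / 2 - 1))


namenodes = ["nn0", "nn1", "nn2", "nn3", "nn4"]
placement = place_replicas(4, namenodes, k=3)
frag = fragment_of("/user/alice/data.txt", 4)
print("fragment:", frag, "primary:", placement[frag][0])
print("tolerated faults with k=4:", tolerated_faults(4))
```

With this placement each of the four fragments gets a different primary, so
requests for different fragments go to different NameNodes, and no shared
storage is involved in the sketch.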


