Re: NameNode as a single point of failure

Dmitry Salychev Mon, 06 Jul 2015 23:11:19 -0700

Hi Konstantin,

I cannot reply to you about WANdisco's proprietary system right now; I'll
have to talk to our team. I'm afraid that we are not ready for a paid
solution, I guess. Generally, I'm looking for an entry point to
contribute to Hadoop or related projects (like HDFS, and so on). I think
that Apache Hadoop has to provide distributed and replicated metadata
nodes out-of-the-box for anyone who wants this feature. If so, are there
any thoughts to improve HDFS or completely replace it with Giraffa?


Giraffa looks great, and I have a few questions:

1. Is it possible to run Giraffa instead of HDFS on Apache Hadoop to
work with HBase?

2. How can I help you with Giraffa? Are there any "hot spots" in
development?

3. Is there anything like an "Architecture Guide" for Giraffa?

On 07/07/2015 03:31 AM, Konstantin Shvachko wrote:

Hey Dmitry,

You understood correctly that QJM with automatic failover is the current
state of the art for HDFS.
With it we still have a single active NameNode on the cluster at any given
time, which does not solve the performance bottleneck problem.
I think active-active HA would have been an improvement for HDFS, even
though the idea did not win the popularity vote in the community.

If you are looking for a commercial solution I can talk to you about
WANdisco proprietary system off this list.
If you are looking for a development opportunity I can suggest looking at
our Giraffa project, which is designed to have both data and metadata
distributed and replicated:
https://github.com/GiraffaFS/giraffa

Thanks,
--Konstantin


On Thu, Jul 2, 2015 at 8:25 AM, Dmitry Salychev <darkness....@gmail.com>
wrote:

Hi, Esteban.

Thanks for your reply. So the QJM automatic failover option is the
cutting-edge thing. Am I right?

I think that it's a good idea to have truly equal NNs doing their work in
parallel, as Konstantin Shvachko mentioned.

On 07/02/2015 04:49 PM, Esteban Gutierrez wrote:

Hi Dmitry,

Have you looked into the QJM automatic failover mode using the
ZKFailoverController?

https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Automatic_Failover
This is the most commonly used HA mode in production environments. Also,
there is some recent work that will be in Hadoop 3 that will allow having
more than one standby NN: https://issues.apache.org/jira/browse/HDFS-6440

cheers,
esteban.


--
Cloudera, Inc.


On Thu, Jul 2, 2015 at 7:42 AM, Dmitry Salychev <darkness....@gmail.com>
wrote:

  Sure, I did. It's actually not what I'm looking for. I don't want to
spend time bringing a dead NN back to life by hand. There should be a
solution for the NN-SPOF problem.


On 07/02/2015 04:36 PM, Vinayakumar B wrote:

  Hi..

Did you look at the HDFS Namenode high availability?

-Vinay
On Jul 2, 2015 11:50 AM, "Dmitry Salychev" <darkness....@gmail.com>
wrote:

   Hello, HDFS Developers.

I know that the NN is a single point of failure of an entire HDFS cluster.
If it fails, the cluster will be unavailable no matter how many DNs there
are. I know that there is an initiative
<http://www.wandisco.com/system/files/documentation/Meetup-ConsensusReplication.pdf>
which introduces ConsensusNode (as far as I can see, it looks like a
distributed NN) and related issues (HDFS-6469
<https://issues.apache.org/jira/browse/HDFS-6469>, HADOOP-10641
<https://issues.apache.org/jira/browse/HADOOP-10641> and HDFS-7007
<https://issues.apache.org/jira/browse/HDFS-7007>). So, I'd like to ask.

Has this NN-SPOF problem been solved? If it hasn't, can you show me an
entry point where I can help to solve it?

Thanks for your time.
