Hi Sujee,
   HDFS today does not pay much attention to data center level reliability 
(the network topology is supposed to extend to a data center layer, but that 
layer is never honored by the replica placement, balancer, or task scheduling 
policies), and performance is a concern when crossing data centers (assuming 
cross-DC bandwidth is lower than within a data center). However, I think we 
should eventually deliver a solution that enables data center level disaster 
recovery, even if performance is degraded. My experience from several years of 
delivering enterprise software is that it is best to let the customer make the 
trade-off between performance and reliability; the engineering effort is to 
provide the options.
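For reference, the existing script-based topology mapping can already express a 
data center level in the paths it returns; a minimal sketch (hypothetical host 
names, assuming the standard topology script hook configured on the NameNode) 
could look like:

    #!/usr/bin/env python
    # Topology script: Hadoop invokes it with one or more host names/IPs
    # as arguments and expects one network path per argument on stdout.
    import sys

    # Hypothetical mapping of hosts to /datacenter/rack paths.
    HOST_MAP = {
        "node1.example.com": "/dc1/rack1",
        "node2.example.com": "/dc1/rack2",
        "node3.example.com": "/dc2/rack1",
    }

    for host in sys.argv[1:]:
        # Unknown hosts fall back to a default location.
        print(HOST_MAP.get(host, "/default-dc/default-rack"))

The data center component of such a path is carried in the topology, but as 
noted above it is not honored by replica placement, the balancer, or scheduling 
today.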
BTW, HDFS HA protects key nodes from being single points of failure, but it 
does not handle a whole data center shutdown.

Thanks,

Junping

----- Original Message -----
From: "Sujee Maniyam" <su...@sujee.net>
To: "hdfs-dev" <hdfs-dev@hadoop.apache.org>
Sent: Tuesday, September 11, 2012 7:29:39 AM
Subject: data center aware hadoop?

Hi devs,
now that HDFS HA is a reality, how about HDFS spanning multiple
data centers? Are there any discussions or work going on in this area?

It could be a single cluster spanning multiple data centers, or a
'standby cluster' in another data center.

curious, and thanks for your time!

regards
Sujee Maniyam
http://sujee.net
