+1
On Thu, Jan 21, 2010 at 2:58 PM, Tsz Wo (Nicholas), Sze <s29752-hadoop...@yahoo.com> wrote:
> +1
> Nicholas Sze
>
> ----- Original Message ----
>> From: Stack <st...@duboce.net>
>> To: hdfs-dev@hadoop.apache.org
>> Cc: HBase Dev List <hbase-...@hadoop.apache.org>
>> Sent: Thu, January 21, 2010 2:36:25 PM
>> Subject: [VOTE -- Round 2] Commit hdfs-630 to 0.21?
>>
>> I'd like to propose a new vote on having hdfs-630 committed to 0.21.
>> The first vote on this topic, initiated 12/14/2009, was sunk by
>> improvements suggested by Tsz Wo (Nicholas), Sze. Those suggestions
>> have since been folded into a new version of the hdfs-630 patch. It's
>> this new version of the patch -- 0001-Fix-HDFS-630-0.21-svn-2.patch --
>> that I'd like us to vote on. For background on why we -- the hbase
>> community -- think hdfs-630 is important, see the notes below from the
>> original call-to-vote.
>>
>> I'm obviously +1.
>>
>> Thanks for your consideration,
>> St.Ack
>>
>> P.S. Regarding TRUNK: after chatting with Nicholas, TRUNK was cleaned
>> of the previous versions of hdfs-630 and we'll likely apply
>> 0001-Fix-HDFS-630-trunk-svn-4.patch, a version of
>> 0001-Fix-HDFS-630-0.21-svn-2.patch that works for TRUNK and includes
>> Nicholas's suggestions.
>>
>>
>> On Mon, Dec 14, 2009 at 9:56 PM, stack wrote:
>> > I'd like to propose a vote on having hdfs-630 committed to 0.21.
>> > (It's already been committed to TRUNK.)
>> >
>> > hdfs-630 has the dfsclient pass the namenode the names of datanodes
>> > it has determined dead because it got a failed connection when it
>> > tried to contact them, etc. This is useful in the interval between a
>> > datanode dying and the namenode timing out its lease. Without this
>> > fix, the namenode can often give out the dead datanode as a host for
>> > a block. If the cluster is small -- fewer than 5 or 6 nodes -- then
>> > it's very likely the namenode will give out the dead datanode as a
>> > block host.
>> >
>> > Small clusters are common in hbase, especially when folks are
>> > starting out or evaluating hbase. They'll start with three or four
>> > nodes carrying both datanodes+hbase regionservers. They'll experiment
>> > with killing one of the slaves -- datanode and regionserver -- and
>> > watch what happens. What follows is a struggling dfsclient trying to
>> > create replicas when one of the datanodes passed to it by the
>> > namenode is dead. The DFSClient will fail and then go back to the
>> > namenode again, etc. (See
>> > https://issues.apache.org/jira/browse/HBASE-1876 for a more detailed
>> > blow-by-blow.) HBase operation will be held up during this time, and
>> > eventually a regionserver will shut itself down to protect against
>> > data loss if it can't successfully write to HDFS.
>> >
>> > Thanks all,
>> > St.Ack
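For anyone who wants the mechanics concrete, below is a minimal Java sketch of
the client-side loop that hdfs-630 enables: the client accumulates datanodes it
failed to reach and reports them back to the namenode so they are excluded from
the next block allocation. This is an illustrative sketch only; NameNodeStub,
tryConnect, and allocateBlockAvoidingDeadNodes are hypothetical stand-ins, not
the real DFSClient/ClientProtocol API (see the actual patch for that).

    import java.util.ArrayList;
    import java.util.List;

    public class ExcludeDeadDatanodesSketch {

        // Hypothetical stand-in for the namenode RPC interface.
        interface NameNodeStub {
            // Allocate block targets, skipping datanodes the client found dead.
            String[] addBlock(String path, List<String> excludedNodes);
        }

        // Placeholder for the real data-transfer connection attempt.
        static boolean tryConnect(String datanode) {
            return !datanode.startsWith("dead");
        }

        static String[] allocateBlockAvoidingDeadNodes(NameNodeStub namenode,
                                                       String path,
                                                       int maxRetries) {
            List<String> excluded = new ArrayList<>(); // datanodes found dead so far
            for (int attempt = 0; attempt < maxRetries; attempt++) {
                String[] targets = namenode.addBlock(path, excluded);
                boolean allReachable = true;
                for (String dn : targets) {
                    if (!tryConnect(dn)) {
                        // Without hdfs-630 the namenode could hand this same
                        // dead node back again and again until its timeout
                        // expires; with the fix it is excluded on retry.
                        excluded.add(dn);
                        allReachable = false;
                    }
                }
                if (allReachable) {
                    return targets; // a usable write pipeline
                }
            }
            throw new IllegalStateException("could not build a pipeline for " + path);
        }
    }

On a 3-4 node cluster with one node freshly killed, the exclusion list is what
lets the retry succeed: the namenode has so few hosts to choose from that,
without it, the dead node keeps reappearing in the returned targets.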