Re: [Proposal] Pluggable Namespace
Milind,

Am I missing something here? This was supposed to be a discussion, and I am hoping that's why you started the thread. I don't see any conspiracy theory being considered or talked about anywhere. Vinod asked some questions; if you can't or don't want to respond, I suggest you skip emailing or ignore the thread rather than making false assumptions and accusations. I hope the intent here is to contribute code, and that it stays that way.

thanks
mahadev

On Oct 6, 2013, at 5:58 PM, Milind Bhandarkar wrote:

> Vinod,
>
> I have received a few emails about concerns that this effort somehow conflicts with federated namenodes. Most of these emails are from folks who are directly or remotely associated with Hortonworks.
>
> Three weeks ago, I sent emails about this effort to a few Hadoop committers who are primarily focused on HDFS and whose email addresses I had. While two of those three responded to me, the third person, associated with Hortonworks, did not.
>
> Is Hortonworks concerned that this proposal conflicts with their development on the federated namenode? I have explicitly stated that it does not, and that it is orthogonal to federation. But I would like to know if there are false assumptions being made about the intent of this development, and I would like to quash any conspiracy theories right now, before they assume a life of their own.
>
> Thanks,
>
> Milind
>
> -----Original Message-----
> From: Vinod Kumar Vavilapalli [mailto:vino...@hortonworks.com]
> Sent: Sunday, October 06, 2013 12:21 PM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: [Proposal] Pluggable Namespace
>
> In order to make federation happen, the block pool management was already separated. Isn't that the same as this effort?
>
> Thanks,
> +Vinod
>
> On Oct 6, 2013, at 9:35 AM, Milind Bhandarkar wrote:
>
>> Federation is orthogonal to Pluggable Namespaces. That is, one can use Federation if needed, even while a distributed K-V store is used on the backend.
>>
>> The limitations of the federated namenode for scaling the namespace are well documented in several places, including the Giraffa presentation.
>>
>> HBase is only one of several possible namespace implementations. Thus, if an HBase-based namespace implementation does not fit your performance needs, you have the choice of using something else.
>>
>> - milind
>>
>> -----Original Message-----
>> From: Azuryy Yu [mailto:azury...@gmail.com]
>> Sent: Saturday, October 05, 2013 6:41 PM
>> To: hdfs-dev@hadoop.apache.org
>> Subject: Re: [Proposal] Pluggable Namespace
>>
>> Hi Milind,
>>
>> HDFS federation can solve the NN bottleneck and memory limit problem.
>>
>> The AbstractNameSystem design sounds good, but distributed meta storage using HBase could bring performance degradation.
>>
>> On Oct 4, 2013 3:18 AM, "Milind Bhandarkar" wrote:
>>
>>> Hi All,
>>>
>>> Exec Summary: For the last couple of months, we at Pivotal, along with a couple of folks in the community, have been working on making the namespace implementation in the namenode pluggable. We have demonstrated that it can be done without major surgery on the namenode, and that it does not have a noticeable performance impact. We would like to contribute it back to Apache if there is sufficient interest. Please let us know if you are interested, and we will create a Jira and update the patch for the in-progress work.
>>>
>>> Rationale:
>>>
>>> In a Hadoop cluster, the namenode roughly has the following main responsibilities:
>>> - Catering to RPC calls from clients.
>>> - Managing the HDFS namespace tree.
>>> - Managing block reports, heartbeats, and other communication from data nodes.
>>>
>>> For Hadoop clusters with a large number of files and a large number of nodes, the namenode becomes a bottleneck, mainly for the following reasons:
>>> - All the information is kept in the namenode's main memory.
>>> - The namenode has to cater to all the requests from clients and data nodes.
>>> - It also has to perform some operations for the backup and checkpointing nodes.
>>>
>>> A possible solution is to add more main memory, but there are certain issues with this approach:
>>> - The namenode being a Java application, garbage collection cycles execute periodically to reclaim unreferenced heap space. When the heap space grows very large, regardless of the GC policy chosen, the application stalls during GC activity. This creates a bunch of issues, since DNs and clients may perceive this stall as an NN crash.
>>> - There will always be a practical limit on how much physical memory a single machine can accommodate.
>>>
>>> Proposed Solution:
>>>
>>> Out of the three responsibilities listed above, we can refactor namespace management out of the namenode codebase in such a way that there is provision to implement and plug in name systems other than the existing in-process, memory-based name system. In particular, a name system backed by a distributed key-value store would significantly reduce namenode memory pressure.
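To make the proposal easier to picture, here is a minimal sketch of what the thread calls the AbstractNameSystem design. The names below (NameSystem, InMemoryNameSystem, NameNodeSkeleton) are illustrative assumptions, not the actual Pivotal patch or real HDFS classes: the only point is that the namenode would program against a namespace abstraction, so the existing in-heap tree and a key-value-store-backed namespace become interchangeable implementations.

// Illustrative sketch only -- not the actual patch or the real FSNamesystem API.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface NameSystem {
    boolean mkdir(String path);           // create a directory entry
    boolean createFile(String path);      // create a file entry
    List<String> list(String pathPrefix); // list entries under a path prefix
}

// Status quo: all namespace metadata lives in the namenode's heap.
class InMemoryNameSystem implements NameSystem {
    private final Map<String, Boolean> entries = new ConcurrentHashMap<>(); // path -> isDirectory

    public boolean mkdir(String path)      { return entries.putIfAbsent(path, true) == null; }
    public boolean createFile(String path) { return entries.putIfAbsent(path, false) == null; }
    public List<String> list(String pathPrefix) {
        List<String> result = new ArrayList<>();
        for (String p : entries.keySet()) {
            if (p.startsWith(pathPrefix)) result.add(p);
        }
        return result;
    }
}

// The namenode-side code depends only on the abstraction; an HBase- or other
// K-V-backed NameSystem could be plugged in without touching RPC handling or
// block management, which is why the proposal describes itself as orthogonal
// to federation.
class NameNodeSkeleton {
    private final NameSystem namespace;
    NameNodeSkeleton(NameSystem namespace) { this.namespace = namespace; }
    boolean handleMkdir(String path) { return namespace.mkdir(path); }
}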
Re: New subproject logos
+1

mahadev

On 6/26/09 12:13 AM, "Chris Douglas" wrote:

> +1
>
> On Thu, Jun 25, 2009 at 11:42 PM, Nigel Daley wrote:
>> Here are some logos for the new subprojects http://www.flickr.com/photos/88199...@n00/3661433605/
>>
>> Please vote +1 if you like 'em and -1 if you don't.
>>
>> Cheers,
>> Nige
Re: [VOTE -- Round 2] Commit hdfs-630 to 0.21?
+1

mahadev

On 1/21/10 2:46 PM, "Ryan Rawson" wrote:

> Scaling _down_ is a continual problem for us, and this is one of the prime factors. It puts a bad taste in the mouths of new people, who then run away from HBase and HDFS since they seem "unreliable and unstable". It is perfectly within scope to support a cluster of about 5-6 machines with an aggregate capacity of 24TB (which is a fair amount), and people expect to start small, prove the concept/technology, and then move up.
>
> I am also +1
>
> On Thu, Jan 21, 2010 at 2:36 PM, Stack wrote:
>> I'd like to propose a new vote on having hdfs-630 committed to 0.21. The first vote on this topic, initiated 12/14/2009, was sunk by improvements suggested by Tsz Wo (Nicholas) Sze. Those suggestions have since been folded into a new version of the hdfs-630 patch. It is this new version of the patch -- 0001-Fix-HDFS-630-0.21-svn-2.patch -- that I'd like us to vote on. For background on why we -- the hbase community -- think hdfs-630 is important, see the notes below from the original call to vote.
>>
>> I'm obviously +1.
>>
>> Thanks for your consideration,
>> St.Ack
>>
>> P.S. Regarding TRUNK: after chatting with Nicholas, TRUNK was cleaned of the previous versions of hdfs-630, and we'll likely apply 0001-Fix-HDFS-630-trunk-svn-4.patch, a version of 0001-Fix-HDFS-630-0.21-svn-2.patch that works for TRUNK and includes Nicholas's suggestions.
>>
>> On Mon, Dec 14, 2009 at 9:56 PM, stack wrote:
>>> I'd like to propose a vote on having hdfs-630 committed to 0.21 (it has already been committed to TRUNK).
>>>
>>> hdfs-630 has the dfsclient pass the namenode the names of datanodes it has determined to be dead because it got a failed connection when it tried to contact them, etc. This is useful in the interval between a datanode dying and the namenode timing out its lease. Without this fix, the namenode can often give out the dead datanode as a host for a block. If the cluster is small -- fewer than 5 or 6 nodes -- then it's very likely the namenode will give out the dead datanode as a block host.
>>>
>>> Small clusters are common in hbase, especially when folks are starting out or evaluating hbase. They'll start with three or four nodes carrying both datanodes and hbase regionservers. They'll experiment with killing one of the slaves -- datanode and regionserver -- and watch what happens. What follows is a struggling dfsclient trying to create replicas when one of the datanodes passed to us by the namenode is dead. The DFSClient will fail and then go back to the namenode again, etc. (See https://issues.apache.org/jira/browse/HBASE-1876 for a more detailed blow-by-blow.) HBase operation will be held up during this time, and eventually a regionserver will shut itself down to protect against data loss if we can't successfully write to HDFS.
>>>
>>> Thanks all,
>>> St.Ack
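For anyone who has not read the patch, here is a hedged sketch of the mechanism being voted on. The class and method names below are hypothetical simplifications, not the real DFSClient or ClientProtocol code: the client remembers datanodes it failed to connect to and passes that exclude list to the namenode when asking for new block targets, so the namenode stops proposing a node the client already knows is dead, which is the situation that bites 3-6 node clusters in the interval before the namenode itself times the node out.

// Hypothetical simplification of the hdfs-630 idea; not the real DFSClient/ClientProtocol code.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BlockWriterSketch {
    // Stand-in for the namenode RPC; assumed to accept an exclude list, as hdfs-630 proposes.
    interface NamenodeStub {
        List<String> allocateBlock(String path, Set<String> excludedDatanodes);
    }

    private final NamenodeStub namenode;
    private final Set<String> deadNodes = new HashSet<>(); // datanodes we failed to reach

    BlockWriterSketch(NamenodeStub namenode) { this.namenode = namenode; }

    // Ask for block targets, retrying with an ever-growing exclude list.
    List<String> chooseTargets(String path, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            List<String> targets = namenode.allocateBlock(path, deadNodes);
            List<String> reachable = new ArrayList<>();
            for (String dn : targets) {
                if (canConnect(dn)) {
                    reachable.add(dn);
                } else {
                    deadNodes.add(dn); // remember it and tell the namenode next time
                }
            }
            if (reachable.size() == targets.size()) {
                return reachable; // every proposed datanode answered
            }
        }
        throw new IllegalStateException("could not assemble a reachable pipeline for " + path);
    }

    private boolean canConnect(String datanode) {
        return true; // placeholder: a real client would try to open the write pipeline here
    }
}

On a big cluster a stale choice is quickly papered over because there are many other datanodes to pick from; on a 3-6 node cluster the exclude list is what stops the namenode from handing back the same dead node over and over, which is the behaviour described in HBASE-1876.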
Failing trunk builds for HDFS.
Hi folks,

Can anyone take a look at the hdfs builds? They seem to be failing: https://builds.apache.org/job/Hadoop-Hdfs-trunk/

thanks
mahadev
Re: Failing trunk builds for HDFS.
>>>> (1667kB)
>>>> [ivy:resolve] .. (0kB)
>>>> [ivy:resolve] [SUCCESSFUL ] org.apache.hadoop#hadoop-common;0.23.0-SNAPSHOT!hadoop-common.jar (1549ms)
>>>>
>>>> ivy-retrieve-common:
>>>> [ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 'ivy.settings.file' instead
>>>> [ivy:cachepath] :: loading settings :: file = /home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/ivy/ivysettings.xml
>>>>
>>>> ivy-resolve-hdfs:
>>>>
>>>> ivy-retrieve-hdfs:
>>>>
>>>> ivy-resolve-test:
>>>>
>>>> [ivy:resolve] downloading https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.23.0-SNAPSHOT/hadoop-common-0.23.0-20110815.215733-266-tests.jar ...
>>>> [ivy:resolve] ... (876kB)
>>>> [ivy:resolve] .. (0kB)
>>>> [ivy:resolve] [SUCCESSFUL ] org.apache.hadoop#hadoop-common;0.23.0-SNAPSHOT!hadoop-common.jar(tests) (875ms)
>>>>
>>>> On Mon, Aug 15, 2011 at 3:33 PM, Eli Collins wrote:
>>>>>
>>>>> Hey Giri,
>>>>>
>>>>> This looks like a similar issue to what was hitting the main Jenkins job; the Hdfs job isn't picking up the latest bits from common.
>>>>>
>>>>> Thanks,
>>>>> Eli
>>>>>
>>>>> On Mon, Aug 15, 2011 at 3:27 PM, Giridharan Kesavan wrote:
>>>>>> Todd,
>>>>>>
>>>>>> Could you please take a look at this?
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/HDFS-2261
>>>>>>
>>>>>> -Giri
>>>>>>
>>>>>> On Mon, Aug 15, 2011 at 3:24 PM, Todd Lipcon wrote:
>>>>>>
>>>>>>> Seems like some of it is a build issue where it can't find ant.
>>>>>>>
>>>>>>> The other is the following: https://issues.apache.org/jira/browse/HADOOP-7545
>>>>>>> Please review.
>>>>>>>
>>>>>>> Thanks
>>>>>>> -Todd
>>>>>>>
>>>>>>> On Mon, Aug 15, 2011 at 2:54 PM, Mahadev Konar wrote:
>>>>>>>> Hi folks,
>>>>>>>> Can anyone take a look at the hdfs builds? Seems to be failing:
>>>>>>>>
>>>>>>>> https://builds.apache.org/job/Hadoop-Hdfs-trunk/
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> mahadev
>>>>>>>
>>>>>>> --
>>>>>>> Todd Lipcon
>>>>>>> Software Engineer, Cloudera
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
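Eli's diagnosis above -- the HDFS job not picking up the latest bits from common -- is the classic stale-SNAPSHOT symptom: Ivy keeps a locally cached copy of hadoop-common-0.23.0-SNAPSHOT and can keep resolving against it after a newer snapshot has been published. One blunt workaround is to purge the cached Hadoop artifacts so the next build re-downloads them. The helper below is a hypothetical illustration of that workaround; the default cache location ~/.ivy2/cache/org.apache.hadoop is an assumption about a stock Ivy setup, not something read from the Jenkins job configuration.

// Hypothetical helper: wipe locally cached org.apache.hadoop artifacts so Ivy
// re-resolves fresh SNAPSHOT jars on the next build. Assumes the default Ivy
// cache location (~/.ivy2/cache); adjust if ivysettings.xml points elsewhere.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

public class PurgeHadoopIvyCache {
    public static void main(String[] args) throws IOException {
        Path cache = Paths.get(System.getProperty("user.home"), ".ivy2", "cache", "org.apache.hadoop");
        if (!Files.exists(cache)) {
            System.out.println("Nothing to purge: " + cache);
            return;
        }
        // Delete children before parents so directories are empty when removed.
        try (Stream<Path> walk = Files.walk(cache)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    System.err.println("Could not delete " + p + ": " + e.getMessage());
                }
            });
        }
        System.out.println("Purged " + cache + "; the next build will re-resolve SNAPSHOT artifacts.");
    }
}

From a shell, rm -rf ~/.ivy2/cache/org.apache.hadoop achieves the same thing; the point is only that stale cached snapshots, not the HDFS code itself, are the likely culprit when a build suddenly breaks right after a fresh common publish.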
Re: Failing trunk builds for HDFS.
Thanks Alejandro. I'll leave the HDFS nightly alone now :).

BTW, the last error, https://builds.apache.org/job/Hadoop-Hdfs-trunk/752/console, which Todd also mentioned, went away when I re-triggered the build. So there is definitely an issue with pulling the common artifacts for HDFS.

thanks
mahadev

On Aug 15, 2011, at 8:03 PM, Alejandro Abdelnur wrote:

> Mahadev,
>
> AOP stuff is not wired yet in common Mavenization; the instrumented JAR is not being created/deployed.
>
> From some of the messages, it seems related to that.
>
> HDFS Mavenization (HDFS-2096), which has been +1'd, does not attempt to run the AOP stuff either, thus it would 'fix' this build failure for now. Later, when AOP is wired into Mavenization, the fault-injection tests will be back in the build.
>
> At the moment, per Arun's request, we are holding HDFS-2096 until MR-279 goes in.
>
> Thanks.
>
> Alejandro
>
> On Mon, Aug 15, 2011 at 7:54 PM, Mahadev Konar wrote:
>
>> I just tried the hdfs build on Apache Jenkins. It's still failing:
>>
>> https://builds.apache.org/job/Hadoop-Hdfs-trunk/752/console
>>
>> Looks like something you are noticing, Todd.
>>
>> Also, is https://issues.apache.org/jira/browse/HDFS-2261 an issue for the builds?
>>
>> thanks
>> mahadev
>>
>> On Aug 15, 2011, at 4:58 PM, Todd Lipcon wrote:
>>
>>> I'm having some related issues locally... it seems every time Hudson publishes a new build, my local build breaks with something like this:
>>>
>>> todd@todd-w510:~/git/hadoop-common/hdfs$ ant clean test
>>> Buildfile: /home/todd/git/hadoop-common/hdfs/build.xml
>>>
>>> clean-contrib:
>>>
>>> clean:
>>>
>>> clean:
>>> [echo] contrib: fuse-dfs
>>>
>>> clean-fi:
>>>
>>> clean-sign:
>>>
>>> clean:
>>> [delete] Deleting directory /home/todd/git/hadoop-common/hdfs/build
>>>
>>> ivy-download:
>>> [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0-rc1/ivy-2.2.0-rc1.jar
>>> [get] To: /home/todd/git/hadoop-common/hdfs/ivy/ivy-2.2.0-rc1.jar
>>> [get] Not modified - so not downloaded
>>>
>>> ivy-init-dirs:
>>> [mkdir] Created dir: /home/todd/git/hadoop-common/hdfs/build/ivy
>>> [mkdir] Created dir: /home/todd/git/hadoop-common/hdfs/build/ivy/lib
>>> [mkdir] Created dir: /home/todd/git/hadoop-common/hdfs/build/ivy/report
>>> [mkdir] Created dir: /home/todd/git/hadoop-common/hdfs/build/ivy/maven
>>>
>>> ivy-probe-antlib:
>>>
>>> ivy-init-antlib:
>>>
>>> ivy-init:
>>> [ivy:configure] :: Ivy 2.2.0-rc1 - 20100629224905 :: http://ant.apache.org/ivy/ ::
>>> [ivy:configure] :: loading settings :: file = /home/todd/git/hadoop-common/hdfs/ivy/ivysettings.xml
>>>
>>> ivy-resolve-common:
>>> [ivy:resolve] downloading https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-annotations/0.23.0-SNAPSHOT/hadoop-annotations-0.23.0-20110808.090045-15.jar ...
>>> [ivy:resolve] ... (14kB)
>>> [ivy:resolve] .. (0kB)
>>> [ivy:resolve] [SUCCESSFUL ] org.apache.hadoop#hadoop-annotations;0.23.0-SNAPSHOT!hadoop-annotations.jar (458ms)
>>> [ivy:resolve] downloading https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.23.0-SNAPSHOT/hadoop-common-0.23.0-20110815.225725-267.jar ...
>>> [ivy:resolve] ... (1667kB)
>>> [ivy:resolve] .. (0kB)
>>> [ivy:resolve] [SUCCESSFUL ] org.apache.hadoop#hadoop-common;0.23.0-SNAPSHOT!hadoop-common.jar (2796ms)
>>>
>>> ivy-retrieve-common:
>>> [ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 'ivy.settings.file' instead
>>> [ivy:cachepath] :: loading settings :: file = /home/todd/git/hadoop-common/hdfs/ivy/ivysettings.xml
>>>
>>> ivy-resolve-hdfs:
>>> [ivy:resolve] downloading https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-an