Re: [Proposal] Pluggable Namespace

2013-10-06 Thread Mahadev Konar
Milind,
 Am I missing something here? This was supposed to be a discussion, and I am 
hoping that's why you started the thread. I don't see any conspiracy theory 
being considered or talked about anywhere. Vinod asked some questions; if 
you can't or do not want to respond, I suggest you skip the email or ignore it 
rather than making false assumptions and accusations. I hope the intent here is 
to contribute code, and that it stays that way.

thanks
mahadev

On Oct 6, 2013, at 5:58 PM, Milind Bhandarkar  wrote:

> Vinod,
> 
> I have received a few emails about concerns that this effort somehow
> conflicts with federated namenodes. Most of these emails are from folks
> who are directly or remotely associated with Hortonworks.
> 
> Three weeks ago, I sent emails about this effort to a few Hadoop
> committers who are primarily focused on HDFS and whose email addresses I had.
> While two of those three responded to me, the third person, who is associated
> with Hortonworks, did not.
> 
> Is Hortonworks concerned that this proposal conflicts with their
> development on federated namenodes? I have explicitly stated that it does
> not, and that it is orthogonal to federation. But I would like to know if there
> are some false assumptions being made about the intent of this
> development, and would like to quash any conspiracy theories right now,
> before they assume a life of their own.
> 
> Thanks,
> 
> Milind
> 
> 
> -Original Message-
> From: Vinod Kumar Vavilapalli [mailto:vino...@hortonworks.com]
> Sent: Sunday, October 06, 2013 12:21 PM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: [Proposal] Pluggable Namespace
> 
> In order to make federation happen, block pool management was already
> separated out. Isn't that the same as this effort?
> 
> Thanks,
> +Vinod
> 
> On Oct 6, 2013, at 9:35 AM, Milind Bhandarkar wrote:
> 
>> Federation is orthogonal to Pluggable Namespaces. That is, one can
>> use Federation if needed, even while a distributed K-V store is used
>> on the backend.
>> 
>> The limitations of federated namenodes for scaling the namespace are
>> well documented in several places, including the Giraffa presentation.
>> 
>> HBase is only one of several possible namespace implementations.
>> Thus, if an HBase-based namespace implementation does not fit your
>> performance needs, you can choose something else.
>> 
>> - milind
>> 
>> -Original Message-
>> From: Azuryy Yu [mailto:azury...@gmail.com]
>> Sent: Saturday, October 05, 2013 6:41 PM
>> To: hdfs-dev@hadoop.apache.org
>> Subject: Re: [Proposal] Pluggable Namespace
>> 
>> Hi Milind,
>> 
>> HDFS federation can solve the NN bottleneck and memory limit problems.
>> 
>> The AbstractNameSystem design sounds good, but distributed metadata storage
>> using HBase is likely to bring performance degradation.
>> On Oct 4, 2013 3:18 AM, "Milind Bhandarkar"
>> 
>> wrote:
>> 
>>> Hi All,
>>> 
>>> Exec Summary: For the last couple of months, we at Pivotal, along
>>> with a couple of folks in the community, have been working on making
>>> the namespace implementation in the namenode pluggable. We have
>>> demonstrated that it can be done without major surgery on the
>>> namenode and without a noticeable performance impact. We would
>>> like to contribute it back to Apache if there is sufficient interest.
>>> Please let us know if you are interested, and we will create a Jira
>>> and update the patch for the in-progress work.
>>> 
>>> 
>>> Rationale:
>>> 
>>> In a Hadoop cluster, the namenode roughly has the following main
>>> responsibilities:
>>> . Catering to RPC calls from clients.
>>> . Managing the HDFS namespace tree.
>>> . Managing block reports, heartbeats, and other communication from
>>>   datanodes.
>>> 
>>> For Hadoop clusters with a large number of files and a large number of
>>> nodes, the namenode becomes a bottleneck, mainly for two reasons:
>>> . All the namespace information is kept in the namenode's main memory.
>>> . The namenode has to cater to all the requests from clients and
>>>   datanodes, and also perform some operations for the backup and
>>>   checkpointing nodes.
>>> 
>>> A possible solution is to add more main memory, but there are certain
>>> issues with this approach:
>>> . The namenode being a Java application, garbage collection cycles
>>>   execute periodically to reclaim unreferenced heap space. When the
>>>   heap space grows very large, regardless of the GC policy chosen, the
>>>   application stalls during GC activity. This creates a bunch of
>>>   issues, since DNs and clients may perceive this stall as an NN
>>>   crash.
>>> . There will always be a practical limit on how much physical memory
>>>   a single machine can accommodate.
>>> 
>>> Proposed Solution:
>>> 
>>> Out of the three responsibilities listed above, we can refactor
>>> namespace management out of the namenode codebase in such a way that
>>> there is provision to implement and plug in name systems other
>>> than the existing in-process, memory-based name system. In particular, a
>>> name system backed by a distributed key-value store will
>>> significantly reduce na
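
To make the plug point concrete, here is a minimal sketch of what such a seam
could look like. "AbstractNameSystem" is the name used earlier in this thread;
the methods and the FileMetadata type below are hypothetical illustrations of
abstracting namespace management behind an interface, not the actual Pivotal
patch.

    import java.util.List;

    // Sketch only: method names, signatures, and the FileMetadata type are
    // hypothetical illustrations of a pluggable namespace seam.
    public abstract class AbstractNameSystem {

      // Minimal stand-in for the per-file metadata a name system must track.
      public static final class FileMetadata {
        public final String path;
        public final short replication;
        public final long length;

        public FileMetadata(String path, short replication, long length) {
          this.path = path;
          this.replication = replication;
          this.length = length;
        }
      }

      // Resolve a path to its metadata, or return null if it does not exist.
      public abstract FileMetadata getFileInfo(String path);

      // Record a new file entry and return its metadata.
      public abstract FileMetadata createFile(String path, short replication);

      // Remove a path; returns true if an entry was deleted.
      public abstract boolean delete(String path, boolean recursive);

      // List the immediate children of a directory path.
      public abstract List<FileMetadata> listStatus(String path);
    }

The existing in-process, memory-based namespace and a namespace backed by a
distributed key-value store would then simply be two implementations of the
same interface, which is also why the proposal is orthogonal to federation.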

Re: New subproject logos

2009-06-26 Thread Mahadev Konar
+1 

mahadev


On 6/26/09 12:13 AM, "Chris Douglas"  wrote:

> +1
> 
> On Thu, Jun 25, 2009 at 11:42 PM, Nigel Daley wrote:
>> Here are some logos for the new subprojects
>> http://www.flickr.com/photos/88199...@n00/3661433605/
>> 
>> Please vote +1 if you like 'em and -1 if you don't.
>> 
>> Cheers,
>> Nige
>> 



Re: [VOTE -- Round 2] Commit hdfs-630 to 0.21?

2010-01-21 Thread Mahadev Konar
+1

mahadev


On 1/21/10 2:46 PM, "Ryan Rawson"  wrote:

> Scaling _down_ is a continual problem for us, and this is one of the
> prime factors. It leaves a bad taste in the mouths of new people, who then
> run away from HBase and HDFS because they seem "unreliable and unstable". It
> is perfectly within scope to support a cluster of about 5-6 machines,
> which can have an aggregate capacity of 24TB (which is a fair amount);
> people expect to start small, prove the concept/technology, and then
> move up.
> 
> I am also +1
> 
> On Thu, Jan 21, 2010 at 2:36 PM, Stack  wrote:
>> I'd like to propose a new vote on having hdfs-630 committed to 0.21.
>> The first vote on this topic, initiated 12/14/2009, was sunk by Tsz Wo
>> (Nicholas) Sze, who suggested improvements. Those suggestions have since
>> been folded into a new version of the hdfs-630 patch.  It's this new
>> version of the patch -- 0001-Fix-HDFS-630-0.21-svn-2.patch -- that I'd
>> like us to vote on. For background on why we -- the hbase community
>> -- think hdfs-630 is important, see the notes below from the original
>> call-to-vote.
>> 
>> I'm obviously +1.
>> 
>> Thanks for your consideration,
>> St.Ack
>> 
>> P.S. Regarding TRUNK: after chatting with Nicholas, TRUNK was cleaned of
>> the previous versions of hdfs-630, and we'll likely apply
>> 0001-Fix-HDFS-630-trunk-svn-4.patch, a version of
>> 0001-Fix-HDFS-630-0.21-svn-2.patch that works for TRUNK and includes
>> Nicholas's suggestions.
>> 
>> 
>> On Mon, Dec 14, 2009 at 9:56 PM, stack  wrote:
>>> I'd like to propose a vote on having hdfs-630 committed to 0.21 (it's already
>>> been committed to TRUNK).
>>> 
>>> hdfs-630 has the dfsclient pass the namenode the names of datanodes
>>> it has determined to be dead, because it got a failed connection when it
>>> tried to contact them, etc.  This is useful in the interval between a
>>> datanode dying and the namenode timing out its lease.  Without this fix,
>>> the namenode can often give out the dead datanode as a host for a block.
>>> If the cluster is small, less than 5 or 6 nodes, then it's very likely the
>>> namenode will give out the dead datanode as a block host.
>>> 
>>> Small clusters are common in hbase, especially when folks are starting out
>>> or evaluating hbase.  They'll start with three or four nodes, each carrying
>>> both a datanode and an hbase regionserver.  They'll experiment with killing
>>> one of the slaves -- datanode and regionserver -- and watch what happens.
>>> What follows is a struggling dfsclient trying to create replicas when one of
>>> the datanodes passed to us by the namenode is dead.   The DFSClient will fail
>>> and then go back to the namenode again, etc. (See
>>> https://issues.apache.org/jira/browse/HBASE-1876 for a more detailed
>>> blow-by-blow.)  HBase operation will be held up during this time, and
>>> eventually a regionserver will shut itself down to protect against
>>> data loss if we can't successfully write to HDFS.
>>> 
>>> Thanks all,
>>> St.Ack
>> 



Failing trunk builds for HDFS.

2011-08-15 Thread Mahadev Konar
Hi folks, 
  Can anyone take a look at the hdfs builds? They seem to be failing:

https://builds.apache.org/job/Hadoop-Hdfs-trunk/

thanks
mahadev


Re: Failing trunk builds for HDFS.

2011-08-15 Thread Mahadev Konar
>>>> (1667kB)
>>>> [ivy:resolve] .. (0kB)
>>>> [ivy:resolve]   [SUCCESSFUL ]
>>>> org.apache.hadoop#hadoop-common;0.23.0-SNAPSHOT!hadoop-common.jar
>>>> (1549ms)
>>>> 
>>>> ivy-retrieve-common:
>>>> [ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use
>>>> 'ivy.settings.file' instead
>>>> [ivy:cachepath] :: loading settings :: file =
>>>> /home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/ivy/ivysettings.xml
>>>> 
>>>> ivy-resolve-hdfs:
>>>> 
>>>> ivy-retrieve-hdfs:
>>>> 
>>>> ivy-resolve-test:
>>>> 
>>>> [ivy:resolve] downloading
>>>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.23.0-SNAPSHOT/hadoop-common-0.23.0-20110815.215733-266-tests.jar
>>>> ...
>>>> [ivy:resolve] 
>>>> 
>>>> (876kB)
>>>> 
>>>> [ivy:resolve] .. (0kB)
>>>> [ivy:resolve]   [SUCCESSFUL ]
>>>> org.apache.hadoop#hadoop-common;0.23.0-SNAPSHOT!hadoop-common.jar(tests)
>>>> (875ms)
>>>> 
>>>> 
>>>> On Mon, Aug 15, 2011 at 3:33 PM, Eli Collins  wrote:
>>>>> 
>>>>> Hey Giri,
>>>>> 
>>>>> This looks like a similar issue to what was hitting the main Jenkins
>>>>> job: the Hdfs job isn't picking up the latest bits from common.
>>>>> 
>>>>> Thanks,
>>>>> Eli
>>>>> 
>>>>> On Mon, Aug 15, 2011 at 3:27 PM, Giridharan Kesavan
>>>>>  wrote:
>>>>>> Todd,
>>>>>> 
>>>>>> Could you please take a look at this ?
>>>>>> 
>>>>>> https://issues.apache.org/jira/browse/HDFS-2261
>>>>>> 
>>>>>> 
>>>>>> -Giri
>>>>>> On Mon, Aug 15, 2011 at 3:24 PM, Todd Lipcon  wrote:
>>>>>> 
>>>>>>> Seems like some of it is a build issue where it can't find ant.
>>>>>>> 
>>>>>>> The other is the following:
>>>>>>> https://issues.apache.org/jira/browse/HADOOP-7545
>>>>>>> Please review.
>>>>>>> 
>>>>>>> Thanks
>>>>>>> -Todd
>>>>>>> 
>>>>>>> On Mon, Aug 15, 2011 at 2:54 PM, Mahadev Konar 
>>>>>>> wrote:
>>>>>>>> Hi folks,
>>>>>>>>  Can anyone take a look at the hdfs builds? Seems to be failing:
>>>>>>>> 
>>>>>>>> https://builds.apache.org/job/Hadoop-Hdfs-trunk/
>>>>>>>> 
>>>>>>>> thanks
>>>>>>>> mahadev
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Todd Lipcon
>>>>>>> Software Engineer, Cloudera
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera



Re: Failing trunk builds for HDFS.

2011-08-15 Thread Mahadev Konar
Thanks, Alejandro. 

I'll leave the HDFS nightly alone now :).

BTW, the last error, https://builds.apache.org/job/Hadoop-Hdfs-trunk/752/console, 
which Todd also mentioned, went away when I re-triggered the build. So there is 
definitely an issue with pulling the common artifacts for HDFS.

thanks
mahadev

On Aug 15, 2011, at 8:03 PM, Alejandro Abdelnur wrote:

> Mahadev,
> 
> The AOP stuff is not wired up yet in the common Mavenization; the instrumented JAR is
> not being created/deployed.
> 
> From some of the messages, it seems related to that.
> 
> HDFS Mavenization (HDFS-2096), which has been +1'd, does not attempt to run
> the AOP stuff either, so it would 'fix' this build failure for now. Later,
> when AOP is wired into Mavenization, the fault-injection tests would be back in
> the build.
> 
> At the moment, per Arun's request, we are holding off on HDFS-2096 until MR-279
> goes in.
> 
> Thanks.
> 
> Alejandro
> 
> 
> On Mon, Aug 15, 2011 at 7:54 PM, Mahadev Konar wrote:
> 
>> I just tried the hdfs build on Apache Jenkins. It's still failing:
>> 
>> https://builds.apache.org/job/Hadoop-Hdfs-trunk/752/console
>> 
>> Looks like the same thing you are noticing, Todd.
>> 
>> Is https://issues.apache.org/jira/browse/HDFS-2261 also an issue for
>> the builds?
>> 
>> thanks
>> mahadev
>> On Aug 15, 2011, at 4:58 PM, Todd Lipcon wrote:
>> 
>>> I'm having some related issues locally... it seems every time Hudson
>>> publishes a new build, my local build breaks with something like this:
>>> 
>>> 
>>> todd@todd-w510:~/git/hadoop-common/hdfs$ ant clean test
>>> Buildfile: /home/todd/git/hadoop-common/hdfs/build.xml
>>> 
>>> clean-contrib:
>>> 
>>> clean:
>>> 
>>> clean:
>>>[echo] contrib: fuse-dfs
>>> 
>>> clean-fi:
>>> 
>>> clean-sign:
>>> 
>>> clean:
>>>  [delete] Deleting directory /home/todd/git/hadoop-common/hdfs/build
>>> 
>>> ivy-download:
>>> [get] Getting:
>>> 
>> http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.2.0-rc1/ivy-2.2.0-rc1.jar
>>> [get] To: /home/todd/git/hadoop-common/hdfs/ivy/ivy-2.2.0-rc1.jar
>>> [get] Not modified - so not downloaded
>>> 
>>> ivy-init-dirs:
>>>   [mkdir] Created dir: /home/todd/git/hadoop-common/hdfs/build/ivy
>>>   [mkdir] Created dir: /home/todd/git/hadoop-common/hdfs/build/ivy/lib
>>>   [mkdir] Created dir:
>> /home/todd/git/hadoop-common/hdfs/build/ivy/report
>>>   [mkdir] Created dir: /home/todd/git/hadoop-common/hdfs/build/ivy/maven
>>> 
>>> ivy-probe-antlib:
>>> 
>>> ivy-init-antlib:
>>> 
>>> ivy-init:
>>> [ivy:configure] :: Ivy 2.2.0-rc1 - 20100629224905 ::
>>> http://ant.apache.org/ivy/ ::
>>> [ivy:configure] :: loading settings :: file =
>>> /home/todd/git/hadoop-common/hdfs/ivy/ivysettings.xml
>>> 
>>> ivy-resolve-common:
>>> [ivy:resolve] downloading
>>> 
>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-annotations/0.23.0-SNAPSHOT/hadoop-annotations-0.23.0-20110808.090045-15.jar
>>> ...
>>> [ivy:resolve] ... (14kB)
>>> [ivy:resolve] .. (0kB)
>>> [ivy:resolve]   [SUCCESSFUL ]
>>> 
>> org.apache.hadoop#hadoop-annotations;0.23.0-SNAPSHOT!hadoop-annotations.jar
>>> (458ms)
>>> [ivy:resolve] downloading
>>> 
>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.23.0-SNAPSHOT/hadoop-common-0.23.0-20110815.225725-267.jar
>>> ...
>>> [ivy:resolve]
>> .
>>> [ivy:resolve]
>> 
>>> (1667kB)
>>> [ivy:resolve] .. (0kB)
>>> [ivy:resolve]   [SUCCESSFUL ]
>>> org.apache.hadoop#hadoop-common;0.23.0-SNAPSHOT!hadoop-common.jar
>>> (2796ms)
>>> 
>>> ivy-retrieve-common:
>>> [ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use
>>> 'ivy.settings.file' instead
>>> [ivy:cachepath] :: loading settings :: file =
>>> /home/todd/git/hadoop-common/hdfs/ivy/ivysettings.xml
>>> 
>>> ivy-resolve-hdfs:
>>> [ivy:resolve] downloading
>>> 
>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-an