Re: Adding Elasticity to Hadoop MapReduce

2011-09-19 Thread Arun C Murthy
On Sep 15, 2011, at 2:26 AM, Steve Loughran wrote: > These are all good ideas. The other trick -which has been discussed recently in the context of the Platform Scheduler- is to run HDFS across all nodes, but switch the workload of the cluster between Hadoop jobs (MR, Graph, Hamster), and

Re: Adding Elasticity to Hadoop MapReduce

2011-09-19 Thread Milind.Bhandarkar
> These are all good ideas. The other trick -which has been discussed recently in the context of the Platform Scheduler- is to run HDFS across all nodes, but switch the workload of the cluster between Hadoop jobs (MR, Graph, Hamster), and other work (Grid jobs). That way the filesystem is

Re: Adding Elasticity to Hadoop MapReduce

2011-09-15 Thread Steve Loughran
On 15/09/11 10:14, Junping Du wrote: Hello Arun and all, I think current Hadoop scales out well but is not so good at scaling in. Because it was designed for dedicated clusters and machines, "scale in" capability has received little attention for a long time. However, I notic

Re: Adding Elasticity to Hadoop MapReduce

2011-09-15 Thread Steve Loughran
On 14/09/11 22:20, Ted Dunning wrote: This makes a bit of sense, but you have to worry about the inertia of the data. Adding compute resources is easy. Adding data resources, not so much. I've done it. Like Ted says, pure compute nodes generate more network traffic on both reads and writes,

Re: Adding Elasticity to Hadoop MapReduce

2011-09-15 Thread Steve Loughran
On 15/09/11 02:01, Bharath Ravi wrote: Thanks a lot, all! An end goal of mine was to make Hadoop as flexible as possible. Along the same lines, but unrelated to the above idea, was another I encountered, courtesy http://hadoopblog.blogspot.com/2010/11/hadoop-research-topics.html The blog mentio

Re: Adding Elasticity to Hadoop MapReduce

2011-09-15 Thread Junping Du
will be killed too but in a well planned way. My 2 cents. Thanks, Junping From: Arun C Murthy To: common-dev@hadoop.apache.org Sent: Thursday, September 15, 2011 5:24 AM Subject: Re: Adding Elasticity to Hadoop MapReduce On Sep 14, 2011, at 1:27 PM, Bharath Ravi wrote: > Hi all,

Re: Adding Elasticity to Hadoop MapReduce

2011-09-14 Thread Bharath Ravi
Thanks a lot, all! An end goal of mine was to make Hadoop as flexible as possible. Along the same lines, but unrelated to the above idea, was another I encountered, courtesy http://hadoopblog.blogspot.com/2010/11/hadoop-research-topics.html The blog mentions the ability to dynamically append Inpu

Re: Adding Elasticity to Hadoop MapReduce

2011-09-14 Thread Arun C Murthy
On Sep 14, 2011, at 1:27 PM, Bharath Ravi wrote: > Hi all, I'm a newcomer to Hadoop development, and I'm planning to work on an idea that I wanted to run by the dev community. My apologies if this is not the right place to post this. Amazon has an "Elastic MapReduce" Service (

Re: Adding Elasticity to Hadoop MapReduce

2011-09-14 Thread Ted Dunning
This makes a bit of sense, but you have to worry about the inertia of the data. Adding compute resources is easy. Adding data resources, not so much. And if the computation is not near the data, then it is likely to be much less effective. On Wed, Sep 14, 2011 at 4:27 PM, Bharath Ravi wrote: >

Re: Adding Elasticity to Hadoop MapReduce

2011-09-14 Thread Amandeep Khurana
Hi Bharath, Amazon EMR has two kinds of nodes - Task and Core. Core nodes run HDFS and MapReduce but task nodes run only MapReduce. You can only add core nodes but you can add and remove task nodes in a running cluster. In other words, you can't reduce the size of HDFS. You can only increase it.
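
The resize rules described in this message can be sketched as a small toy model. This is only an illustration of the constraint as stated above (core nodes can grow but not shrink, task nodes can go either way); the `Cluster` class and its method names are hypothetical and are not part of the EMR API or of Hadoop.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Toy model of the resize rules from the message (hypothetical, not the EMR API)."""

    core_nodes: int  # run HDFS and MapReduce
    task_nodes: int  # run MapReduce only

    def resize_core(self, new_count: int) -> None:
        # Core nodes hold HDFS data, so per the message the count may only grow.
        if new_count < self.core_nodes:
            raise ValueError("core nodes cannot be removed from a running cluster")
        self.core_nodes = new_count

    def resize_task(self, new_count: int) -> None:
        # Task nodes hold no HDFS data, so they can be added or removed.
        if new_count < 0:
            raise ValueError("task node count cannot be negative")
        self.task_nodes = new_count
```

In this model, shrinking compute capacity is only possible through `resize_task`, which matches the point that the size of HDFS can be increased but not reduced.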

Adding Elasticity to Hadoop MapReduce

2011-09-14 Thread Bharath Ravi
Hi all, I'm a newcomer to Hadoop development, and I'm planning to work on an idea that I wanted to run by the dev community. My apologies if this is not the right place to post this. Amazon has an "Elastic MapReduce" Service (http://aws.amazon.com/elasticmapreduce/) that runs on Hadoop. The ser