configuring different number of slaves for MR jobs

2011-09-27 Thread bikash sharma
Hi -- Can we specify a different set of slaves for each mapreduce job run? I tried using the --config option and specifying a different set of slaves in the slaves config file. However, it does not use the selected set of slaves but the one initially configured. Any help? Thanks, Bikash

Getting the cpu, memory usage of map/reduce tasks

2011-09-23 Thread bikash sharma
Hi -- Is it possible to get the cpu and memory usage of individual map/reduce tasks when any mapreduce job is run? I came across this jira issue, but was not sure about the exact ways to access it in the current hadoop distribution: https://issues.apache.org/jira/browse/MAPREDUCE-220 Any help is high

automatic monitoring the utilization of slaves

2011-06-15 Thread bikash sharma
Hi -- Is there a way by which a slave can get a trigger when a Hadoop job finishes on the master? The use case is as follows: I need to monitor the cpu and memory utilization automatically. For that, I need to know the timestamp to start and stop the sar utility corresponding to the start and f

/etc/hosts related error?

2011-06-08 Thread bikash sharma
Hi, I am experiencing a lot of task failures while running any Hadoop application. In particular, I get the following warnings: Error initializing attempt_201106081500_0018_r_00_0: java.io.IOException: Could not obtain block: blk_-7386162385184325734_1214 file=/home/hadoop/data/mapred/system/jo

Re: hadoop cluster installation problems

2011-04-13 Thread bikash sharma
java:279) at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956) at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965) On Wed, Apr 13, 2011 at 9:20 AM, bikash sharma wrote: > Hi, > I need to install hadoop on 16-node cluster. I have a couple of related > que

hadoop cluster installation problems

2011-04-13 Thread bikash sharma
Hi, I need to install hadoop on 16-node cluster. I have a couple of related questions: 1. I have installed hadoop on a shared directory, i.e., there is just one place where the whole hadoop installation files exist and all the 16 nodes use the same installation. Is that an issue or I need to instal

Re: cluster restart error

2011-04-12 Thread bikash sharma
connect to server: inti79.cse.psu.edu/130.203.58.207:54310. Already tried 2 time(s). On Tue, Apr 12, 2011 at 5:34 PM, bikash sharma wrote: > Hi, > I changed some config. parameters in core-site/mapred.xml files and then > stopped dfs, mapred services. > While restarting them again, I am una

cluster restart error

2011-04-12 Thread bikash sharma
Hi, I changed some config. parameters in core-site/mapred.xml files and then stopped dfs, mapred services. While restarting them again, I am unable to do so and looking at the logs, the following error occurs: 2011-04-12 17:27:39,343 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapt

available hadoop logs

2011-04-08 Thread bikash sharma
Hi, For research purposes, I need some real Hadoop MapReduce job traces (ideally both inter- and intra-job, in terms of Hadoop job configuration parameters like mapred.io.sort.factor). Are there any freely available Hadoop traces corresponding to some real large setup? Thanks, Bikash

Re: hadoop parameters config files location

2011-04-04 Thread bikash sharma
Thanks Harsh. On Mon, Apr 4, 2011 at 3:33 PM, Harsh Chouraria wrote: > Hello, > > On Tue, Apr 5, 2011 at 12:52 AM, bikash sharma > wrote: > > Hi, > > I want to change the following hadoop parameters before launching > TeraSort > > program:

hadoop parameters config files location

2011-04-04 Thread bikash sharma
Hi, I want to change the following hadoop parameters before launching TeraSort program: io.sort.factor io.sort.mb mapreduce.reduce.tasks io.file.buffer.size io.sort.record.percent I tried to change these properties by putting them as property values in core-site.xml file under conf directory. Howe

Re: Chukwa setup issues

2011-04-01 Thread bikash sharma
I was trying to install HICC in Chukwa, but hicc.sh does not exist in the repository. Any idea? -bikash On Fri, Apr 1, 2011 at 5:57 PM, bikash sharma wrote: > Thanks Bill. > I am able to connect via web now, actually had put wrong http port in > config file. > One following questio

Re: Chukwa setup issues

2011-04-01 Thread bikash sharma
ere anything in logs/collector.log? > > On Fri, Apr 1, 2011 at 1:09 PM, bikash sharma > wrote: > > Hi, > > I am trying to setup Chukwa for a 16-node Hadoop cluster. > > I followed the admin guide - > > http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Agents

Chukwa setup issues

2011-04-01 Thread bikash sharma
Hi, I am trying to set up Chukwa for a 16-node Hadoop cluster. I followed the admin guide - http://incubator.apache.org/chukwa/docs/r0.4.0/admin.html#Agents However, I ran into the following two issues: 1. What should be the collector port that needs to be specified in conf/collectors file 2. Am unable t

runtime resource change of applications

2011-04-01 Thread bikash sharma
Hi, Can we dynamically vary the resource allocation/consumption (say memory, cores) of Hadoop MR applications like sort? Thanks, Bikash

observe the effect of changes to Hadoop

2011-03-25 Thread bikash sharma
Hi, For my research project, I need to add a couple of functions in the JobTracker.java source file to include additional information about TaskTrackers' resource usage through heartbeat messages. I made those changes to the JobTracker.java file. However, I am not very clear on how to see these effects. I mea

MapReduce compilation error

2011-03-18 Thread bikash sharma
Hi, When I am compiling MapReduce source code after checking-in Eclipse, I am getting the following error: The declared package "" does not match the expected package "testjar" ClassWithNoPackage.java Hadoop-MR/src/test/mapred/testjar Any thoughts? Thanks, Bikash

modification to Hadoop Jobtracker

2011-03-18 Thread bikash sharma
Hi, For my research project, I need to modify Hadoop JobTracker to collect some statistics of TaskTracker nodes. For example, I would like to piggy-back heartbeat messages sent from TaskTrackers to JobTracker with some extra information related to the resource utilization and other statistics. I am

Re: pointers to Hadoop eclipse

2011-03-17 Thread bikash sharma
I resolved these errors. I was missing tools.jar which I added as external .jar file. On Thu, Mar 17, 2011 at 8:37 PM, bikash sharma wrote: > P.S. > On building the project in eclipse, I also get the following two errors: > > Description Resource Path Location Type > The projec

Re: pointers to Hadoop eclipse

2011-03-17 Thread bikash sharma
Unknown Java Problem The type com.sun.javadoc.RootDoc cannot be resolved. It is indirectly referenced from required .class files ExcludePrivateAnnotationsJDiffDoclet.java Hadoop/src/java/org/apache/hadoop/classification/tools line 1 Java Problem On Thu, Mar 17, 2011 at 7:41 PM, bikash sharma wrote

Re: pointers to Hadoop eclipse

2011-03-17 Thread bikash sharma
pe" Can anyone please help. Thanks, Bikash On Thu, Mar 17, 2011 at 1:52 PM, bikash sharma wrote: > Thank you all. > > > On Thu, Mar 17, 2011 at 12:21 PM, bharath vissapragada < > bharathvissapragada1...@gmail.com> wrote: > >> >> http://www.cloudera.com/b

Re: pointers to Hadoop eclipse

2011-03-17 Thread bikash sharma
45 PM, Harsh J wrote: > > http://wiki.apache.org/hadoop/EclipseEnvironment > > > > On Thu, Mar 17, 2011 at 8:17 PM, bikash sharma > wrote: > >> Hi, > >> Can someone please point to any good reference that tells clearly how to > >> checkout Hadoop code base

pointers to Hadoop eclipse

2011-03-17 Thread bikash sharma
Hi, Can someone please point to any good reference that explains clearly how to check out the Hadoop code base in eclipse, make changes, and re-compile? I want to change some part of Hadoop and see the effect of the change, preferably in eclipse. Thanks, Bikash

slot related question

2011-03-05 Thread bikash sharma
Hi, This is a conceptual question: 1. Are the various resources shared across slots in Hadoop, or are resources partitioned across slots? 2. Any thoughts on experiments using a Hadoop setup that can help confirm the above rationale? Thanks, Bikash

Re: conceptual question regarding slots

2011-03-02 Thread bikash sharma
Thanks Greg! On Wed, Mar 2, 2011 at 5:07 PM, Greg Roelofs wrote: > > Could someone throw some light as to how intuitively fixed-type slots in > > Hadoop have a negative impact of cluster utilization as mentioned in > Arun's > > blog? > > It's pretty simple: in the course of time, a real cluster

conceptual question regarding slots

2011-03-02 Thread bikash sharma
Hi, Could someone throw some light on how, intuitively, fixed-type slots in Hadoop have a negative impact on cluster utilization, as mentioned in Arun's blog? http://developer.yahoo.com/blogs/hadoop/posts/2011/02/mapreduce-nextgen/ Thanks, Bikash

disable pipelining in Hadoop

2011-03-01 Thread bikash sharma
Hi, Is there a way to disable the use of pipelining, i.e., so that the reduce phase is started only after the map phase is completed? -bikash
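
A possible direction for the question above, as a sketch and not a reply from this thread: Hadoop releases of this era have a job property, mapred.reduce.slowstart.completed.maps, giving the fraction of map tasks that must complete before reduce tasks are scheduled. Setting it to 1.0 should hold reducers back until every map has finished. The Python helper below only builds the command line; the function name `job_command`, the jar name, and the paths are hypothetical, and passing the property with -D assumes the job parses Hadoop's generic options.

```python
def job_command(jar, program, args, slowstart=1.0):
    """Build a `hadoop jar` command that delays reducers until the given
    fraction of map tasks has completed (1.0 = all maps)."""
    if not 0.0 <= slowstart <= 1.0:
        raise ValueError("slowstart must be a fraction between 0 and 1")
    prop = f"mapred.reduce.slowstart.completed.maps={slowstart}"
    # Generic options such as -D go right after the program name.
    return ["hadoop", "jar", jar, program, "-D", prop] + list(args)


if __name__ == "__main__":
    # Hypothetical jar name and HDFS paths, for illustration only.
    cmd = job_command("hadoop-examples.jar", "sort", ["in_dir", "out_dir"])
    print(" ".join(cmd))
```

The same property could instead be set in the job's configuration file, which may be preferable when every job on the cluster should behave this way.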

Re: measure the resource usage of each map/reduce task

2011-03-01 Thread bikash sharma
Hi, As a follow-up question, do map/reduce tasks run as threads or processes? On Tue, Feb 22, 2011 at 10:35 AM, bikash sharma wrote: > Hi, > Is there any way in which we can measure the resource usage of each > map/reduce task running? > I was trying to use sar utility to track

TaskTracker not starting on all nodes

2011-02-26 Thread bikash sharma
Hi, I have a 10-node Hadoop cluster, where I am running some benchmarks for experiments. Surprisingly, when I initialize the Hadoop cluster (hadoop/bin/start-mapred.sh), in many instances, only some nodes have the TaskTracker process up (seen using jps), while other nodes do not have TaskTrackers. Cou

definition of slots in Hadoop scheduling

2011-02-25 Thread bikash sharma
Hi, How is a task slot in Hadoop defined with respect to scheduling a map/reduce task on such slots available on TaskTrackers? Thanks, Bikash

measure the resource usage of each map/reduce task

2011-02-22 Thread bikash sharma
Hi, Is there any way in which we can measure the resource usage of each running map/reduce task? I was trying to use the sar utility to track each process's resource usage; however, it seems these individual map/reduce tasks are not listed as processes when I do ps -ex. Thanks, Bikash
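
One way to approach the question above, offered as a rough sketch and not as an answer given in this thread: in Hadoop releases of this period each task attempt normally runs in its own child JVM, so it may appear in ps as a java process with the attempt ID on its command line instead of under a task-specific name. The Python below parses the text output of `ps -eo pid,pcpu,rss,args` and keeps the rows whose command line mentions an attempt ID. The function name `task_usage`, the regular expression for the attempt ID, and the sample rows are illustrative assumptions.

```python
import re

# Assumed attempt-ID shape, e.g. attempt_201102221030_0001_m_000003_0
ATTEMPT_RE = re.compile(r"attempt_\d+_\d+_[mr]_\d+_\d+")


def task_usage(ps_output):
    """Map attempt ID -> (pid, cpu_percent, rss_kb) from `ps -eo pid,pcpu,rss,args` text."""
    usage = {}
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # skip the header row and malformed rows
        match = ATTEMPT_RE.search(fields[3])
        if match:
            usage[match.group(0)] = (int(fields[0]), float(fields[1]), int(fields[2]))
    return usage


# Made-up sample rows in the shape ps would print them.
SAMPLE = """\
  PID %CPU   RSS COMMAND
 4121  0.3 81234 java -Dproc_tasktracker org.apache.hadoop.mapred.TaskTracker
 4790 97.5 204800 java org.apache.hadoop.mapred.Child 127.0.0.1 50123 attempt_201102221030_0001_m_000003_0 12
 4802 12.0 150000 java org.apache.hadoop.mapred.Child 127.0.0.1 50123 attempt_201102221030_0001_r_000000_0 13
"""

if __name__ == "__main__":
    for attempt, (pid, cpu, rss) in sorted(task_usage(SAMPLE).items()):
        print(attempt, pid, cpu, rss)
```

On a live node the input string could come from `subprocess.check_output(["ps", "-eo", "pid,pcpu,rss,args"], text=True)`, and sampling it at a fixed interval would give a per-task time series comparable to what sar reports for the whole machine.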