Thanks, Brian! Works great.

On Thu, Mar 19, 2009 at 3:39 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:

> Hey Tamir,
>
> Instead of
>
> mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 (for Ganglia3.1.x)
>
> use:
>
> mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
>
> Java is trying to interpret the parenthetical aside as part of the class
> name.
>
> Brian
>
> PS: In distributed systems (or complex systems in general), I'm always
> amazed at all the different ways things can go wrong.
>
>
> On Mar 19, 2009, at 8:35 AM, Tamir Kamara wrote:
>
>> Hi Brian,
>>
>> Do you mean the hadoop-metrics file? It looks like this:
>> # Configuration of the "mapred" context for ganglia
>> # mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext (defalut for Ganglia3.0.x)
>> mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 (for Ganglia3.1.x)
>> mapred.period=10
>> mapred.servers=localhost:8649
>>
>> I've only uncommented the last 3 lines. I think that there's a class called
>> GangliaContext31 in
>> /usr/local/hadoop-0.18.4/src/core/org/apache/hadoop/metrics/ganglia/GangliaContext31.java.
>>
>> thanks,
>> Tamir
>>
>> On Thu, Mar 19, 2009 at 3:25 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>>
>>> Hey Tamir,
>>>
>>> This is a very strange stack trace:
>>>
>>> java.lang.ClassNotFoundException: org.apache.hadoop.metrics.ganglia.GangliaContext31 (for Ganglia3.1.x)
>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>>> (blah blah blah)
>>>
>>> It looks like it thinks the classname is "GangliaContext31 (for
>>> Ganglia3.1.x)". Is it possible you accidentally left a comment in your
>>> config?
>>>
>>> Brian
>>>
>>>
>>> On Mar 19, 2009, at 8:09 AM, Tamir Kamara wrote:
>>>
>>>> Hi,
>>>>
>>>> I attached a zip with the lsof output, jobtracker log and tasktracker log
>>>> (I only enabled mapred metrics). You can also see it here:
>>>> http://www.sendspace.com/file/86v5jc
>>>>
>>>> Thanks,
>>>> Tamir
>>>>
>>>> On Thu, Mar 19, 2009 at 2:51 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>>>>
>>>> Hey Tamir,
>>>>
>>>> It appears the webserver stripped off your attachment.
>>>>
>>>> Do you have more of a stack trace available?
>>>>
>>>> Brian
>>>>
>>>>
>>>> On Mar 19, 2009, at 7:25 AM, Tamir Kamara wrote:
>>>>
>>>> Hi,
>>>>
>>>> The full output of "lsof | grep java" is attached. I see a line with the jar
>>>> /usr/local/hadoop-0.18.4/hadoop-0.18.4-dev-core.jar, which is the new one
>>>> that the "ant clean jar" command created.
>>>>
>>>>
>>>> On Thu, Mar 19, 2009 at 2:00 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>>>>
>>>> On Mar 19, 2009, at 6:56 AM, Tamir Kamara wrote:
>>>>
>>>> Hi Brian,
>>>>
>>>> I see GangliaContext31.class in the jar and GangliaContext31.java in the
>>>> src folder.
>>>>
>>>> By the way, I only used the last version of each patch. Should I apply the
>>>> different files per patch from the earliest to the latest?
>>>>
>>>> Nope.
>>>>
>>>> Can you perform "lsof" on the running process and see if it's perhaps
>>>> using the wrong JAR?
>>>>
>>>> Brian
>>>>
>>>>
>>>> Thanks,
>>>> Tamir
>>>>
>>>> On Thu, Mar 19, 2009 at 1:38 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>>>>
>>>> Hey Tamir,
>>>>
>>>> Can you see the file GangliaContext31.java in your jar? In the source
>>>> directory?
>>>>
>>>> Brian
>>>>
>>>>
>>>> On Mar 19, 2009, at 2:33 AM, Tamir Kamara wrote:
>>>>
>>>> Hi,
>>>>
>>>> All my tests were fine with Ganglia 3.0. I used the HADOOP-3422 patch to fix
>>>> the metric names provided by hadoop and it worked.
>>>> Because I had to recompile hadoop (base 0.18.3), I also used HADOOP-4675
>>>> in order to use the latest Ganglia (3.1). After changing the metrics file
>>>> to report with the GangliaContext31 class, I started getting a
>>>> ClassNotFoundException. The command I used to recompile hadoop was
>>>> "ant clean jar", and then I moved and renamed the resulting jar to replace
>>>> the original core jar.
>>>>
>>>> Do you know what is wrong?
>>>>
>>>> Thanks,
>>>> Tamir
>>>>
>>>>
>>>> On Tue, Mar 17, 2009 at 5:25 PM, jason hadoop <jason.had...@gmail.com> wrote:
>>>>
>>>> Make all of your hadoop-metrics properties use the standard IP address of
>>>> your master node.
>>>> Then add a straight udp receive block to the gmond.conf of your master node.
>>>> Then point your gmetad.conf at your master node.
>>>>
>>>> There are complete details in my forthcoming book, and the chapter with
>>>> this in it should be available in alpha soon.
>>>>
>>>> On Tue, Mar 17, 2009 at 8:23 AM, Tamir Kamara <tamirkam...@gmail.com> wrote:
>>>>
>>>> I sent my gmond.conf in my previous email... and the address is like
>>>> Carlos wrote.
>>>>
>>>> I'll change the hadoop-metrics file and check again.
>>>> However, I would prefer to use a method I'm more familiar with - like
>>>> unicast tcp communication. Do you know what I need to change in ganglia
>>>> and / or hadoop to use it?
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> On Tue, Mar 17, 2009 at 5:16 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>>>>
>>>> On Mar 17, 2009, at 10:08 AM, Carlos Valiente wrote:
>>>>
>>>> On Tue, Mar 17, 2009 at 14:56, Tamir Kamara <tamirkam...@gmail.com> wrote:
>>>>
>>>> I don't know too much about multicast... and I'm using the default gmond
>>>> conf file.
>>>>
>>>>
>>>> The default multicast address seems to be 239.2.11.71, so that's the
>>>> one for your hadoop-metrics.properties.
>>>>
>>>>
>>>> Yup, try that - although I could tell better if I had Tamir's gmond.conf,
>>>> of course.
>>>>
>>>>
>>>> Wouldn't using the multicast address mean I'll need to specify a different
>>>> address for each node so that the data won't get to all nodes running
>>>> gmond?
>>>>
>>>>
>>>> The design of Ganglia is such that all the data goes to all the nodes
>>>> running gmond. If you don't like it, Ganglia 3.1 supports non-multicast
>>>> TCP channels.
>>>>
>>>> For reference, our 200 node cluster has about 250KB/s of background chatter
>>>> on idle nodes, which is probably Ganglia-related. It's an incredibly small
>>>> perturbation on network traffic.
>>>>
>>>> Brian
>>>>
>>>>
>>>> I'm not an expert, either --- I'm using the same multicast address on
>>>> all nodes in my cluster. On each node, tcpdump shows incoming Ganglia
>>>> traffic from every other node to the multicast address. It's usually a
>>>> burst of about 200 UDP packets every 4 seconds or so (for a 6-node
>>>> cluster), so the traffic overhead should be negligible.
>>>>
>>>> C
>>>>
>>>>
>>>> --
>>>> Alpha Chapters of my book on Hadoop are available
>>>> http://www.apress.com/book/view/9781430219422
>>>>
>>>>
>>>> <gout.zip>
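
For anyone who lands on this thread with the same ClassNotFoundException: the cause found above was a parenthetical note left on the same line as mapred.class. A Java properties file only treats a line as a comment when it starts with # or !, so everything after the = becomes the value. Below is a minimal Python sketch of a check for that mistake. It assumes a deliberately simplified parser (only key=value lines, no line continuations, no ':' or whitespace separators), and find_suspect_class_values is a hypothetical helper name, not something shipped with Hadoop or Ganglia.

```python
import re

# A dotted Java class name: identifiers separated by '.', nothing else.
JAVA_CLASS_NAME = re.compile(
    r"[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*"
)


def find_suspect_class_values(properties_text):
    """Return (line_number, key, value) for each '*.class' property whose
    value is not a plain dotted Java class name.

    Simplified on purpose (an assumption of this sketch): only 'key=value'
    lines are handled; continuations and other separators are ignored.
    """
    suspects = []
    lines = properties_text.splitlines()
    for line_number, raw_line in enumerate(lines, start=1):
        line = raw_line.strip()
        # Only whole-line comments exist; they start with '#' or '!'.
        if not line or line[0] in "#!" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        if key.endswith(".class") and not JAVA_CLASS_NAME.fullmatch(value):
            suspects.append((line_number, key, value))
    return suspects


if __name__ == "__main__":
    # The three uncommented lines quoted earlier in the thread.
    broken = (
        "mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31"
        " (for Ganglia3.1.x)\n"
        "mapred.period=10\n"
        "mapred.servers=localhost:8649\n"
    )
    for line_number, key, value in find_suspect_class_values(broken):
        print(f"line {line_number}: {key} has a suspicious value: {value!r}")
```

Removing the parenthetical from the value, or moving it to its own # line, makes the check return an empty list.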