Hey Tamir,
This is a very strange stack trace:
java.lang.ClassNotFoundException: org.apache.hadoop.metrics.ganglia.GangliaContext31 (for Ganglia3.1.x)
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
(blah blah blah)
It looks like it thinks the classname is "GangliaContext31 (for
Ganglia3.1.x)". Is it possible you accidentally left a comment in
your config?
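For what it's worth, Java's properties parser only treats a line as a
comment when it starts with # or !, so anything trailing the class name
on that line becomes part of the value. My guess at what your config
might contain (first line broken, second fixed):

    # broken: the trailing note is read as part of the class name
    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 (for Ganglia3.1.x)
    # fixed
    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31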
Brian
On Mar 19, 2009, at 8:09 AM, Tamir Kamara wrote:
Hi,
I attached a zip with the lsof output, jobtracker log and
tasktracker log (I only enabled mapred metrics). You can also see it
here: http://www.sendspace.com/file/86v5jc
Thanks,
Tamir
On Thu, Mar 19, 2009 at 2:51 PM, Brian Bockelman
<bbock...@cse.unl.edu> wrote:
Hey Tamir,
It appears the webserver stripped off your attachment.
Do you have more of a stack trace available?
Brian
On Mar 19, 2009, at 7:25 AM, Tamir Kamara wrote:
Hi,
The full lsof | grep java is attached. I see a line with the jar
/usr/local/hadoop-0.18.4/hadoop-0.18.4-dev-core.jar, which is the new
one the "ant clean jar" command created.
On Thu, Mar 19, 2009 at 2:00 PM, Brian Bockelman
<bbock...@cse.unl.edu> wrote:
On Mar 19, 2009, at 6:56 AM, Tamir Kamara wrote:
Hi Brian,
I see GangliaContext31.class in the jar and GangliaContext31.java in
the src
folder.
By the way, I only applied the latest version of each patch. Should I
have applied the different files of each patch in order, from earliest
to latest?
Nope.
Can you perform "lsof" on the running process and see if it's
perhaps using the wrong JAR?
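Something along these lines should show which jars the JVM actually has
open (substitute your TaskTracker's pid):

    # find the TaskTracker's process id, then list the jars it holds open
    ps aux | grep TaskTracker
    lsof -p <pid> | grep '\.jar'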
Brian
Thanks,
Tamir
On Thu, Mar 19, 2009 at 1:38 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
Hey Tamir,
Can you see the file GangliaContext31.java in your jar? In the source
directory?
Brian
On Mar 19, 2009, at 2:33 AM, Tamir Kamara wrote:
Hi,
All my tests were fine with Ganglia 3.0: I used the HADOOP-3422 patch
to fix the metric names provided by hadoop, and it worked. Because I
had to recompile hadoop (base 0.18.3) anyway, I also applied
HADOOP-4675 in order to use the latest Ganglia (3.1). After changing
the metrics file to report with the GangliaContext31 class, I started
getting a ClassNotFoundException. The command I used to recompile
hadoop was "ant clean jar", and then I moved and renamed the resulting
jar to replace the original core jar.
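For reference, these are roughly the steps I ran (patch file names and
paths from memory):

    cd hadoop-0.18.3
    patch -p0 < HADOOP-3422.patch    # fix the metric names
    patch -p0 < HADOOP-4675.patch    # add GangliaContext31 for Ganglia 3.1
    ant clean jar
    # replace the core jar the cluster loads with the freshly built one
    cp build/hadoop-0.18.4-dev-core.jar /usr/local/hadoop-0.18.4/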
Do you know what is wrong?
Thanks,
Tamir
On Tue, Mar 17, 2009 at 5:25 PM, jason hadoop <jason.had...@gmail.com> wrote:
Make all of your hadoop-metrics properties use the standard IP
address of
your master node.
Then add a straight udp receive block to the gmond.conf of your master
node.
Then point your gmetad.conf at your master node.
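Roughly like this, with placeholder host names you'd adjust to your own
setup:

    # hadoop-metrics.properties on every node
    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    mapred.period=10
    mapred.servers=master.example.com:8649

    # gmond.conf on the master: a plain udp receive block
    udp_recv_channel {
      port = 8649
    }

    # gmetad.conf: poll the master
    data_source "hadoop" master.example.com:8649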
There are complete details in my forthcoming book, which, with this
material in it, should be available in alpha soon.
On Tue, Mar 17, 2009 at 8:23 AM, Tamir Kamara <tamirkam...@gmail.com>
wrote:
I sent my gmond.conf in my previous email... and the address is as
Carlos wrote.
I'll change the hadoop-metrics file and check again.
However, I would prefer to use a method I'm more familiar with, like
unicast tcp communication. Do you know what I need to change in Ganglia
and/or Hadoop to use it?
Thanks.
On Tue, Mar 17, 2009 at 5:16 PM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
On Mar 17, 2009, at 10:08 AM, Carlos Valiente wrote:
On Tue, Mar 17, 2009 at 14:56, Tamir Kamara <tamirkam...@gmail.com>
wrote:
I don't know too much about multicast... and I'm using the default
gmond
conf file.
The default multicast address seems to be 239.2.11.71, so that's the
one for your hadoop-metrics.properties.
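So something like this in hadoop-metrics.properties should work,
assuming gmond is on its default port 8649:

    dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    dfs.servers=239.2.11.71:8649
    mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    mapred.servers=239.2.11.71:8649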
Yup, try that - although I could tell better if I had Tamir's
gmond.conf,
of course.
Wouldn't using the multicast address mean I'll need to specify a
different address for each node, so that the data won't get to all
nodes running gmond?
The design of Ganglia is such that all the data goes to all the nodes
running gmond. If you don't like it, Ganglia 3.1 supports non-multicast
TCP channels.
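In gmond.conf that looks roughly like this (the collector host is just
a placeholder):

    # each node sends directly to one collector instead of multicasting
    udp_send_channel {
      host = collector.example.com
      port = 8649
    }
    # the collector receives the UDP stream and lets gmetad poll it over TCP
    udp_recv_channel {
      port = 8649
    }
    tcp_accept_channel {
      port = 8649
    }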
For reference, our 200-node cluster has about 250KB/s of background
chatter
on idle nodes, which is probably Ganglia-related. It's an incredibly
small
perturbation on network traffic.
Brian
I'm not an expert, either --- I'm using the same multicast address on
all nodes in my cluster. On each node, tcpdump shows incoming Ganglia
traffic from every other node to the multicast address. It's usually a
burst of about 200 UDP packets every 4 seconds or so (for a 6-node
cluster), so the traffic overhead should be negligible.
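If you want to watch it yourself, something like this works, assuming
eth0 as the interface and gmond's default port 8649:

    tcpdump -n -i eth0 udp port 8649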
C
--
Alpha Chapters of my book on Hadoop are available
http://www.apress.com/book/view/9781430219422