how to debug slowdowns from these log snippets(we know the keys being accessed as well)

Hiller, Dean Wed, 20 Feb 2013 06:10:56 -0800

Cassandra version 1.1.4

I captured all the logs of node causing timeouts (in a 6 node cluster).  We 
seem to get these slowdowns every once in a while and it causes our whole 
website to be 10 times slower.  Since PlayOrm actually logs the rows being 
accessed we know exactly which row the timeout occurred on and asytanax logged 
the node which was nice.


Each time we get this website being 10 times slower, this is in the logs

 WARN [ScheduledTasks:1] 2013-02-20 06:52:15,929 GCInspector.java (line 145) 
Heap is 0.8532751700072093 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically
 WARN [ScheduledTasks:1] 2013-02-20 06:52:15,929 StorageService.java (line 
2855) Flushing CFS(Keyspace='databus5', ColumnFamily='IntegerIndice') to 
relieve memory pressure

I seem to remember reading about we should turn off swap which I have not 
gotten around to at this point :(…will that help in that the node is taken out 
of the cluster or?  As you can see from top, this node has been running a 
5,000,000 hours below(What's up with that….maybe the h does not actually mean 
hours) though I know it has been running for months.

Here is a top command….

top - 07:02:37 up 148 days, 21:11,  1 user,  load average: 1.10, 1.09, 1.09
Tasks: 379 total,   2 running, 377 sleeping,   0 stopped,   0 zombie
Cpu(s): 15.9%us,  1.0%sy,  2.9%ni, 79.9%id,  0.1%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  32854680k total, 32582688k used,   271992k free,       44k buffers
Swap: 33554424k total,    52180k used, 33502244k free, 22653528k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  701 cassandr  20   0 63.3g  10g 1.9g S 838.4 32.2  5125794h java

What can I dump to see why we need to "relieve memory pressure"?

I ran jmap like so but it can't attach for some reason

[cassandra@a4 ~]$ jmap 701 -heap:format=b
Attaching to core -heap:format=b from executable 701, please wait...
Error attaching to core file: Can't attach to the core file

Jstack is not helping much either ….

[cassandra@a4 ~]$ jstack -l 701 > threads.txt
701: Unable to open socket file: target process not responding or HotSpot VM 
not loaded
The -F option can be used when the target process is not responding
[cassandra@a4 ~]$ jstack -F -l 701 > threads.txt
Attaching to process ID 701, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 20.7-b02
java.lang.NullPointerException
at sun.jvm.hotspot.oops.InstanceKlass.computeSubtypeOf(InstanceKlass.java:426)
at sun.jvm.hotspot.oops.Klass.isSubtypeOf(Klass.java:137)
at sun.jvm.hotspot.oops.Oop.isA(Oop.java:100)
at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:93)
at sun.jvm.hotspot.runtime.DeadlockDetector.print(DeadlockDetector.java:39)
at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:52)
at sun.jvm.hotspot.tools.StackTrace.run(StackTrace.java:45)
at sun.jvm.hotspot.tools.JStack.run(JStack.java:60)
at sun.jvm.hotspot.tools.Tool.start(Tool.java:221)
at sun.jvm.hotspot.tools.JStack.main(JStack.java:86)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.tools.jstack.JStack.runJStackTool(JStack.java:118)
at sun.tools.jstack.JStack.main(JStack.java:84)

Thanks,
Dean

how to debug slowdowns from these log snippets(we know the keys being accessed as well)

Reply via email to