> be correlated is the flushing of memtables tables. One of the strangest
> stats I am getting when in this state is memory paging: 3727168.00 pages
> scanned/second (see sar -B output). Occasionally, if I leave the process
> alone (~1 h) it recovers (maybe 1 in 5 times), otherwise the only way to

Sounds to me like the Cassandra process is triggering something along the lines of fast-path page cache eviction. The fact that you see Cassandra at 100% system (as opposed to user) CPU, together with the huge number of pages scanned, certainly sounds like you're hitting an edge case or bug in the kernel's virtual memory system. The JVM can't really do much about it if it's stuck in a syscall that never returns...

There were a couple of threads on lkml recently that may be relevant, but I have to run so I can't dig up the URLs at the moment (will try later tonight).

Is anyone aware of a way to get a kernel stack trace for a given process on a running system?

Cargo cult solution: Upgrade the kernel :)

-- 
/ Peter Schuller
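
P.S. On the kernel stack trace question: one possibility, assuming your kernel exposes /proc/<pid>/task/<tid>/stack (it needs CONFIG_STACKTRACE and normally root, and I have not checked this on your setup), is to read those files for every thread of the stuck JVM. A rough sketch; the helper name `kernel_stacks` and the `proc_root` parameter are made up for illustration:

```python
from pathlib import Path


def kernel_stacks(pid, proc_root="/proc"):
    """Return {thread_id: kernel stack text} for every thread of `pid`.

    Assumes the kernel provides /proc/<pid>/task/<tid>/stack; threads whose
    stack file is missing or unreadable are silently skipped.
    """
    stacks = {}
    task_dir = Path(proc_root) / str(pid) / "task"
    for entry in sorted(task_dir.iterdir(), key=lambda p: int(p.name)):
        try:
            stacks[int(entry.name)] = (entry / "stack").read_text()
        except OSError:
            continue  # thread exited, file absent, or no permission
    return stacks
```

Run as root, looping over `kernel_stacks(<cassandra pid>).items()` and printing each entry should show where each thread sits inside the kernel. If most of them are in page reclaim code, that would fit the VM theory above.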