> How much swap space do you have configured?
2 GB and 24 GB of system memory.
Dan
From: Chris Goffinet [mailto:c...@chrisgoffinet.com]
Sent: December-20-10 17:32
To: user@cassandra.apache.org
Subject: Re: Severe Reliability Problems - 0.7 RC2
What kernel version are you running? On I/O-intensive nodes running kernels
2.6.18 to 2.6.24, I have seen a kernel bug where the JVM locks up and spins
at 100%.
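
A quick way to check whether a node falls in the kernel range mentioned above is sketched below. This is only an illustration: the 2.6.18 to 2.6.24 bounds come from the anecdotal report in this message, not from a confirmed bug report, and the helper names (parse_kernel_release, in_reported_range) are made up for this example.

```python
import platform
import re

# Bounds taken from the message above; an anecdotal range, not a confirmed bug window.
REPORTED_MIN = (2, 6, 18)
REPORTED_MAX = (2, 6, 24)


def parse_kernel_release(release):
    """Extract the leading major.minor[.patch] numbers from a release string
    such as '2.6.18-194.el5'. A missing patch level is treated as 0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def in_reported_range(release):
    """Return True if the release falls within the reported range (inclusive)."""
    return REPORTED_MIN <= parse_kernel_release(release) <= REPORTED_MAX


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` prints
    print(release, "in reported range:", in_reported_range(release))
```

Distribution kernels often carry backported fixes, so a match here is only a hint that the kernel is worth a closer look.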
On Mon, Dec 20, 2010 at 1:14 PM, Brandon Williams wrote:
On Mon, Dec 20, 2010 at 2:13 PM, Dan Hendry wrote:
> Yes, I have tried that (although only twice). Same impact as a regular
> kill: nothing happens and I get no stacktrace output. It is however on my
> list of things to try again the next time a node dies. I am also not able to
> attach jstack to
> There were a couple of threads on lkml recently that may be relevant,
> but I have to run so I can't find the URLs atm (todo later tonight).
Ok, I cannot figure out how to find the "first" message in a thread in
any of the lkml archives, but these two threads may be of interest,
especially if y
Subject: RE: Severe Reliability Problems - 0.7 RC2
Yes, I have tried that (although only twice). Same impact as a regular kill:
nothing happens and I get no stacktrace output. It is however on my list of
things to try agai
not help) and I have now changed
disk_access_mode from auto to mmap_index_only on two of the nodes.
Dan
From: Kani [mailto:javier.canil...@gmail.com]
Sent: December-20-10 14:14
To: user@cassandra.apache.org
Subject: Re: Severe Reliability Problems - 0.7 RC2
> be correlated is the flushing of memtables. One of the strangest
> stats I am getting when in this state is memory paging: 3727168.00 pages
> scanned/second (see sar -B output). Occasionally, if I leave the process
> alone (~1 h) it recovers (maybe 1 in 5 times), otherwise the only way to
Have you tried sending kill -3 (SIGQUIT) to the Cassandra process before you
send kill -9? That way you will see what the threads are doing (and where they
may be blocking). What the majority of the threads are doing may point you to
the right place to look for the problem.
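
The sequence suggested above could be scripted roughly as follows. This is a minimal sketch, assuming the JVM is still responsive enough to handle SIGQUIT (other reports in this thread suggest it may not be). The function name dump_then_kill, the wait_seconds default, and the injectable send parameter are all hypothetical and exist only for this illustration.

```python
import os
import signal
import time


def dump_then_kill(pid, wait_seconds=5.0, send=os.kill):
    """Send SIGQUIT (kill -3), wait, then send SIGKILL (kill -9).

    A HotSpot JVM that can still handle signals writes a thread dump to its
    stdout on SIGQUIT. `send` defaults to os.kill and is a parameter only so
    the sequence can be exercised without signalling a real process
    (an assumption made for this sketch).
    Returns the list of signals that were actually sent.
    """
    sent = []
    send(pid, signal.SIGQUIT)  # ask for the thread dump first
    sent.append(signal.SIGQUIT)
    time.sleep(wait_seconds)  # give the JVM time to write the dump
    try:
        send(pid, signal.SIGKILL)  # then force the process to exit
        sent.append(signal.SIGKILL)
    except ProcessLookupError:
        # The process was already gone before the SIGKILL was sent.
        pass
    return sent
```

The thread dump goes to wherever the Cassandra process's stdout is redirected, so that is the file to check between the two signals.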
I'm not much of a Linux administrator, but when some