Thanks to everyone who responded (I think I learned a few new tricks
from seeing what you tried and how you monitor).  I didn't see any
patterns in JVM, OS, cassandra versions etc.

At this time I'm confident in saying CASSANDRA-2868 really is the culprit.

On 07/12/2011 09:28 AM, Chris Burroughs wrote:
> ### Preamble
> There have been several reports on the mailing list of the JVM running
> Cassandra using "too much" memory.  That is, the resident set size is >>
> (max java heap size + mmapped segments) and continues to grow until the
> process swaps, kernel oom killer comes along, or performance just
> degrades too far due to the lack of space for the page cache.  It has
> been unclear from these reports if there is a pattern.  My hope here is
> that by comparing JVM versions, OS versions, JVM configuration etc., we
> will find something.  Thank you everyone for your time.
> Some example reports:
>  -
>  -
>  -
>  -
>  -
> For reference theories include (in no particular order):
>  - memory fragmentation
>  - JVM bug
>  - OS/glibc bug
>  - direct memory
>  - swap induced fragmentation
>  - some other bad interaction of cassandra/jdk/jvm/os/nio-insanity.
> ### Survey
> 1. Do you think you are experiencing this problem?
> 2.  Why? (This is a good time to share a graph like
> or
> 2. Are you using mmap? (If yes be sure to have read
> , and explain how you have
> used pmap [or another tool] to rule out mmap and top deceiving you.)
> 3. Are you using JNA?  Was mlockall successful (it's in the logs on startup)?
> 4. Is swap enabled? Are you swapping?
> 5. What version of Apache Cassandra are you using?
> 6. What is the earliest version of Apache Cassandra you recall seeing
> this problem with?
> 7. Have you tried the patch from CASSANDRA-2654 ?
> 8. What jvm and version are you using?
> 9. What OS and version are you using?
> 10. What are your jvm flags?
> 11. Have you tried limiting direct memory (-XX:MaxDirectMemorySize)?
> 12. Can you characterise how much GC your cluster is doing?
> 13. Approximately how many read/writes per unit time is your cluster
> doing (per node or the whole cluster)?
> 14.  How are your column families configured (key cache size, row cache
> size, etc.)?
