I opened #1214 about this. I hope people will take a look and provide their feedback.
https://issues.apache.org/jira/browse/CASSANDRA-1214

Thanks.

On Sun, Jun 20, 2010 at 3:58 PM, James Golick <jamesgol...@gmail.com> wrote:

> uh. wow. I just read up on all this again, and read the code, and I'm a
> little surprised, to be honest.
>
> There's no attempt to manage the total size of the mmap()'d IO, and the
> default buffer allocation is quite sizeable. So, basically, if you have any
> data, over time, you will run out of memory, and there's no way at all to
> control it.
>
> Can we consider changing the default?
>
>
> On Sun, Jun 20, 2010 at 3:37 PM, James Golick <jamesgol...@gmail.com> wrote:
>
>> Thanks for your thoughts. Answers below:
>>
>> On Sun, Jun 20, 2010 at 2:21 PM, Peter Schuller <
>> peter.schul...@infidyne.com> wrote:
>>
>>> > The memory problems I've posted about before have gotten much worse and
>>> our
>>> > nodes are becoming incredibly slow/unusable every 24 hours or so.
>>> Basically,
>>> > the JVM reports that only 14GB is committed, but the RSS of the process
>>> is
>>> > 22GB, and cassandra is completely unresponsive, but still having
>>> requests
>>> > routed to it internally, so it completely destroys performance.
>>> > I'm at a loss for how to diagnose this issue.
>>>
>>> Sorry, I don't know the history of this (you mentioned you've alluded
>>> to the problems before), so maybe I am being redundant or missing
>>> something, but:
>>>
>>> (1) Is the machine swapping? (Actively swapping in/out as reported by
>>> e.g. vmstat)
>>>
>>
>> Yes, somewhat, although swappiness is set to 0.
>>
>>
>>> (2) Do the logs indicate that GC is running excessively, thus
>>> indicating an almost-out-of-heap condition?
>>>
>>
>> It runs, but I wouldn't say excessively.
>>
>>
>>> (3) mmap():ed memory that is currently resident will count towards
>>> RSS; if you're using mmap():ed I/O (the default), that is to be
>>> expected.
>>>
>>
>> This is where I'm a little confused. I thought that mmap()'d IO didn't
>> actually allocate memory. I thought it was just IO through a faster code
>> path.
>>
>>
>>> (4) If you are using mmap():ed I/O, that is also in and of itself
>>> something which can cause trouble if the operating system decides to
>>> swap your application out in favor of the mmap()
>>
>> (5) If you are swapping (see (1)), try switching from mmap():ed to
>>> standard I/O (due to (4)), and/or try decreasing the swappyness if
>>> you're on Linux (see /proc/sys/vm/swappiness).
>>>
>>
>> I tried switching to standard IO mode, but it was very, very slow. What
>> I'm confused about here is that if mmap()'d IO actually allocates memory
>> that can put pressure on other processes' memory, is there no way to bound
>> that? If not, how can anybody safely use mmap()'d IO on the JVM without
>> risking pushing their process's important pages out of memory.
>>
>> swappiness is already at 0.
>>
>>
>>> (6) Is Cassandra CPU bound or disk bound in general, regardless of
>>> swapping?
>>>
>>
>> Hard to tell because of the paging.
>>
>>
>>>
>>> --
>>> / Peter Schuller
>>>
>>
>>
>
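
For anyone who wants to check point (3) for themselves, here is a minimal sketch, not taken from Cassandra. It maps a scratch file and reads one byte per page, sampling the process's RSS before mapping, after mapping, and after touching the pages. It assumes Linux, where `VmRSS` is exposed in /proc/self/status; elsewhere the RSS samples come back as None. The names `rss_kib` and `touch_mapped_file` are made up for illustration.

```python
import mmap
import os
import tempfile


def rss_kib():
    """Return this process's resident set size in KiB, or None if /proc is unavailable."""
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def touch_mapped_file(size_bytes=16 * 1024 * 1024):
    """Map a scratch file read-only, read one byte per page, and sample RSS along the way."""
    fd, path = tempfile.mkstemp()
    try:
        # Write real data so the file is not sparse.
        os.write(fd, b"\0" * size_bytes)
        before = rss_kib()
        with mmap.mmap(fd, size_bytes, access=mmap.ACCESS_READ) as m:
            mapped = rss_kib()  # mapping alone does not fault any pages in
            pages = 0
            for off in range(0, size_bytes, mmap.PAGESIZE):
                _ = m[off]  # reading a byte faults the containing page in
                pages += 1
            touched = rss_kib()
        return {"before": before, "mapped": mapped, "touched": touched, "pages": pages}
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(touch_mapped_file())
```

Comparing the "mapped" and "touched" samples shows how much of the mapping became resident. None of it is JVM heap, which would be consistent with a committed heap of 14GB sitting next to a larger RSS.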