A couple more details. I confirmed that swap space is not being used (free -m shows 0 swap), and cassandra.log has a message like "JNA mlockall successful". top shows the process with 9g of resident memory but 21.6g of virtual memory. What accounts for the much larger virtual number? Some kind of off-heap memory?
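For anyone who wants to compare the two numbers outside of top, here is a minimal sketch, assuming a Linux host where /proc/<pid>/status exposes VmSize (virtual) and VmRSS (resident) in kB. The function names are made up for illustration, and this only reports the totals; it does not say what the virtual mappings consist of.

```python
import re


def parse_vm_status(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    sizes = {}
    for line in status_text.splitlines():
        # Lines look like "VmRSS:\t 9437184 kB" on Linux.
        match = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            sizes[match.group(1)] = int(match.group(2))
    return sizes


def read_vm_status(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as status_file:
        return parse_vm_status(status_file.read())
```

Calling read_vm_status with the Cassandra process ID would return a dict holding both values, which can then be compared against what top displays.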
I'm a little puzzled as to why I would get such long pauses without swapping. I uncommented all the GC logging options in cassandra-env.sh to try to see what is going on when the node freezes.

Thanks
Kireet

On Mon, Jul 2, 2012 at 9:51 PM, feedly team <feedly...@gmail.com> wrote:

> Yeah I noticed the leap second problem and ran the suggested fix, but I
> have been facing these problems since before Saturday and still see the
> occasional failures after running the fix.
>
> Thanks.
>
> On Mon, Jul 2, 2012 at 11:17 AM, Marcus Both <mb...@terra.com.br> wrote:
>
>> Yeah! Look at that.
>> http://arstechnica.com/business/2012/07/one-day-later-the-leap-second-v-the-internet-scorecard/
>> I had the same problem. The solution was rebooting.
>>
>> On Mon, 2 Jul 2012 11:08:57 -0400
>> feedly team <feedly...@gmail.com> wrote:
>>
>> > Hello,
>> > I recently set up a 2 node cassandra cluster on dedicated hardware. In
>> > the logs there have been a lot of "InetAddress xxx is now dead" or UP
>> > messages. Comparing the log messages between the 2 nodes, they seem to
>> > coincide with extremely long ParNew collections. I have seen some of up
>> > to 50 seconds. The installation is pretty vanilla, I didn't change any
>> > settings and the machines don't seem particularly busy - cassandra is
>> > the only thing running on the machine with an 8GB heap. The machine has
>> > 64GB of RAM and CPU/IO usage looks pretty light. I do see a lot of
>> > 'Heap is xxx full. You may need to reduce memtable and/or cache sizes'
>> > messages. Would this help with the long ParNew collections? That
>> > message seems to be triggered on a full collection.
>>
>> --
>> Marcus Both