On Mon, Oct 3, 2011 at 10:12 AM, Ramesh Natarajan <rames...@gmail.com> wrote:
> I am running a cassandra cluster of 6 nodes running RHEL6 virtualized by
> ESXi 5.0. Each VM is configured with 20GB of ram and 12 cores. Our test
> setup performs about 3000 inserts per second. The cassandra data partition
> is on a XFS filesystem mounted with options
> (noatime,nodiratime,nobarrier,logbufs=8). We have no swap enabled on the VMs
> and the vm.swappiness set to 0. To avoid any contention issues our cassandra
> VMs are not running any other application other than cassandra.
> The test runs fine for about 12 hours or so. After that the performance
> starts to degrade to about 1500 inserts per sec. By 18-20 hours the inserts
> go down to 300 per sec.
> if i do a truncate, it starts clean, runs for a few hours (not as clean as
> rebooting).
> We find a direct correlation between kswapd kicking in after 12 hours or so
> and the performance degradation. If i look at the cached memory it is
> close to 10G. I am not getting a OOM error in cassandra. So looks like we
> are not running out of memory. Can some one explain if we can optimize this
> so that kswapd doesn't kick in.
>
> Our top output shows
> top - 16:23:54 up 2 days, 23:17, 4 users, load average: 2.21, 2.08, 2.02
> Tasks: 213 total, 1 running, 212 sleeping, 0 stopped, 0 zombie
> Cpu(s): 1.6%us, 0.8%sy, 0.0%ni, 90.9%id, 6.3%wa, 0.0%hi, 0.2%si, 0.0%st
> Mem: 20602812k total, 20320424k used, 282388k free, 1020k buffers
> Swap: 0k total, 0k used, 0k free, 10145516k cached
>   PID USER  PR  NI  VIRT   RES   SHR  S  %CPU  %MEM    TIME+  COMMAND
>  2586 root  20   0  36.3g  17g   8.4g S  32.1  88.9  8496:37  java
>
> java output
> root 2453 1 99 Sep30 pts/0 9-13:51:38 java -ea
> -javaagent:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar
> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10059M -Xmx10059M
> -Xmn1200M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
> -Dcom.sun.management.jmxremote.port=7199
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Djava.rmi.server.hostname=10.19.104.14 -Djava.net.preferIPv4Stack=true
> -Dlog4j.configuration=log4j-server.properties
> -Dlog4j.defaultInitOverride=true -cp
> ./apache-cassandra-0.8.6/bin/../conf:./apache-cassandra-0.8.6/bin/../build/classes/main:./apache-cassandra-0.8.6/bin/../build/classes/thrift:./apache-cassandra-0.8.6/bin/../lib/antlr-3.2.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-thrift-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-sources-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/commons-cli-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-codec-1.2.jar:./apache-cassandra-0.8.6/bin/../lib/commons-collections-3.2.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-lang-2.4.jar:./apache-cassandra-0.8.6/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/guava-r08.jar:./apache-cassandra-0.8.6/bin/../lib/high-scale-lib-1.1.2.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-core-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-mapper-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar:./apache-cassandra-0.8.6/bin/../lib/jline-0.9.94.jar:./apache-cassandra-0.8.6/bin/../lib/json-simple-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/libthrift-0.6.jar:./apache-cassandra-0.8.6/bin/../lib/log4j-1.2.16.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-examples.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-impl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-jmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-remote.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rimpl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rjmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-tools.jar:./apache-cassandra-0.8.6/bin/../lib/servlet-api-2.5-20081211.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-api-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-log4j12-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/snakeyaml-1.6.jar
> org.apache.cassandra.thrift.CassandraDaemon
>
>
> Ring output
> [root@CAP4-CNode4 apache-cassandra-0.8.6]# ./bin/nodetool -h 127.0.0.1 ring
> Address       DC           Rack   Status  State   Load      Owns    Token
>                                                                     141784319550391026443072753096570088105
> 10.19.104.11  datacenter1  rack1  Up      Normal  19.92 GB  16.67%  0
> 10.19.104.12  datacenter1  rack1  Up      Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
> 10.19.104.13  datacenter1  rack1  Up      Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
> 10.19.104.14  datacenter1  rack1  Up      Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
> 10.19.105.11  datacenter1  rack1  Up      Normal  19.88 GB  16.67%  113427455640312821154458202477256070484
> 10.19.105.12  datacenter1  rack1  Up      Normal  20 GB     16.67%  141784319550391026443072753096570088105
> [root@CAP4-CNode4 apache-cassandra-0.8.6]#
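
As a side check on the ring output quoted above, the listed tokens can be compared against an evenly spaced assignment. This is only a sketch: it assumes the cluster uses RandomPartitioner (token space 0 to 2**127), which the post does not state, and the variable names are made up for illustration.

```python
# Sketch: compare the tokens from the quoted `nodetool ring` output with
# evenly spaced tokens for 6 nodes.
# Assumption (not stated in the post): RandomPartitioner, token space 0..2**127.
NODES = 6
RING_SIZE = 2 ** 127

# Spacing between neighbouring tokens, rounded down to an integer.
step = RING_SIZE // NODES
expected = [i * step for i in range(NODES)]

# Tokens copied from the ring output, in node order.
listed = [
    0,
    28356863910078205288614550619314017621,
    56713727820156410577229101238628035242,
    85070591730234615865843651857942052863,
    113427455640312821154458202477256070484,
    141784319550391026443072753096570088105,
]

balanced = expected == listed
print("tokens evenly spaced:", balanced)
```

If the comparison holds, uneven token ownership is an unlikely explanation for the slowdown, which fits the 16.67% shown in the Owns column for every node.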
How many CFs do you have? Can you describe each CF and post its configuration? Do you have the row cache enabled?
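
While collecting that, it may also help to put numbers on the kswapd correlation you describe. Below is a rough sketch that sums the kswapd page-scan counters from /proc/vmstat at a fixed interval so they can be lined up against the insert rate. The function names and the 60-second interval are my own choices, and the exact counter names (anything starting with `pgscan_kswapd`) vary by kernel version, so please check them on your RHEL6 guests.

```python
import time


def kswapd_pages_scanned(vmstat_text):
    """Sum every counter whose name starts with pgscan_kswapd.

    vmstat_text is the full text of /proc/vmstat ("name value" per line).
    Counter names differ between kernels, hence the prefix match.
    """
    total = 0
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("pgscan_kswapd"):
            total += int(parts[1])
    return total


def sample(interval_s=60, path="/proc/vmstat"):
    """Print the pages scanned by kswapd in each interval until interrupted."""
    with open(path) as f:
        prev = kswapd_pages_scanned(f.read())
    while True:
        time.sleep(interval_s)
        with open(path) as f:
            cur = kswapd_pages_scanned(f.read())
        # Timestamp plus the delta since the previous sample.
        print(time.strftime("%H:%M:%S"), cur - prev)
        prev = cur
```

Calling `sample()` on one of the nodes for the length of a test run should show whether the scan rate climbs around the 12-hour mark, when the inserts start to drop.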