I am running a 6-node Cassandra cluster on RHEL 6, virtualized under ESXi 5.0. Each VM is configured with 20 GB of RAM and 12 cores. Our test setup performs about 3000 inserts per second. The Cassandra data partition is on an XFS filesystem mounted with the options (noatime,nodiratime,nobarrier,logbufs=8). We have no swap enabled on the VMs, and vm.swappiness is set to 0. To avoid any contention, the Cassandra VMs run no application other than Cassandra.
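A minimal sketch for confirming the mount options on a node, assuming the standard Linux /proc/mounts layout (device, mount point, filesystem type, options). The function names are illustrative, and the kernel may list options slightly differently from how they were passed to mount.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match: a later mount shadows an earlier one.
            options = fields[3].split(",")
    return options


def missing_options(mounts_text, mount_point, expected):
    """Return the expected options that are absent for mount_point."""
    found = mount_options(mounts_text, mount_point)
    if found is None:
        return list(expected)
    return [opt for opt in expected if opt not in found]
```

To use it, read /proc/mounts into a string and call missing_options with the data partition's mount point (a placeholder here, since the path is not given above) and the list noatime, nodiratime, nobarrier, logbufs=8. The swappiness value can be read separately from /proc/sys/vm/swappiness.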
The test runs fine for about 12 hours. After that, performance starts to degrade to about 1500 inserts per second, and by 18-20 hours it is down to 300 inserts per second. If I do a truncate, performance recovers and the test runs well for a few hours, though not as cleanly as after a reboot. We find a direct correlation between kswapd kicking in after 12 hours or so and the performance degradation. If I look at the cached memory, it is close to 10 GB. I am not getting an OOM error in Cassandra, so it looks like we are not running out of memory. Can someone explain how we can tune this so that kswapd doesn't kick in?

Our top output shows

top - 16:23:54 up 2 days, 23:17,  4 users,  load average: 2.21, 2.08, 2.02
Tasks: 213 total,   1 running, 212 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.6%us,  0.8%sy,  0.0%ni, 90.9%id,  6.3%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  20602812k total, 20320424k used,   282388k free,     1020k buffers
Swap:        0k total,        0k used,        0k free, 10145516k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2586 root      20   0 36.3g  17g 8.4g S 32.1 88.9   8496:37 java

java output

root 2453 1 99 Sep30 pts/0 9-13:51:38 java -ea -javaagent:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10059M -Xmx10059M -Xmn1200M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=10.19.104.14 -Djava.net.preferIPv4Stack=true -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -cp 
./apache-cassandra-0.8.6/bin/../conf:./apache-cassandra-0.8.6/bin/../build/classes/main:./apache-cassandra-0.8.6/bin/../build/classes/thrift:./apache-cassandra-0.8.6/bin/../lib/antlr-3.2.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-thrift-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-sources-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/commons-cli-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-codec-1.2.jar:./apache-cassandra-0.8.6/bin/../lib/commons-collections-3.2.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-lang-2.4.jar:./apache-cassandra-0.8.6/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/guava-r08.jar:./apache-cassandra-0.8.6/bin/../lib/high-scale-lib-1.1.2.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-core-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-mapper-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar:./apache-cassandra-0.8.6/bin/../lib/jline-0.9.94.jar:./apache-cassandra-0.8.6/bin/../lib/json-simple-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/libthrift-0.6.jar:./apache-cassandra-0.8.6/bin/../lib/log4j-1.2.16.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-examples.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-impl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-jmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-remote.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rimpl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rjmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-tools.jar:./apache-cassandra-0.8.6/bin/../lib/servlet-api-2.5-20081211.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-api-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-log4j12-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/snakeyaml-1.6.jar org.apache.cassandra.thrift.CassandraDaemon

Ring output

[root@CAP4-CNode4 apache-cassandra-0.8.6]# ./bin/nodetool -h 127.0.0.1 ring
Address       DC           Rack   Status  State   Load      Owns    Token
                                                                    141784319550391026443072753096570088105
10.19.104.11  datacenter1  rack1  Up      Normal  19.92 GB  16.67%  0
10.19.104.12  datacenter1  rack1  Up      Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
10.19.104.13  datacenter1  rack1  Up      Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
10.19.104.14  datacenter1  rack1  Up      Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
10.19.105.11  datacenter1  rack1  Up      Normal  19.88 GB  16.67%  113427455640312821154458202477256070484
10.19.105.12  datacenter1  rack1  Up      Normal  20 GB     16.67%  141784319550391026443072753096570088105
[root@CAP4-CNode4 apache-cassandra-0.8.6]#
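
One way to put numbers on the kswapd correlation is to sample the kernel's reclaim counters alongside the insert rate. The following is a minimal sketch, assuming /proc/vmstat exposes counters whose names begin with pgscan_kswapd (per-zone names such as pgscan_kswapd_normal on older kernels, a single pgscan_kswapd on newer ones). The function names are illustrative and not part of any tool.

```python
def kswapd_scanned(vmstat_text):
    """Sum every pgscan_kswapd* counter found in /proc/vmstat-style text."""
    total = 0
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line is "<name> <value>"
        if len(parts) == 2 and parts[0].startswith("pgscan_kswapd"):
            total += int(parts[1])
    return total


def scan_rate(before_text, after_text, interval_seconds):
    """Pages scanned by kswapd per second between two snapshots."""
    delta = kswapd_scanned(after_text) - kswapd_scanned(before_text)
    return delta / interval_seconds
```

Taking a snapshot of /proc/vmstat every minute or so and logging scan_rate next to the measured inserts per second would show whether the rate jumps at the point where throughput starts to fall.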