In order to understand what's going on, you might want to first run just the write test and look at the results, then run just the read test, and then run the combined read/write test.
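One way to make those separate runs comparable is to snapshot `nodetool cfstats` before and after each phase and diff the counters, so each workload is judged on what it did rather than on cumulative totals. A minimal sketch: the sample text below only mimics the shape of a few cfstats lines, it is not real output from this cluster, and the numbers are made up for illustration.

```python
import re

# Abbreviated, made-up snapshots in the shape of "nodetool cfstats" lines.
SNAPSHOT_BEFORE = """\
Read Count: 1000
Read Latency: 0.8 ms.
Write Count: 50000
Write Latency: 0.2 ms.
"""

SNAPSHOT_AFTER = """\
Read Count: 1500
Read Latency: 2.4 ms.
Write Count: 170000
Write Latency: 0.3 ms.
"""

def parse(text):
    # Capture "Read/Write Count/Latency: <number>" pairs into a dict.
    pat = r"(Read|Write) (Count|Latency): ([0-9.]+)"
    return {f"{a} {b}": float(c) for a, b, c in re.findall(pat, text)}

def diff(before, after):
    # Per-counter change over the phase bounded by the two snapshots.
    b, a = parse(before), parse(after)
    return {k: a[k] - b[k] for k in b}

delta = diff(SNAPSHOT_BEFORE, SNAPSHOT_AFTER)
print(delta["Write Count"])   # writes performed during the phase
print(delta["Read Latency"])  # latency drift across the phase
```

Taking the same snapshots around the write-only, read-only, and mixed phases makes it obvious which workload (or combination) triggers the degradation.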
Since you mentioned a high rate of updates/deletes, I should also ask: what consistency level (CL) are you using for writes and reads? With many updates/deletes and a high CL, I think one should expect reads to slow down while SSTables have not yet been compacted. You have 20G of RAM, 17G of which is used by your process, and I also see 36G VIRT, which I don't really understand given that swap is disabled. Look at sar -r output too to make sure no swapping is occurring. Also, verify that jna.jar is installed.

On Mon, Oct 3, 2011 at 11:52 AM, Ramesh Natarajan <rames...@gmail.com> wrote:
> I will start another test run to collect these stats. Our test model is in
> the neighborhood of 4500 inserts, 8000 updates/deletes and 1500 reads every
> second across 6 servers.
> Can you elaborate more on reducing the heap space? Do you think it is a
> problem with 17G RSS?
>
> Thanks,
> Ramesh
>
> On Mon, Oct 3, 2011 at 1:33 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>> I am wondering if you are seeing issues because of more frequent
>> compactions kicking in. Is this primarily write ops, or reads too?
>> During the test, gather data like:
>>
>> 1. cfstats
>> 2. tpstats
>> 3. compactionstats
>> 4. netstats
>> 5. iostat
>>
>> You have RSS memory close to 17gb. Maybe someone can give further
>> advice on whether that could be because of mmap. You might want to lower
>> your heap size to 6-8G and see if that helps.
>>
>> Also, check that you have jna.jar deployed and that you see the malloc
>> successful message in the logs.
>>
>> On Mon, Oct 3, 2011 at 10:36 AM, Ramesh Natarajan <rames...@gmail.com> wrote:
>>> We have 5 CFs. Attached is the output from the describe command. We
>>> don't have row cache enabled.
>>> Thanks,
>>> Ramesh
>>>
>>> Keyspace: MSA:
>>>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>>>   Durable Writes: true
>>>     Options: [replication_factor:3]
>>>   Column Families:
>>>     ColumnFamily: admin
>>>       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>>       Row cache size / save period in seconds: 0.0/0
>>>       Key cache size / save period in seconds: 200000.0/14400
>>>       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>>       GC grace seconds: 3600
>>>       Compaction min/max thresholds: 4/32
>>>       Read repair chance: 1.0
>>>       Replicate on write: true
>>>       Built indexes: []
>>>     ColumnFamily: modseq
>>>       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>>       Row cache size / save period in seconds: 0.0/0
>>>       Key cache size / save period in seconds: 500000.0/14400
>>>       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>>       GC grace seconds: 3600
>>>       Compaction min/max thresholds: 4/32
>>>       Read repair chance: 1.0
>>>       Replicate on write: true
>>>       Built indexes: []
>>>     ColumnFamily: msgid
>>>       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>>       Row cache size / save period in seconds: 0.0/0
>>>       Key cache size / save period in seconds: 500000.0/14400
>>>       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>>       GC grace seconds: 864000
>>>       Compaction min/max thresholds: 4/32
>>>       Read repair chance: 1.0
>>>       Replicate on write: true
>>>       Built indexes: []
>>>     ColumnFamily: participants
>>>       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>>       Row cache size / save period in seconds: 0.0/0
>>>       Key cache size / save period in seconds: 500000.0/14400
>>>       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>>       GC grace seconds: 3600
>>>       Compaction min/max thresholds: 4/32
>>>       Read repair chance: 1.0
>>>       Replicate on write: true
>>>       Built indexes: []
>>>     ColumnFamily: uid
>>>       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>>       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>>       Row cache size / save period in seconds: 0.0/0
>>>       Key cache size / save period in seconds: 2000000.0/14400
>>>       Memtable thresholds: 0.4/1440/121 (millions of ops/minutes/MB)
>>>       GC grace seconds: 3600
>>>       Compaction min/max thresholds: 4/32
>>>       Read repair chance: 1.0
>>>       Replicate on write: true
>>>       Built indexes: []
>>>
>>> On Mon, Oct 3, 2011 at 12:26 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>>>>
>>>> On Mon, Oct 3, 2011 at 10:12 AM, Ramesh Natarajan <rames...@gmail.com> wrote:
>>>>> I am running a Cassandra cluster of 6 nodes running RHEL6, virtualized
>>>>> by ESXi 5.0. Each VM is configured with 20GB of RAM and 12 cores. Our
>>>>> test setup performs about 3000 inserts per second. The Cassandra data
>>>>> partition is on an XFS filesystem mounted with options
>>>>> (noatime,nodiratime,nobarrier,logbufs=8). We have no swap enabled on
>>>>> the VMs and vm.swappiness set to 0. To avoid any contention issues,
>>>>> our Cassandra VMs are not running any application other than Cassandra.
>>>>> The test runs fine for about 12 hours or so.
>>>>> After that, the performance starts to degrade to about 1500 inserts
>>>>> per sec. By 18-20 hours the inserts go down to 300 per sec.
>>>>> If I do a truncate, it starts clean and runs for a few hours (though
>>>>> not as clean as rebooting).
>>>>> We find a direct correlation between kswapd kicking in after 12 hours
>>>>> or so and the performance degradation. If I look at the cached memory,
>>>>> it is close to 10G. I am not getting an OOM error in Cassandra, so it
>>>>> looks like we are not running out of memory. Can someone explain how
>>>>> we can optimize this so that kswapd doesn't kick in?
>>>>>
>>>>> Our top output shows:
>>>>> top - 16:23:54 up 2 days, 23:17, 4 users, load average: 2.21, 2.08, 2.02
>>>>> Tasks: 213 total, 1 running, 212 sleeping, 0 stopped, 0 zombie
>>>>> Cpu(s): 1.6%us, 0.8%sy, 0.0%ni, 90.9%id, 6.3%wa, 0.0%hi, 0.2%si, 0.0%st
>>>>> Mem: 20602812k total, 20320424k used, 282388k free, 1020k buffers
>>>>> Swap: 0k total, 0k used, 0k free, 10145516k cached
>>>>>
>>>>>  PID USER PR NI  VIRT RES SHR  S %CPU %MEM   TIME+ COMMAND
>>>>> 2586 root 20  0 36.3g 17g 8.4g S 32.1 88.9 8496:37 java
>>>>>
>>>>> ps output for the java process:
>>>>> root 2453 1 99 Sep30 pts/0 9-13:51:38 java -ea
>>>>> -javaagent:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar
>>>>> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10059M -Xmx10059M
>>>>> -Xmn1200M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC
>>>>> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
>>>>> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
>>>>> -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
>>>>> -Dcom.sun.management.jmxremote.port=7199
>>>>> -Dcom.sun.management.jmxremote.ssl=false
>>>>> -Dcom.sun.management.jmxremote.authenticate=false
>>>>> -Djava.rmi.server.hostname=10.19.104.14 -Djava.net.preferIPv4Stack=true
>>>>> -Dlog4j.configuration=log4j-server.properties
>>>>> -Dlog4j.defaultInitOverride=true -cp
>>>>> ./apache-cassandra-0.8.6/bin/../conf:./apache-cassandra-0.8.6/bin/../build/classes/main:./apache-cassandra-0.8.6/bin/../build/classes/thrift:./apache-cassandra-0.8.6/bin/../lib/antlr-3.2.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-thrift-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-sources-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/commons-cli-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-codec-1.2.jar:./apache-cassandra-0.8.6/bin/../lib/commons-collections-3.2.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-lang-2.4.jar:./apache-cassandra-0.8.6/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/guava-r08.jar:./apache-cassandra-0.8.6/bin/../lib/high-scale-lib-1.1.2.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-core-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-mapper-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar:./apache-cassandra-0.8.6/bin/../lib/jline-0.9.94.jar:./apache-cassandra-0.8.6/bin/../lib/json-simple-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/libthrift-0.6.jar:./apache-cassandra-0.8.6/bin/../lib/log4j-1.2.16.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-examples.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-impl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-jmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-remote.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rimpl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rjmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-tools.jar:./apache-cassandra-0.8.6/bin/../lib/servlet-api-2.5-20081211.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-api-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-log4j12-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/snakeyaml-1.6.jar
>>>>> org.apache.cassandra.thrift.CassandraDaemon
>>>>>
>>>>> Ring output:
>>>>> [root@CAP4-CNode4 apache-cassandra-0.8.6]# ./bin/nodetool -h 127.0.0.1 ring
>>>>> Address       DC          Rack   Status State   Load      Owns    Token
>>>>>                                                                   141784319550391026443072753096570088105
>>>>> 10.19.104.11  datacenter1 rack1  Up     Normal  19.92 GB  16.67%  0
>>>>> 10.19.104.12  datacenter1 rack1  Up     Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
>>>>> 10.19.104.13  datacenter1 rack1  Up     Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
>>>>> 10.19.104.14  datacenter1 rack1  Up     Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
>>>>> 10.19.105.11  datacenter1 rack1  Up     Normal  19.88 GB  16.67%  113427455640312821154458202477256070484
>>>>> 10.19.105.12  datacenter1 rack1  Up     Normal  20 GB     16.67%  141784319550391026443072753096570088105
>>>>> [root@CAP4-CNode4 apache-cassandra-0.8.6]#
>>>>
>>>> How many CFs? Can you describe the CFs and post the configuration? Do
>>>> you have row cache enabled?
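For what it's worth, the figures quoted in this thread come close to accounting for the whole VM by themselves, which would explain kswapd waking up even with swap disabled: a 10G JVM heap plus ~10G of page cache (likely mmap'd SSTables plus filesystem cache) leaves almost nothing free on a 20GB box. A back-of-the-envelope check, using only numbers from the top output and java command line above:

```python
# Rough memory accounting from the top/ps output quoted in this thread
# (values in kB unless noted).
total_kb  = 20602812    # "Mem: ... total" on the 20GB VM
cached_kb = 10145516    # "cached" in top: page cache (mmap'd SSTables etc.)
heap_mb   = 10059       # -Xms10059M / -Xmx10059M from the java command line

heap_kb = heap_mb * 1024
accounted = heap_kb + cached_kb

# Heap plus page cache already cover roughly 99% of physical RAM, so the
# kernel has to start reclaiming cache (kswapd) as soon as anything grows.
ratio = accounted / total_kb
print(f"heap + page cache = {accounted} kB ({ratio:.0%} of RAM)")
```

This is consistent with the suggestion earlier in the thread to drop the heap to 6-8G: that would leave several GB of headroom for the page cache before reclaim pressure starts.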