I will start another test run to collect these stats. Our test workload is in the neighborhood of 4500 inserts, 8000 updates & deletes, and 1500 reads per second across 6 servers. Can you elaborate on reducing the heap space? Do you think the 17 GB RSS is a problem?
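The collection itself could be scripted so every snapshot of the five data sources from the quoted advice (cfstats, tpstats, compactionstats, netstats, iostat) is kept for later diffing. A sketch only: the local host address, output directory name, and one-minute interval are illustrative assumptions, not part of the original setup.

```shell
#!/bin/sh
# Sketch: snapshot the five data sources from the quoted advice into
# timestamped files. Assumes nodetool and iostat are on PATH and that
# the node being sampled is local (127.0.0.1).
OUTDIR="stats-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$OUTDIR"

collect_once() {
    ts=$(date +%H%M%S)
    for cmd in cfstats tpstats compactionstats netstats; do
        nodetool -h 127.0.0.1 "$cmd" > "$OUTDIR/$cmd-$ts.txt" 2>&1
    done
    # five one-second extended-device samples per snapshot
    iostat -x 1 5 > "$OUTDIR/iostat-$ts.txt" 2>&1
}

# During the test run, e.g.: while true; do collect_once; sleep 60; done
```

Diffing successive cfstats snapshots would then show which column family's pending compactions or SSTable counts grow as the insert rate drops.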
thanks
Ramesh

On Mon, Oct 3, 2011 at 1:33 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> I am wondering if you are seeing issues because of more frequent
> compactions kicking in. Is this primarily write ops, or reads too?
> During the period of the test, gather data like:
>
> 1. cfstats
> 2. tpstats
> 3. compactionstats
> 4. netstats
> 5. iostat
>
> You have RSS memory close to 17 GB. Maybe someone can give further
> advice on whether that could be because of mmap. You might want to
> lower your heap size to 6-8 GB and see if that helps.
>
> Also, check that you have jna.jar deployed and that you see the "JNA
> mlockall successful" message in the logs.
>
> On Mon, Oct 3, 2011 at 10:36 AM, Ramesh Natarajan <rames...@gmail.com> wrote:
> > We have 5 CFs. Attached is the output from the describe command. We
> > don't have row cache enabled.
> >
> > Thanks
> > Ramesh
> >
> > Keyspace: MSA:
> >   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
> >   Durable Writes: true
> >     Options: [replication_factor:3]
> >   Column Families:
> >     ColumnFamily: admin
> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
> >       Row cache size / save period in seconds: 0.0/0
> >       Key cache size / save period in seconds: 200000.0/14400
> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
> >       GC grace seconds: 3600
> >       Compaction min/max thresholds: 4/32
> >       Read repair chance: 1.0
> >       Replicate on write: true
> >       Built indexes: []
> >     ColumnFamily: modseq
> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
> >       Row cache size / save period in seconds: 0.0/0
> >       Key cache size / save period in seconds: 500000.0/14400
> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
> >       GC grace seconds: 3600
> >       Compaction min/max thresholds: 4/32
> >       Read repair chance: 1.0
> >       Replicate on write: true
> >       Built indexes: []
> >     ColumnFamily: msgid
> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
> >       Row cache size / save period in seconds: 0.0/0
> >       Key cache size / save period in seconds: 500000.0/14400
> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
> >       GC grace seconds: 864000
> >       Compaction min/max thresholds: 4/32
> >       Read repair chance: 1.0
> >       Replicate on write: true
> >       Built indexes: []
> >     ColumnFamily: participants
> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
> >       Row cache size / save period in seconds: 0.0/0
> >       Key cache size / save period in seconds: 500000.0/14400
> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
> >       GC grace seconds: 3600
> >       Compaction min/max thresholds: 4/32
> >       Read repair chance: 1.0
> >       Replicate on write: true
> >       Built indexes: []
> >     ColumnFamily: uid
> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
> >       Row cache size / save period in seconds: 0.0/0
> >       Key cache size / save period in seconds: 2000000.0/14400
> >       Memtable thresholds: 0.4/1440/121 (millions of ops/minutes/MB)
> >       GC grace seconds: 3600
> >       Compaction min/max thresholds: 4/32
> >       Read repair chance: 1.0
> >       Replicate on write: true
> >       Built indexes: []
> >
> > On Mon, Oct 3, 2011 at 12:26 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> >> On Mon, Oct 3, 2011 at 10:12 AM, Ramesh Natarajan <rames...@gmail.com> wrote:
> >> > I am running a Cassandra cluster of 6 nodes running RHEL6,
> >> > virtualized by ESXi 5.0. Each VM is configured with 20 GB of RAM
> >> > and 12 cores. Our test setup performs about 3000 inserts per
> >> > second. The Cassandra data partition is on an XFS filesystem
> >> > mounted with the options noatime,nodiratime,nobarrier,logbufs=8.
> >> > We have no swap enabled on the VMs and vm.swappiness set to 0. To
> >> > avoid any contention issues, our Cassandra VMs are not running any
> >> > application other than Cassandra.
> >> >
> >> > The test runs fine for about 12 hours or so. After that the
> >> > performance starts to degrade to about 1500 inserts per second. By
> >> > 18-20 hours the inserts go down to 300 per second. If I do a
> >> > truncate, it starts clean and runs for a few hours (though not as
> >> > clean as rebooting).
> >> >
> >> > We find a direct correlation between kswapd kicking in after 12
> >> > hours or so and the performance degradation. If I look at the
> >> > cached memory, it is close to 10 GB. I am not getting an OOM error
> >> > in Cassandra, so it looks like we are not running out of memory.
> >> > Can someone explain whether we can optimize this so that kswapd
> >> > doesn't kick in?
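On the kswapd question, it may help to separate how much of the java process's RSS is anonymous JVM heap versus file-backed pages from mmap'd SSTables; the latter inflate RSS but are reclaimable page cache. A rough Linux-only sketch (the pgrep pattern and the reliance on a stock RHEL6 /proc layout are assumptions):

```shell
#!/bin/sh
# Sketch: break down the Cassandra java process's resident memory.
PID=$(pgrep -f CassandraDaemon | head -n1)
cat /proc/sys/vm/swappiness            # expect 0, per the setup described
grep VmRSS "/proc/$PID/status"         # total resident set size
# Sum resident pages across all mappings; the file-backed ones are mostly
# mmap'd SSTables and can be dropped by the kernel, unlike heap pages.
awk '$1 == "Rss:" { sum += $2 } END { print sum, "kB resident (smaps)" }' \
    "/proc/$PID/smaps"
free -m                                # "cached" column vs truly free memory
```

If the file-backed share dominates, one experiment is setting disk_access_mode to standard (or mmap_index_only) in cassandra.yaml to see whether mmap pressure is what wakes kswapd.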
> >> >
> >> > Our top output shows:
> >> >
> >> > top - 16:23:54 up 2 days, 23:17,  4 users,  load average: 2.21, 2.08, 2.02
> >> > Tasks: 213 total,   1 running, 212 sleeping,   0 stopped,   0 zombie
> >> > Cpu(s):  1.6%us,  0.8%sy,  0.0%ni, 90.9%id,  6.3%wa,  0.0%hi,  0.2%si,  0.0%st
> >> > Mem:  20602812k total, 20320424k used,   282388k free,     1020k buffers
> >> > Swap:        0k total,        0k used,        0k free, 10145516k cached
> >> >
> >> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >> >  2586 root  20   0 36.3g  17g 8.4g S 32.1 88.9  8496:37  java
> >> >
> >> > java process output:
> >> >
> >> > root 2453 1 99 Sep30 pts/0 9-13:51:38 java -ea
> >> > -javaagent:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar
> >> > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10059M
> >> > -Xmx10059M -Xmn1200M -XX:+HeapDumpOnOutOfMemoryError -Xss128k
> >> > -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> >> > -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
> >> > -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
> >> > -Djava.net.preferIPv4Stack=true
> >> > -Dcom.sun.management.jmxremote.port=7199
> >> > -Dcom.sun.management.jmxremote.ssl=false
> >> > -Dcom.sun.management.jmxremote.authenticate=false
> >> > -Djava.rmi.server.hostname=10.19.104.14 -Djava.net.preferIPv4Stack=true
> >> > -Dlog4j.configuration=log4j-server.properties
> >> > -Dlog4j.defaultInitOverride=true -cp
> >> > ./apache-cassandra-0.8.6/bin/../conf:./apache-cassandra-0.8.6/bin/../build/classes/main:./apache-cassandra-0.8.6/bin/../build/classes/thrift:./apache-cassandra-0.8.6/bin/../lib/antlr-3.2.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-thrift-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-sources-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/commons-cli-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-codec-1.2.jar:./apache-cassandra-0.8.6/bin/../lib/commons-collections-3.2.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-lang-2.4.jar:./apache-cassandra-0.8.6/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/guava-r08.jar:./apache-cassandra-0.8.6/bin/../lib/high-scale-lib-1.1.2.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-core-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-mapper-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar:./apache-cassandra-0.8.6/bin/../lib/jline-0.9.94.jar:./apache-cassandra-0.8.6/bin/../lib/json-simple-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/libthrift-0.6.jar:./apache-cassandra-0.8.6/bin/../lib/log4j-1.2.16.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-examples.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-impl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-jmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-remote.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rimpl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rjmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-tools.jar:./apache-cassandra-0.8.6/bin/../lib/servlet-api-2.5-20081211.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-api-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-log4j12-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/snakeyaml-1.6.jar
> >> > org.apache.cassandra.thrift.CassandraDaemon
> >> >
> >> > Ring output:
> >> >
> >> > [root@CAP4-CNode4 apache-cassandra-0.8.6]# ./bin/nodetool -h 127.0.0.1 ring
> >> > Address       DC          Rack   Status  State   Load      Owns    Token
> >> >                                                                    141784319550391026443072753096570088105
> >> > 10.19.104.11  datacenter1 rack1  Up      Normal  19.92 GB  16.67%  0
> >> > 10.19.104.12  datacenter1 rack1  Up      Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
> >> > 10.19.104.13  datacenter1 rack1  Up      Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
> >> > 10.19.104.14  datacenter1 rack1  Up      Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
> >> > 10.19.105.11  datacenter1 rack1  Up      Normal  19.88 GB  16.67%  113427455640312821154458202477256070484
> >> > 10.19.105.12  datacenter1 rack1  Up      Normal  20 GB     16.67%  141784319550391026443072753096570088105
> >> > [root@CAP4-CNode4 apache-cassandra-0.8.6]#
> >>
> >> How many CFs? Can you describe the CFs and post the configuration?
> >> Do you have row cache enabled?
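On the heap-size suggestion: with -Xms/-Xmx at 10059M, roughly half of each 20 GB VM is pinned by the JVM before the page cache gets anything. In 0.8.x the heap is set in conf/cassandra-env.sh; a sketch of the 6-8 GB experiment follows (the exact values are illustrative, not a tuned recommendation):

```shell
# conf/cassandra-env.sh -- setting both values overrides the auto-computed
# defaults. Sketch: an 8 GB heap leaves ~12 GB of the VM for the kernel
# page cache that backs the mmap'd SSTables; new-gen well below the
# current 1200M keeps ParNew pauses short.
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="800M"
```

Restarting with these and re-checking RES in top would show whether the 17 GB resident figure tracks the heap down; if RSS stays high, the remainder is likely mmap rather than heap.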