Thanks Nate for your quick reply. We will test with different concurrent_compactors settings. It would save a lot of time for others if the documentation could be fixed; we spent days arriving at this setting, and only by chance.
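For anyone following the thread, the setting in question is a single line in cassandra.yaml. A minimal sketch of checking and changing it (the path below is the package default and may differ on your install; 8 is the value Nate suggests below, not a result from our tests):

# check the current value
grep -n 'concurrent_compactors' /etc/cassandra/cassandra.yaml
# edit it, e.g. "concurrent_compactors: 8", then restart the node so the new value is picked up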
As far as the data folder and IO are concerned, I confirmed that the data folder is in the same location in both cases, and there are hardly any reads in either case (see below). Can you tell me what could trigger very high read repair numbers in 2.1.11 compared to 2.0.9 (10 times more in 2.1.11)? Please find tpstats and iostat for both 2.0.9 and 2.1.11:

Tpstats for 2.0.9

Pool Name                    Active   Pending    Completed   Blocked  All time blocked
MutationStage                     0         0      4352903         0                 0
ReadStage                         0         0     46282140         0                 0
RequestResponseStage              0         0     12779370         0                 0
ReadRepairStage                   0         0        18719         0                 0
ReplicateOnWriteStage             0         0            0         0                 0
MiscStage                         0         0            0         0                 0
HintedHandoff                     0         0            5         0                 0
FlushWriter                       0         0        91885         0                10
MemoryMeter                       0         0        82032         0                 0
GossipStage                       0         0       457802         0                 0
CacheCleanupExecutor              0         0            0         0                 0
InternalResponseStage             0         0            6         0                 0
CompactionExecutor                0         0       993103         0                 0
ValidationExecutor                0         0            0         0                 0
MigrationStage                    0         0           28         0                 0
commitlog_archiver                0         0            0         0                 0
AntiEntropyStage                  0         0            0         0                 0
PendingRangeCalculator            0         0            5         0                 0
MemtablePostFlusher               0         0        94496         0                 0

Message type        Dropped
READ                      0
RANGE_SLICE               0
_TRACE                    0
MUTATION                  0
COUNTER_MUTATION          0
BINARY                    0
REQUEST_RESPONSE          0
PAGED_RANGE               0
READ_REPAIR               0

Tpstats for 2.1.11

Pool Name                    Active   Pending    Completed   Blocked  All time blocked
MutationStage                     0         0      1113428         0                 0
ReadStage                         0         0     23496750         0                 0
RequestResponseStage              0         0     29951269         0                 0
ReadRepairStage                   0         0      3848733         0                 0
CounterMutationStage              0         0            0         0                 0
MiscStage                         0         0            0         0                 0
HintedHandoff                     0         0            4         0                 0
GossipStage                       0         0       182727         0                 0
CacheCleanupExecutor              0         0            0         0                 0
InternalResponseStage             0         0            0         0                 0
CommitLogArchiver                 0         0            0         0                 0
CompactionExecutor                0         0        89820         0                 0
ValidationExecutor                0         0            0         0                 0
MigrationStage                    0         0           10         0                 0
AntiEntropyStage                  0         0            0         0                 0
PendingRangeCalculator            0         0            6         0                 0
Sampler                           0         0            0         0                 0
MemtableFlushWriter               0         0        38222         0                 0
MemtablePostFlush                 0         0        39814         0                 0
MemtableReclaimMemory             0         0        38222         0                 0

Message type        Dropped
READ                      0
RANGE_SLICE               0
_TRACE                    0
MUTATION                  0
COUNTER_MUTATION          0
BINARY                    0
REQUEST_RESPONSE          0
PAGED_RANGE               0
READ_REPAIR               0

IOSTAT for 2.1.11

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          21.21    1.10    0.70    0.12    0.03   76.84

Device:  rrqm/s  wrqm/s   r/s    w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
xvda       0.01    3.00  0.03   2.50   0.00   0.03     28.87      0.01   5.09   0.27   0.07
xvdb       0.00   17.05  0.03  17.52   0.00   0.49     57.06      0.03   1.79   0.41   0.71
xvdc       0.00   17.31  0.03  17.93   0.00   0.50     56.93      0.03   1.74   0.40   0.72
dm-0       0.00    0.00  0.07  56.41   0.00   0.99     35.82      0.11   2.01   0.23   1.27

IOSTAT for 2.0.9

Device:    tps   Blk_read/s   Blk_wrtn/s    Blk_read     Blk_wrtn
xvda      3.87         3.34       211.09    89522823   5655075464
xvdb      5.37         3.82       408.26   102210432  10937070024
xvdc      5.81         4.18       435.33   111917570  11662380112
dm-0     20.35         7.99       843.59   214122034  22599449976
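In case anyone wants to reproduce this comparison on their own nodes, something like the sketch below pulls the same kind of data from the command line (the cassandra.yaml path is the package default and may differ on your install):

# confirm both clusters point at the same data and commitlog locations
grep -n -A3 'data_file_directories' /etc/cassandra/cassandra.yaml
grep -n 'commitlog_directory' /etc/cassandra/cassandra.yaml

# read repair activity on a node: completed ReadRepairStage tasks ...
nodetool tpstats | grep ReadRepairStage
# ... and the digest-mismatch counters behind blocking/background read repairs
nodetool netstats | grep -A3 'Read Repair Statistics'

# node-level disk activity, extended stats every 5 seconds
iostat -x 5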
From: Nate McCall <n...@thelastpickle.com>
Reply-To: Cassandra Users <user@cassandra.apache.org>
Date: Friday, January 29, 2016 at 3:01 PM
To: Cassandra Users <user@cassandra.apache.org>
Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11

On Fri, Jan 29, 2016 at 12:30 PM, Peddi, Praveen <pe...@amazon.com> wrote:
>
> Hello,
> We have another update on performance on 2.1.11. compression_chunk_size
> didn't really help much, but we changed concurrent_compactors from the default
> to 64 in 2.1.11 and read latencies improved significantly. However, 2.1.11 read
> latencies are still 1.5x slower than 2.0.9. One thing we noticed in the JMX
> metrics that could affect read latencies is that 2.1.11 is running
> ReadRepairedBackground and ReadRepairedBlocking far more frequently than
> 2.0.9, even though our read_repair_chance is the same on both. Could anyone
> shed some light on why 2.1.11 could be running read repair 10 to 50 times more
> in spite of the same configuration on both clusters?
>
> dclocal_read_repair_chance=0.100000 AND
> read_repair_chance=0.000000 AND
>
> Here is the table of read repair metrics for both clusters:
>
>                                     2.0.9   2.1.11
> ReadRepairedBackground   5MinAvg    0.006   0.1
>                          15MinAvg   0.009   0.153
> ReadRepairedBlocking     5MinAvg    0.002   0.55
>                          15MinAvg   0.007   0.91

The concurrent_compactors setting is not a surprise. The default in 2.0 was the number of cores; in 2.1 it is now "the smaller of (number of disks, number of cores), with a minimum of 2 and a maximum of 8":
https://github.com/apache/cassandra/blob/cassandra-2.1/conf/cassandra.yaml#L567-L568

So in your case this was "8" in 2.0 vs. "2" in 2.1 (assuming these are still the stock-ish c3.2xl mentioned previously?). Regardless, 64 is way too high. Set it back to 8.

Note: this got dropped off the "Upgrading" guide for 2.1 in https://github.com/apache/cassandra/blob/cassandra-2.1/NEWS.txt though, so lots of folks miss it.

Per said upgrading guide - are you sure the data directory is in the same place between the two versions and you are not pegging the wrong disk/partition? The default locations changed for data, cache and commitlog:
https://github.com/apache/cassandra/blob/cassandra-2.1/NEWS.txt#L171-L180

I ask because being really busy on a single disk would cause latency and potentially dropped messages, which could eventually cause a DigestMismatchException requiring a blocking read repair. Anything unusual in the node-level IO activity between the two clusters?

That said, the difference in nodetool tpstats output during and after on both could be insightful. When we do perf tests internally, we usually use a combination of Grafana and Riemann to monitor Cassandra internals, the JVM and the OS. Otherwise, it's guesswork.

--
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com