Praveen, if you search "Read is slower in 2.1.6 than 2.0.14" in this forum,
you can find another thread I posted a while ago. The perf test I did
indicated that reads are slower in 2.1.6 than in 2.0.14, so we stayed with
2.0.14.

On Tue, Jan 12, 2016 at 9:35 AM, Peddi, Praveen <pe...@amazon.com> wrote:

> Thanks Jeff for your reply. Sorry for the delayed response. We were
> running some more tests and wanted to wait for the results.
>
> So basically we saw higher CPU usage with 2.1.11 compared to 2.0.9 (see
> below) for the exact same load test. Memory spikes were also more
> aggressive on 2.1.11.
>
> So we wanted to rule out any of our custom settings, so we ended up doing
> some testing with the Cassandra stress tool against default Cassandra
> installations. Here are the results we saw between 2.0.9 and 2.1.11. Both
> are default installations and both use the stress tool with the same
> params, so this is the closest apples-to-apples comparison we can get, and
> since we are using default installations our custom settings can be ruled
> out as the cause. As you can see, both read and write latencies are 30 to
> 50% worse in 2.1.11 than in 2.0.9.
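>
> For reference, a 2x-read / 1x-write mixed run with the 2.1-style stress
> tool looks roughly like the following (illustrative only; these are not
> necessarily the exact flags we used):
>
>   cassandra-stress write n=100000 -rate threads=16
>   cassandra-stress mixed ratio\(read=2,write=1\) n=100000 cl=QUORUM -rate threads=16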
>
> *Highlights of the test:*
> Load: 2x reads and 1x writes
> CPU:  2.0.9 (goes up to 25%) compared to 2.1.11 (goes up to 60%)
>
> Local read latency: 0.039 ms for 2.0.9 and 0.066 ms for 2.1.11
>
> Local write Latency: 0.033 ms for 2.0.9 Vs 0.030 ms for 2.1.11
>
> *One observation: as the number of threads increases, 2.1.11 read
> latencies get worse compared to 2.0.9 (see the table below for 24 threads
> vs 54 threads).*
> Not sure if anyone has done this kind of comparison before and what their
> thoughts are. I am thinking for this same reason
>
> (cassandra-stress output; mean/med/percentile latencies in ms)
>
> 16 threadCount:
>   version       type    total ops   op/s   pk/s  row/s  mean  med  0.95  0.99  0.999    max  time
>   2.0.9 Plain   READ        66854   7205   7205   7205   1.6  1.3   2.8   3.5    9.6   85.3   9.3
>   2.0.9 Plain   WRITE       33146   3572   3572   3572   1.3  1     2.6   3.3    7    206.5   9.3
>   2.0.9 Plain   total      100000  10777  10777  10777   1.5  1.3   2.7   3.4    7.9  206.5   9.3
>   2.1.11 Plain  READ        67096   6818   6818   6818   1.6  1.5   2.6   3.5    7.9   61.7   9.8
>   2.1.11 Plain  WRITE       32904   3344   3344   3344   1.4  1.3   2.3   3      6.5   56.7   9.8
>   2.1.11 Plain  total      100000  10162  10162  10162   1.6  1.4   2.5   3.2    6     61.7   9.8
>
> 24 threadCount:
>   version       type    total ops   op/s   pk/s  row/s  mean  med  0.95  0.99  0.999    max  time
>   2.0.9 Plain   READ        66414   8167   8167   8167   2    1.6   3.7   7.5   16.7  208     8.1
>   2.0.9 Plain   WRITE       33586   4130   4130   4130   1.7  1.3   3.4   5.4   25.6   45.4   8.1
>   2.0.9 Plain   total      100000  12297  12297  12297   1.9  1.5   3.5   6.2   15.2  208     8.1
>   2.1.11 Plain  READ        66628   7433   7433   7433   2.2  2.1   3.4   4.3    8.4   38.3   9
>   2.1.11 Plain  WRITE       33372   3723   3723   3723   2    1.9   3.1   3.8   21.9   37.2   9
>   2.1.11 Plain  total      100000  11155  11155  11155   2.1  2     3.3   4.1    8.8   38.3   9
>
> 54 threadCount:
>   version       type    total ops   op/s   pk/s  row/s  mean  med  0.95  0.99  0.999    max  time
>   2.0.9 Plain   READ        67115  13419  13419  13419   2.8  2.6   4.2   6.4   36.9   82.4   5
>   2.0.9 Plain   WRITE       32885   6575   6575   6575   2.5  2.3   3.9   5.6   15.9   81.5   5
>   2.0.9 Plain   total      100000  19993  19993  19993   2.7  2.5   4.1   5.7   13.9   82.4   5
>   2.1.11 Plain  READ        66780   8951   8951   8951   4.3  3.9   6.8   9.7   49.4   69.9   7.5
>   2.1.11 Plain  WRITE       33220   4453   4453   4453   3.5  3.2   5.7   8.2   36.8   68     7.5
>   2.1.11 Plain  total      100000  13404  13404  13404   4    3.7   6.6   9.2   48     69.9   7.5
>
> From: Jeff Jirsa <jeff.ji...@crowdstrike.com>
> Date: Thursday, January 7, 2016 at 1:01 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Peddi
> Praveen <pe...@amazon.com>
> Subject: Re: Slow performance after upgrading from 2.0.9 to 2.1.11
>
> Anecdotal evidence typically agrees that 2.1 is faster than 2.0 (our
> experience was anywhere from 20-60%, depending on workload).
>
> However, it’s not necessarily true that everything behaves exactly the
> same – in particular, memtables are different, commitlog segment handling
> is different, and GC params may need to be tuned differently for 2.1 than
> 2.0.
>
> When the system is busy, what’s it actually DOING? Cassandra exposes a TON
> of metrics – have you plugged any into a reporting system to see what’s
> going on? Is your latency due to pegged cpu, iowait/disk queues or gc
> pauses?
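>
> (Even without a full reporting system, a few quick checks while the load
> is running will usually tell you which of those it is; the keyspace/table
> below are placeholders:)
>
>   nodetool tpstats                            # pending/blocked thread pool stages
>   nodetool compactionstats                    # outstanding compactions
>   nodetool cfhistograms <keyspace> <table>    # local read/write latency distribution
>   iostat -x 5                                 # disk queues / iowait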
>
> My colleagues spent a lot of time validating different AWS EBS configs
> (video from reinvent at https://www.youtube.com/watch?v=1R-mgOcOSd4), 2.1
> was faster in almost every case, but you’re using an instance size I don’t
> believe we tried (too little RAM to be viable in production).  c3.2xl only
> gives you 15G of ram – most “performance” based systems want 2-4x that
> (people running G1 heaps usually start at 16G heaps and leave another
> 16-30G for page cache); you're running fairly small hardware, so it's
> possible that 2.1 isn’t “as good” on smaller hardware.
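>
> (If you do experiment with heap sizes, the usual overrides live in
> cassandra-env.sh; the values here are only examples for a 15G box, not a
> recommendation:)
>
>   MAX_HEAP_SIZE="8G"
>   HEAP_NEWSIZE="800M"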
>
> (I do see your domain, presumably you know all of this, but just to be
> sure):
>
> You’re using c3, so presumably you’re using EBS – are you using GP2? Which
> volume sizes? Are they the same between versions? Are you hitting your iops
> limits? Running out of burst tokens? Do you have enhanced networking
> enabled? At load, what part of your system is stressed? Are you cpu bound?
> Are you seeing GC pauses hurt latency? Have you tried changing
> memtable_allocation_type -> offheap_objects (available in 2.1, not in
> 2.0)?
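>
> (That change is a single line in cassandra.yaml; heap_buffers is the 2.1
> default:)
>
>   memtable_allocation_type: offheap_objects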
>
> Tuning gc_grace is weird – do you understand what it does? Are you
> overwriting or deleting a lot of data in your test (that’d be unusual)? Are
> you doing a lot of compaction?
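>
> (For anyone following along, gc_grace is a per-table setting; a 15-minute
> value like Praveen describes would be set with something like the
> following, where the keyspace/table names are placeholders:)
>
>   ALTER TABLE ks.tbl WITH gc_grace_seconds = 900;   -- 900 s = 15 minutes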
>
>
> From: "Peddi, Praveen"
> Reply-To: "user@cassandra.apache.org"
> Date: Wednesday, January 6, 2016 at 11:41 AM
> To: "user@cassandra.apache.org"
> Subject: Slow performance after upgrading from 2.0.9 to 2.1.11
>
> Hi,
> We have upgraded Cassandra from 2.0.9 to 2.1.11 in our loadtest
> environment with pretty much the same yaml settings in both (we removed
> unused yaml settings and renamed a few others), and we have noticed that
> performance on 2.1.11 is worse compared to 2.0.9. *After more
> investigation we found that the performance gets worse as we increase the
> replication factor on 2.1.11, whereas on 2.0.9 performance stays more or
> less the same.* Has anything changed architecturally as far as replication
> is concerned in 2.1.11?
>
> All googling only suggested 2.1.11 should be FASTER than 2.0.9, so we are
> obviously doing something different. However, the client code and load
> test are identical in both cases.
>
> Details:
> Nodes: 3 EC2 c3.2xlarge
> R/W Consistency: QUORUM
> Renamed memtable_total_space_in_mb to memtable_heap_space_in_mb and
> removed unused properties from the yaml file (see the snippet below).
> We run aggressive compaction with a low gc_grace (15 mins), but this is
> true for both 2.0.9 and 2.1.11.
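>
> For clarity, the rename is just this; the value shown is illustrative, not
> our actual setting:
>
>   # 2.0.x cassandra.yaml
>   memtable_total_space_in_mb: 2048
>
>   # 2.1.x cassandra.yaml (same knob, renamed)
>   memtable_heap_space_in_mb: 2048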
>
> As you can see, all p50, p90 and p99 latencies stayed within a 10%
> difference on 2.0.9 when we increased RF from 1 to 3, whereas on 2.1.11
> latencies almost doubled (reads especially are much slower than writes).
>
> READ
>   # Nodes  RF  # of rows   2.0.9 P50/P90/P99    2.1.11 P50/P90/P99
>      3      1     450        306 / 594 / 747      425 /  849 / 1085
>      3      3     450        358 / 634 / 877      708 / 1274 / 2642
>
> WRITE
>   # Nodes  RF  # of rows   2.0.9 P50/P90/P99    2.1.11 P50/P90/P99
>      3      1      10         26 /  80 / 179       37 /  131 /  196
>      3      3      10         31 /  96 / 184       46 /  166 /  468
> Any pointers on how to debug performance issues will be appreciated.
>
> Praveen
>
