Hey folks,

Over the last 9 months Jordan and I have worked on CASSANDRA-15452 [1].
The TL;DR is that we're internalizing a read ahead buffer so we can issue
fewer requests to disk during compaction and range reads.  This results in
far fewer system calls (roughly 16x reduction) and on systems with higher
read latency, a significant improvement in compaction throughput.  We've
tested several different EBS configurations and found it delivers up to a
10x improvement when read ahead is optimized to minimize read latency.  I
worked directly with AWS and the EBS team on this, and on the Best
Practices for C* on EBS post [2] that I wrote for them.  I've performance
tested this patch
extensively with hundreds of billions of operations across several clusters
and thousands of compactions.  It has less of an impact on local NVMe,
since the p99 latency there is already 10-30x lower than what you see on
EBS (100 microseconds vs 1-3ms), and you can do hundreds of thousands of
IOPS vs a max of 16K.
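
To make the request-count argument concrete, here's a rough Python sketch.
It is not the patch itself (that lives in Java in CASSANDRA-15452), and the
class names, the 4 KiB read size and the 64 KiB buffer size are all
assumptions I picked so the ratio lines up with the ~16x figure above.  It
only counts requests reaching the backing store; it doesn't model latency.

```python
class CountingSource:
    """Stand-in for a disk: serves byte ranges and counts the requests it receives."""

    def __init__(self, data: bytes):
        self._data = data
        self.requests = 0

    def read_at(self, offset: int, length: int) -> bytes:
        self.requests += 1
        return self._data[offset:offset + length]


class ReadAheadReader:
    """Hypothetical reader: serves small reads from one large buffer."""

    def __init__(self, source: CountingSource, buffer_size: int):
        self._source = source
        self._buffer_size = buffer_size
        self._buf = b""
        self._buf_start = 0

    def read_at(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while length > 0:
            if not (self._buf_start <= offset < self._buf_start + len(self._buf)):
                # Miss: fetch the whole aligned window containing `offset` in one request.
                self._buf_start = offset - (offset % self._buffer_size)
                self._buf = self._source.read_at(self._buf_start, self._buffer_size)
                if offset >= self._buf_start + len(self._buf):
                    break  # past the end of the data
            lo = offset - self._buf_start
            piece = self._buf[lo:lo + length]
            out += piece
            offset += len(piece)
            length -= len(piece)
        return bytes(out)


def scan(reader, total: int, chunk: int) -> bytes:
    """Read `total` bytes front to back in `chunk`-sized pieces (a sequential scan)."""
    return b"".join(reader.read_at(off, chunk) for off in range(0, total, chunk))


def compare(total: int = 1 << 20, chunk: int = 4096, buffer_size: int = 65536):
    """Return (requests without read ahead, requests with read ahead)."""
    data = bytes(range(256)) * (total // 256)
    direct = CountingSource(data)
    backing = CountingSource(data)
    buffered = ReadAheadReader(backing, buffer_size)
    # Both paths must return identical bytes; only the request count differs.
    assert scan(direct, total, chunk) == scan(buffered, total, chunk) == data
    return direct.requests, backing.requests
```

With those assumed sizes, compare() returns (256, 16) for a 1 MiB scan: the
same bytes come back, but the backing store sees 16x fewer requests.  On
storage where each request costs 1-3ms, that's where the win comes from.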

Related to this, Branimir wrote CASSANDRA-20092 [3], which significantly
improves compaction by avoiding reading the partition index.
CASSANDRA-20092 has been merged to trunk already [4].

I think we should merge both of these patches into 5.0, as the perf
improvement should allow teams to increase density of EBS backed C*
clusters by 2-5x, driving cost way down.  There are a lot of teams running C*
on EBS now.  I'm currently working with one that's bottlenecked on maxed
out EBS GP3 storage.  I propose we merge both, because without
CASSANDRA-20092 we'd only get the performance improvements from
CASSANDRA-15452 with the BIG format, not with BTI.  I've tested BTI in other
situations and found it to be far more performant than BIG.

If we were looking at a small win, I wouldn't care much, but since these
patches, combined with UCS, allow more teams to run C* on EBS at > 10TB /
node, I think it's worth doing now.

Thanks in advance,
Jon

[1] https://issues.apache.org/jira/browse/CASSANDRA-15452
[2] https://aws.amazon.com/blogs/database/best-practices-for-running-apache-cassandra-with-amazon-ebs/
[3] https://issues.apache.org/jira/browse/CASSANDRA-20092
[4] https://github.com/apache/cassandra/commit/3078aea1cfc70092a185bab8ac5dc8a35627330f
