For 15452 that’s correct (and I believe also for 20092). For 15452, the trunk and 5.0 patch are basically identical.
Jordan

On Thu, Feb 13, 2025 at 01:06 C. Scott Andreas <sc...@paradoxica.net> wrote:

> Checking to confirm the specific patches proposed for backport – is it the
> trunk commit for C-20092 and the open GitHub PR against the 5.0 branch for
> C-15452 linked below?
>
> CASSANDRA-20092: Introduce SSTableSimpleScanner for compaction (committed
> to trunk)
> https://github.com/apache/cassandra/commit/3078aea1cfc70092a185bab8ac5dc8a35627330f
>
> CASSANDRA-15452: Improve disk access patterns during compaction and range
> reads (PR available) https://github.com/apache/cassandra/pull/3606
>
> Thanks,
>
> – Scott
>
> On Feb 12, 2025, at 9:45 PM, guo Maxwell <cclive1...@gmail.com> wrote:
>
> Of course, I definitely hope to see it merged into 5.0.x as soon as
> possible.
>
> Jordan West <jw...@apache.org> wrote on Thu, Feb 13, 2025 at 10:48:
>
>> Regarding the buffer size, it is configurable. My personal take is that
>> we’ve tested this on a variety of hardware (from laptops to large instance
>> sizes) already, as well as a few different disk configs (it’s also been run
>> internally, in test, at a few places) and that it has been reviewed by four
>> committers and another contributor. Always love to see more numbers. If
>> folks want to take it for a spin on Alibaba Cloud, Azure, etc. and determine
>> the best buffer size, that’s awesome. We could document which is suggested
>> for the community. I don’t think it’s necessary to block on that, however.
>>
>> Also I am of course +1 to including this in 5.0.
>>
>> Jordan
>>
>> On Wed, Feb 12, 2025 at 19:50 guo Maxwell <cclive1...@gmail.com> wrote:
>>
>>> What I understand is that there will be some differences in block
>>> storage among various cloud platforms. More intuitively, the default
>>> read-ahead size will be the same. For example, AWS EBS seems to be 256K,
>>> and Alibaba Cloud seems to be 512K (if I remember correctly).
>>>
>>> Just like 19488: give the test method, see who can assist in the test,
>>> and provide the results.
>>>
>>> Jon Haddad <j...@rustyrazorblade.com> wrote on Thu, Feb 13, 2025 at 08:30:
>>>
>>>> Can you elaborate why? This would be several hundred hours of work and
>>>> would cost me thousands of $$ to perform.
>>>>
>>>> Filesystems and block devices are well understood. Could you give me
>>>> an example of what you think might be different here? This is already one
>>>> of the most well tested and documented performance patches ever contributed
>>>> to the project.
>>>>
>>>> On Wed, Feb 12, 2025 at 4:26 PM guo Maxwell <cclive1...@gmail.com>
>>>> wrote:
>>>>
>>>>> I think it should be tested on most cloud platforms (at least
>>>>> AWS, Azure, GCP) before being merged into 5.0, just like CASSANDRA-19488.
>>>>>
>>>>> Paulo Motta <pa...@apache.org> wrote on Thu, Feb 13, 2025 at 6:10 AM:
>>>>>
>>>>>> I'm looking forward to these improvements, compaction needs tlc. :-)
>>>>>> A couple of questions:
>>>>>>
>>>>>> Has this been tested only on EBS, or also EC2/bare-metal/Azure/etc? My
>>>>>> only concern is if this is an optimization for EBS that can be a
>>>>>> deoptimization for other environments.
>>>>>>
>>>>>> Are there reproducible scripts that anyone can run to verify the
>>>>>> improvements in their own environments? This could help alleviate any
>>>>>> concerns and gain confidence to introduce a perf. improvement in a
>>>>>> patch release.
>>>>>>
>>>>>> I have not read the ticket in detail, so apologies if this was already
>>>>>> discussed there or elsewhere.
>>>>>>
>>>>>> On Wed, Feb 12, 2025 at 3:01 PM Jon Haddad <j...@rustyrazorblade.com>
>>>>>> wrote:
>>>>>> >
>>>>>> > Hey folks,
>>>>>> >
>>>>>> > Over the last 9 months Jordan and I have worked on CASSANDRA-15452 [1].
>>>>>> > The TL;DR is that we're internalizing a read ahead buffer to allow us
>>>>>> > to do fewer requests to disk during compaction and range reads. This
>>>>>> > results in far fewer system calls (roughly 16x reduction) and, on systems
>>>>>> > with higher read latency, a significant improvement in compaction
>>>>>> > throughput.
>>>>>> > We've tested several different EBS configurations and found it
>>>>>> > delivers up to a 10x improvement when read ahead is optimized to minimize
>>>>>> > read latency. I worked with AWS and the EBS team directly on this and
>>>>>> > the Best Practices for C* on EBS [2] I wrote for them. I've performance
>>>>>> > tested this patch extensively with hundreds of billions of operations
>>>>>> > across several clusters and thousands of compactions. It has less of an
>>>>>> > impact on local NVMe, since the p99 latency is already 10-30x less than
>>>>>> > what you see on EBS (100 µs vs 1-3 ms), and you can do hundreds of
>>>>>> > thousands of IOPS vs a max of 16K.
>>>>>> >
>>>>>> > Related to this, Branimir wrote CASSANDRA-20092 [3], which
>>>>>> > significantly improves compaction by avoiding reading the partition
>>>>>> > index. CASSANDRA-20092 has been merged to trunk already [4].
>>>>>> >
>>>>>> > I think we should merge both of these patches into 5.0, as the perf
>>>>>> > improvement should allow teams to increase density of EBS backed C*
>>>>>> > clusters by 2-5x, driving cost way down. There are a lot of teams running
>>>>>> > C* on EBS now. I'm currently working with one that's bottlenecked on maxed
>>>>>> > out EBS GP3 storage. I propose we merge both, because without
>>>>>> > CASSANDRA-20092, we won't get the performance improvements in
>>>>>> > CASSANDRA-15452 with BTI, only BIG format. I've tested BTI in other
>>>>>> > situations and found it to be far more performant than BIG.
>>>>>> >
>>>>>> > If we were looking at a small win, I wouldn't care much, but since
>>>>>> > these patches, combined with UCS, allow more teams to run C* on EBS at
>>>>>> > more than 10TB / node, I think it's worth doing now.
>>>>>> >
>>>>>> > Thanks in advance,
>>>>>> > Jon
>>>>>> >
>>>>>> > [1] https://issues.apache.org/jira/browse/CASSANDRA-15452
>>>>>> > [2] https://aws.amazon.com/blogs/database/best-practices-for-running-apache-cassandra-with-amazon-ebs/
>>>>>> > [3] https://issues.apache.org/jira/browse/CASSANDRA-20092
>>>>>> > [4] https://github.com/apache/cassandra/commit/3078aea1cfc70092a185bab8ac5dc8a35627330f