[
https://issues.apache.org/jira/browse/CASSANDRA-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047081#comment-18047081
]
Sam Lightfoot edited comment on CASSANDRA-19987 at 12/22/25 4:53 PM:
---------------------------------------------------------------------
h2. Direct I/O vs Buffered I/O Read Latency During Compaction
*Test Conditions:*
* 10k read requests/s against a 1GB dataset
* Unthrottled compaction of two 80GB SSTables
* 4KB block device read-ahead
* Page cache filled prior to latency capture
||Metric||Buffered||Direct I/O||Improvement||
|*Mean*|0.45ms|0.33ms|*27%*|
|*Std Dev*|3.91ms|1.48ms|*62%*|
|*p50*|0.22ms|0.21ms|5%|
|*p90*|0.42ms|0.40ms|5%|
|*p95*|0.53ms|0.50ms|6%|
|*p99*|4.18ms|2.90ms|*31%*|
|*p99.9*|43.52ms|22.41ms|*48%*|
|*p99.99*|252.71ms|59.77ms|*76%*|
|*p99.999*|375.39ms|74.97ms|*80%*|
|*Max*|402.65ms|202.38ms|*50%*|
!image-2025-12-22-16-43-54-374.png|width=814,height=219!
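The Improvement column appears to be the relative reduction in latency, (buffered - direct) / buffered. That formula is an inference from the numbers and is not stated in the comment. A minimal sketch that recomputes a few rows under that assumption (the class and method names are illustrative only):

```java
final class LatencyImprovement {
    /**
     * Relative latency reduction from buffered to direct I/O,
     * rounded to the nearest whole percent.
     * Assumption: this is how the table's Improvement column was derived.
     */
    static long improvementPercent(double bufferedMs, double directMs) {
        return Math.round((bufferedMs - directMs) / bufferedMs * 100.0);
    }

    public static void main(String[] args) {
        // Values copied from the table above: {buffered ms, direct ms}.
        String[] metrics = {"Mean", "p99", "p99.99", "Max"};
        double[][] latenciesMs = {
            {0.45, 0.33},
            {4.18, 2.90},
            {252.71, 59.77},
            {402.65, 202.38},
        };
        for (int i = 0; i < metrics.length; i++) {
            long pct = improvementPercent(latenciesMs[i][0], latenciesMs[i][1]);
            System.out.printf("%-7s %d%%%n", metrics[i], pct);
        }
    }
}
```

The latencies in the table are already rounded to two decimals, so a recomputed percentage can differ from the listed one by a point. For example, p99.9 works out to about 48.5%, right on the rounding boundary.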
Scenario:
was (Author: JIRAUSER302824):
h2. Direct I/O vs Buffered I/O Read Latency During Compaction
*Test Conditions:*
* 10k read requests/s against a 1GB dataset
* Unthrottled compaction of two 80GB SSTables
* 4KB block device read-ahead
* Page cache filled prior to latency capture
||Metric||Buffered||Direct I/O||Improvement||
|*Mean*|0.45ms|0.33ms|*27% lower*|
|*Std Dev*|3.91ms|1.48ms|*62% lower*|
|*p50*|0.22ms|0.21ms|5%|
|*p90*|0.42ms|0.40ms|5%|
|*p95*|0.53ms|0.50ms|6%|
|*p99*|1.59ms|0.74ms|*53%*|
|*p99.9*|29.88ms|8.22ms|*72%*|
|*p99.99*|88.08ms|57.67ms|*35%*|
|*p99.999*|360.71ms|72.35ms|*80%*|
|*Max*|402.65ms|202.38ms|*50%*|
!image-2025-12-22-16-43-54-374.png|width=814,height=219!
Scenario:
> Direct IO support for compaction reads
> --------------------------------------
>
> Key: CASSANDRA-19987
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19987
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Local/Compaction
> Reporter: Jon Haddad
> Assignee: Sam Lightfoot
> Priority: Normal
> Fix For: 5.x
>
> Attachments: image-2025-12-22-16-43-54-374.png
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> If we use direct I/O to read SSTables during compaction, we can avoid
> polluting the page cache with data we're about to delete. With buffered
> reads, a side effect is that existing pages are evicted to make room for
> whatever we're reading in. This unnecessary churn leads to higher CPU
> overhead and can cause dips in client read latency, because the evicted
> pages could have been used to serve those reads.
> This is most notable with STCS as the SSTables get larger, potentially
> evicting the entire hot dataset from the cache, but every compaction
> strategy is affected.
> This is a follow-up to be done after CASSANDRA-15452, since we will then
> have an internal buffer.