[ 
https://issues.apache.org/jira/browse/CASSANDRA-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18047081#comment-18047081
 ] 

Sam Lightfoot edited comment on CASSANDRA-19987 at 12/22/25 4:53 PM:
---------------------------------------------------------------------

h2. Direct I/O vs Buffered I/O Read Latency During Compaction

*Test Conditions:*
 * 10k read requests/s against a 1GB dataset
 * Unthrottled compaction of two 80GB SSTables
 * 4KB block device read-ahead
 * Page cache filled prior to latency capture

 
||Metric||Buffered||Direct I/O||Improvement||
|*Mean*|0.45ms|0.33ms|*27%*|
|*Std Dev*|3.91ms|1.48ms|*62%*|
|*p50*|0.22ms|0.21ms|5%|
|*p90*|0.42ms|0.40ms|5%|
|*p95*|0.53ms|0.50ms|6%|
|*p99*|4.18ms|2.90ms|*31%*|
|*p99.9*|43.52ms|22.41ms|*48%*|
|*p99.99*|252.71ms|59.77ms|*76%*|
|*p99.999*|375.39ms|74.97ms|*80%*|
|*Max*|402.65ms|202.38ms|*50%*|
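
The Improvement column appears to be the relative reduction in latency, (buffered - direct) / buffered, rounded to a whole percent. That formula is inferred from the figures rather than stated in the comment, and the helper name {{reduction_pct}} is hypothetical. A minimal Python sketch for re-deriving a row:

```python
def reduction_pct(buffered_ms, direct_ms):
    """Relative latency reduction of direct I/O versus buffered I/O, in whole percent.

    Assumption: this is how the Improvement column was derived; the comment
    itself does not state the formula.
    """
    return round(100 * (buffered_ms - direct_ms) / buffered_ms)


# Values taken from the table above.
print(reduction_pct(0.45, 0.33))      # Mean    -> 27
print(reduction_pct(4.18, 2.90))      # p99     -> 31
print(reduction_pct(252.71, 59.77))   # p99.99  -> 76
```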

 

!image-2025-12-22-16-43-54-374.png|width=814,height=219!

 

Scenario:

 

 


was (Author: JIRAUSER302824):
h2. Direct I/O vs Buffered I/O Read Latency During Compaction

*Test Conditions:*
 * 10k read requests/s against a 1GB dataset
 * Unthrottled compaction of two 80GB SSTables
 * 4KB block device read-ahead
 * Page cache filled prior to latency capture

 
||Metric||Buffered||Direct I/O||Improvement||
|*Mean*|0.45ms|0.33ms|*27% lower*|
|*Std Dev*|3.91|1.48|*62% lower*|
|*p50*|0.22ms|0.21ms|5%|
|*p90*|0.42ms|0.40ms|5%|
|*p95*|0.53ms|0.50ms|6%|
|*p99*|1.59ms|0.74ms|*53%*|
|*p99.9*|29.88ms|8.22ms|*72%*|
|*p99.99*|88.08ms|57.67ms|*35%*|
|*p99.999*|360.71ms|72.35ms|*80%*|
|*Max*|402.65ms|202.38ms|*50%*|

 

!image-2025-12-22-16-43-54-374.png|width=814,height=219!

 

Scenario:

 

 

> Direct IO support for compaction reads
> --------------------------------------
>
>                 Key: CASSANDRA-19987
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19987
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction
>            Reporter: Jon Haddad
>            Assignee: Sam Lightfoot
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: image-2025-12-22-16-43-54-374.png
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> If we use direct I/O to read SSTables during compaction, we can avoid 
> polluting the page cache with data we're about to delete.  Buffered 
> compaction reads also evict existing pages to make room for the data being 
> read in.  This unnecessary churn leads to higher CPU overhead and can cause 
> spikes in client read latency, because the evicted pages could otherwise 
> have served those reads.
> The effect is most notable with STCS, where growing SSTables can evict the 
> entire hot dataset from cache, but every compaction strategy is affected.
> This is a follow-up to be done after CASSANDRA-15452, since we will then 
> have an internal buffer.
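
The mechanism described above can be illustrated at the OS level. The following is a minimal Python sketch, not Cassandra's implementation (Cassandra is written in Java, and the actual patch is not shown in this message). It assumes Linux, where {{O_DIRECT}} asks the kernel to bypass the page cache and requires an aligned buffer and block-multiple read sizes. The function name {{read_bypassing_cache}} and the 1 MiB default chunk size are hypothetical.

```python
import mmap
import os


def read_bypassing_cache(path, chunk_size=1 << 20):
    """Yield the contents of `path` in chunks, using O_DIRECT where available.

    Assumptions: chunk_size is a multiple of the device's logical block size,
    and the platform provides os.readv (Linux and most Unix systems).
    """
    # O_DIRECT is missing on some platforms; getattr degrades to a buffered open there.
    flags = os.O_RDONLY | getattr(os, "O_DIRECT", 0)
    try:
        fd = os.open(path, flags)
    except OSError:
        # Some filesystems reject O_DIRECT at open time; fall back to buffered reads.
        fd = os.open(path, os.O_RDONLY)

    # An anonymous mmap is page-aligned, which satisfies O_DIRECT's buffer alignment.
    buf = mmap.mmap(-1, chunk_size)
    try:
        remaining = os.fstat(fd).st_size
        while remaining > 0:
            n = os.readv(fd, [buf])
            if n == 0:
                break
            yield buf[:n]  # slicing an mmap copies the bytes out of the buffer
            remaining -= n
    finally:
        buf.close()
        os.close(fd)
```

Because the data lands in a caller-owned buffer rather than the page cache, a large sequential scan of this kind should not displace pages that are serving client reads. The trade-off is that the reader must manage its own buffering, which is presumably why the description ties this work to the internal buffer from CASSANDRA-15452.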



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

