[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771421#comment-17771421
 ] 

Amit Pawar edited comment on CASSANDRA-18464 at 10/6/23 3:22 AM:
-----------------------------------------------------------------

Thank you for reviewing the patch. Ideally, having it in 5.0 would help users 
whose node performance is affected by heavy insert load. This feature may make 
it possible to avoid adding more nodes during peak time.

 

>>I added comments to [the latest published 
>>branch|https://github.com/driftx/cassandra/tree/CASSANDRA-18464-trunk] with 
>>some suggested changes. I am curious, if the NIO option is constructed 
>>correctly (with aligned direct buffers, possibly also issuing the writes to 
>>be page-aligned and containing whole pages), is it still copying to internal 
>>buffers?

No, it does not. [FileChannel (Java Platform SE 7 ) 
(oracle.com)|https://docs.oracle.com/javase%2F7%2Fdocs%2Fapi%2F%2F/java/nio/channels/FileChannel.html#force(boolean)]
 does not mention copying, so I explored the JDK 11 sources to check. As per 
the sources, it creates a temporary buffer (allocated using allocateDirect and 
aligned to the boundary) and then copies. 
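
For reference, a minimal sketch of how a caller could allocate such an aligned 
direct buffer itself on Java 9 or later, using ByteBuffer.alignedSlice. The 
class and method names (AlignedBufferSketch, allocateAligned) are hypothetical 
and are not taken from the patch or from the JDK internals; the 4096-byte 
alignment is only an example value.

```java
import java.nio.ByteBuffer;

final class AlignedBufferSketch {
    /**
     * Returns a direct buffer of exactly `capacity` bytes whose first byte
     * sits at a memory address that is a multiple of `alignment`.
     * `alignment` must be a power of two and `capacity` a multiple of it.
     * (Hypothetical helper, for illustration only.)
     */
    static ByteBuffer allocateAligned(int capacity, int alignment) {
        if (alignment <= 0 || Integer.bitCount(alignment) != 1 || capacity % alignment != 0) {
            throw new IllegalArgumentException(
                    "capacity must be a multiple of a power-of-two alignment");
        }
        // Over-allocate by alignment - 1 bytes so that an aligned window of
        // the requested size always fits, then slice at the aligned address.
        ByteBuffer raw = ByteBuffer.allocateDirect(capacity + alignment - 1);
        return raw.alignedSlice(alignment);
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocateAligned(8192, 4096);
        System.out.println("capacity=" + buf.capacity()
                + " alignmentOffset=" + buf.alignmentOffset(0, 4096));
    }
}
```

An alignmentOffset of 0 at index 0 means the buffer starts on the requested 
boundary.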

 

I would also like to highlight another difference here. Java forces writes of 
4096 bytes with its native APIs for Direct-IO, while JNA allows the minimum 
block size supported by the disk to be used. Testing showed 512 bytes. This 
affects disk health under the following two conditions.
 # The head and tail parts need to be aligned to 4K. In the worst case, 8K 
bytes are written unnecessarily to commit the head/tail part of the buffer. 
With JNA, a maximum of 1024 bytes is written.
 # Under low activity, 4K bytes are still written, even for 100 bytes during a 
periodic timeout. Here JNA may write a maximum of 512 bytes.
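
The arithmetic behind these two cases can be sketched as follows. This is only 
an illustration, assuming a write must cover whole blocks from the aligned-down 
start to the aligned-up end of the dirty range; the class and method names are 
hypothetical and the patch may compute this differently. The block sizes 4096 
and 512 are the ones mentioned above.

```java
final class DirectWriteSizeSketch {
    /** Rounds offset down to a multiple of block. */
    static long alignDown(long offset, int block) {
        return offset - (offset % block);
    }

    /** Rounds offset up to a multiple of block. */
    static long alignUp(long offset, int block) {
        return alignDown(offset + block - 1, block);
    }

    /**
     * Bytes that must be written so that the dirty range [start, end)
     * is covered by whole blocks of the given size.
     */
    static long bytesWritten(long start, long end, int block) {
        return alignUp(end, block) - alignDown(start, block);
    }

    public static void main(String[] args) {
        // Low activity: 100 dirty bytes at the start of the file.
        System.out.println(bytesWritten(0, 100, 4096));    // 4096
        System.out.println(bytesWritten(0, 100, 512));     // 512

        // Worst case for head/tail: a tiny range straddling a 4K boundary.
        System.out.println(bytesWritten(4095, 4097, 4096)); // 8192
        System.out.println(bytesWritten(4095, 4097, 512));  // 1024
    }
}
```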

The Direct-IO feature is very much needed, whether through the native API or 
JNA. Overall, it reduces kernel IO activity for commitlog files.

 


was (Author: JIRAUSER299956):
Thank you for reviewing the patch. Ideally, having it in 5.0 would help users 
whose node performance is affected by heavy insert load. This feature may make 
it possible to avoid adding more nodes during peak time.

 

>>I added comments to [the latest published 
>>branch|https://github.com/driftx/cassandra/tree/CASSANDRA-18464-trunk] with 
>>some suggested changes. I am curious, if the NIO option is constructed 
>>correctly (with aligned direct buffers, possibly also issuing the writes to 
>>be page-aligned and containing whole pages), is it still copying to internal 
>>buffers?

No, it does not. [FileChannel (Java Platform SE 7 ) 
(oracle.com)|https://docs.oracle.com/javase%2F7%2Fdocs%2Fapi%2F%2F/java/nio/channels/FileChannel.html#force(boolean)]
 does not mention copying. Sorry, it was my misunderstanding. 

 

I would also like to highlight another difference here. Java forces writes of 
4096 bytes with its native APIs for Direct-IO, while JNA allows the minimum 
block size supported by the disk to be used. Testing showed 512 bytes. This 
affects disk health under the following two conditions.
 # The head and tail parts need to be aligned to 4K. In the worst case, 8K 
bytes are written unnecessarily to commit the head/tail part of the buffer. 
With JNA, a maximum of 1024 bytes is written.
 # Under low activity, 4K bytes are still written, even for 100 bytes during a 
periodic timeout. Here JNA may write a maximum of 512 bytes.

The Direct-IO feature is very much needed, whether through the native API or 
JNA. Overall, it reduces kernel IO activity for commitlog files.

 

> Enable Direct I/O For CommitLog Files
> -------------------------------------
>
>                 Key: CASSANDRA-18464
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Local/Commit Log
>            Reporter: Josh McKenzie
>            Assignee: Amit Pawar
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>         Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation into the Commitlog I/O issue on large core count 
> systems in my previous email dated July-22; the link to the thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
>  # Multi-threaded syncing
>  # Using Direct-IO through JNA
> I worked on the 2nd option, considering the following benefits compared to 
> the first one
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O, 
> as learnt through FIO benchmarking.
>  # It reduces kernel file cache usage, which in turn reduces kernel I/O 
> activity for Commitlog files only.
>  # Overall CPU usage is reduced for flush activity. JVisualvm shows CPU usage 
> < 30% for the Commitlog syncer thread with the Direct I/O feature
>  # Direct I/O implementation is easier compared to multi-threaded syncing
> As per the community suggestion, lower code complexity is good to have. 
> Direct I/O enablement looked promising, but there was one issue. 
> Java version 8 does not have native support for enabling Direct I/O, so JNA 
> library usage is a must. The same implementation should also work across 
> other versions of Java (like 11 and beyond).
> I have completed the Direct I/O implementation, and a summary of the attached 
> patch changes is given below.
>  # This implementation does not use Java file channels; the file is opened 
> through JNA to use the Direct I/O feature.
>  # New segments are defined, named “DirectIOSegment” for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is for test 
> purposes only).
>  # The JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and the 
> platform-specific file. Currently tested on Linux only.
>  # The patch allows the user to configure the optimum block size and 
> alignment if the default values are not OK for the CommitLog disk.
>  # The following configuration options are provided in the cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  The test matrix is complex, so the CommitLog related testcases and the 
> TPCx-IOT benchmark were tested. It works with both Java 8 and 11 versions. 
> Compressed and encrypted segments are not supported yet; they can be enabled 
> later based on community feedback.
>  The following improvements are seen with Direct I/O enablement.
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  One more observation I would like to share here: reading Commitlog files 
> with Direct I/O might help in reducing node bring-up time after a node 
> crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables the Direct I/O feature for Commitlog files. 
> Please check and share your feedback.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
