I took the liberty of creating https://issues.apache.org/jira/browse/CASSANDRA-18464 linking to this email thread w/the contents of your email and applying the patch to that ticket. Probably want to have some lower level discussions there when we find you a reviewer.
On Tue, Apr 18, 2023, at 2:10 PM, Pawar, Amit wrote: > [Public] > > > Hi, > > I shared my investigation about Commitlog I/O issue on large core count > system in my previous email dated July-22 and link to the thread is given > below. > https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n > > Basically, two solutions looked possible to improve the CommitLog I/O. > 1. Multi-threaded syncing > 2. Using Direct-IO through JNA > > I worked on 2nd option considering the following benefit compared to the > first one > 1. Direct I/O read/write throughput is very high compared to non-Direct I/O. > Learnt through FIO benchmarking. > 2. Reduces kernel file cache uses which in-turn reduces kernel I/O activity > for Commitlog files only. > 3. Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < > 30% for Commitlog syncer thread with Direct I/O feature > 4. Direct I/O implementation is easier compared to multi-threaded > > As per the community suggestion, less in code complex is good to have. Direct > I/O enablement looked promising but there was one issue. > Java version 8 does not have native support to enable Direct I/O. So, JNA > library usage is must. The same implementation should also work across other > versions of Java (like 11 and beyond). > > I have completed Direct I/O implementation and summary of the attached patch > changes are given below. > 1. This implementation is not using Java file channels and file is opened > through JNA to use Direct I/O feature. > 2. New Segment are defined named “DirectIOSegment” for Direct I/O and > “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose > only). > 3. JNA write call is used to flush the changes. > 4. New helper functions are defined in NativeLibrary.java and platform > specific file. Currently tested on Linux only. > 5. Patch allows user to configure optimum block size and alignment if > default values are not OK for CommitLog disk. > 6. Following configuration options are provided in Cassandra.yaml file > 1. use_jna_for_commitlog_io : to use jna feature > 2. use_direct_io_for_commitlog : to use Direct I/O feature. > 3. direct_io_minimum_block_alignment: 512 (default) > 4. nvme_disk_block_size: 32MiB (default and can be changed as per the > required size) > > Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark > was tested. It works with both Java 8 and 11 versions. Compressed and > Encrypted based segments are not supported yet and it can be enabled later > based on the Community feedback. > > Following improvement are seen with Direct I/O enablement. > 1. 32 cores >= ~15% > 2. 64 cores >= ~80% > > Also, another observation would like to share here. Reading Commitlog files > with Direct I/O might help in reducing node bring-up time after the node > crash. > > Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07 > > The attached patch enables Direct I/O feature for Commitlog files. Please > check and share your feedback. > > Thanks, > Amit > > *Attachments:* > • UseDirectIOFeatureForCommitLogFiles.patch