Support Direct I/O for SSTable writing
https://issues.apache.org/jira/browse/CASSANDRA-19707

On Mon, 16 Oct 2023 at 17:38, Jon Haddad <rustyrazorbl...@apache.org> wrote:

> Glad you brought up compaction here - I think there would be a significant
> benefit to moving compaction to direct i/o.
>
> On 2023/10/16 16:14:28 Benedict wrote:
> > I have some plans to (eventually) use the commit log as memtable payload
> > storage (i.e. memtables would reference the commit log entries directly,
> > storing only indexing info), and to back the first level of sstables by
> > reference to commit log entries. This will permit us to deliver not only
> > much bigger memtables (cutting compaction throughput requirements by the
> > ratio of size increase - so pretty dramatically), and faster flushing (so
> > better behaviour during write bursts), but also a fairly cheap and simple
> > way to support MVCC - which will be helpful for transaction throughput.
> >
> > There is also a new commit log (“journal”) coming with Accord, that the
> > rest of C* may or may not transition to.
> >
> > I only say this because this makes the utility of direct IO for the
> > commit log suspect, as we will be reading from the files as a matter of
> > course should this go ahead; and we may end up relying on a different
> > commit log implementation before long anyway.
> >
> > This is obviously a big suggestion and is not guaranteed to transpire,
> > and probably won’t within the next year or so, but it should perhaps form
> > some minimal part of any calculus. If the patch is otherwise simple and
> > beneficial I don’t have anything against it, and the use of direct IO
> > could well be of benefit, e.g. in compaction - and also in future if we
> > manage to bring page management in-process. So laying foundations there
> > could be of benefit, even if the commit log eventually does not use it.
> >
> > > On 16 Oct 2023, at 17:00, Jon Haddad <rustyrazorbl...@apache.org> wrote:
> > >
> > > I haven't looked at the patch, but at a high level, defaulting to
> > > direct I/O for commit logs makes a lot of sense to me.
> > >
> > >> On 2023/10/16 06:34:05 "Pawar, Amit" wrote:
> > >> [Public]
> > >>
> > >> Hi,
> > >>
> > >> CommitLog uses mmap (memory-mapped) segments by default. A Direct-IO
> > >> feature is proposed through a new PR [1] to improve CommitLog I/O
> > >> speed. Enabling this by default could be a useful feature to address
> > >> the I/O bottleneck seen during peak load.
> > >>
> > >> Need your input regarding changing this default. Please suggest.
> > >>
> > >> https://issues.apache.org/jira/browse/CASSANDRA-18464
> > >>
> > >> thanks,
> > >> Amit Pawar
> > >>
> > >> [1] - https://github.com/apache/cassandra/pull/2777
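
For readers less familiar with the two write paths this thread contrasts (memory-mapped segments versus direct I/O), here is a minimal Java sketch. It is not the code from the PR or from Cassandra's CommitLog. The class name `SegmentWriteSketch`, its method names, the 8192-byte segment size and the 4096-byte block size are all hypothetical choices for illustration. It relies only on the JDK's `FileChannel.map`, `ByteBuffer.alignedSlice` and `com.sun.nio.file.ExtendedOpenOption.DIRECT`, and it falls back to an ordinary buffered write when the file system rejects direct I/O (tmpfs, for example, may do so).

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SegmentWriteSketch {

    /** Rounds size up to the next multiple of blockSize; direct I/O needs aligned lengths. */
    static int alignUp(int size, int blockSize) {
        return ((size + blockSize - 1) / blockSize) * blockSize;
    }

    /** mmap path: map the whole segment, copy the payload in, then force it to storage. */
    static void writeMapped(Path file, byte[] payload, int segmentSize) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping READ_WRITE past the end of the file grows it to segmentSize.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, segmentSize);
            map.put(payload);
            map.force(); // ask the OS to write the dirty pages back
        }
    }

    /**
     * Direct path: write a block-aligned, zero-padded buffer through a channel
     * opened with ExtendedOpenOption.DIRECT. blockSize is an assumed value and
     * must be a power of two. Returns false if direct I/O was rejected and an
     * ordinary buffered channel was used instead.
     */
    static boolean writeDirect(Path file, byte[] payload, int blockSize) throws IOException {
        int padded = alignUp(Math.max(payload.length, 1), blockSize);
        // Over-allocate, then slice so the buffer's start address is block-aligned.
        ByteBuffer buf = ByteBuffer.allocateDirect(padded + blockSize - 1).alignedSlice(blockSize);
        buf.put(payload);
        buf.rewind(); // write the full padded block, not just the payload
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, ExtendedOpenOption.DIRECT)) {
            ch.write(buf, 0);
            if (buf.hasRemaining()) {
                throw new IOException("short direct write");
            }
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            // Fallback for file systems or platforms without direct I/O support.
            buf.rewind();
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE)) {
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "mutation-bytes".getBytes(StandardCharsets.UTF_8);
        Path dir = Files.createTempDirectory("segment-sketch");
        writeMapped(dir.resolve("mapped.log"), payload, 8192);
        boolean direct = writeDirect(dir.resolve("direct.log"), payload, 4096);
        System.out.println("direct I/O used: " + direct);
    }
}
```

The trade-off the thread discusses shows up in the shape of the code: the mapped path writes into the page cache and leaves write-back to the OS, while the direct path bypasses the page cache, at the cost of the caller managing aligned buffers and padded lengths itself.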