Let’s please not change the default at the same time the feature is introduced.
Making the capability available will allow users to evaluate and quantify the benefit of the work, as well as to call out any unintended consequences. As users and the project gain confidence in the results, we can evaluate changing the default.
– Scott
On Oct 17, 2023, at 4:25 AM, guo Maxwell <cclive1...@gmail.com> wrote:
-1
I still think we should keep it as it is until the direct io for commitlog (read and write) is ready and relatively stable. And then we may change the default value to direct io from mmap in a future version, such as 5.2, or 6.0.
[AMD Official Use Only - General]
Thank you all for your input. I received 6 replies in total; the summary is below.
1. Mmap : 2/6
2. Direct-I/O : 4/6
Should the default be changed to Direct-IO, then? Please confirm.
Thanks,
Amit
Strongly agree with this point of view that direct IO can bring great benefits.
I have reviewed part of the code, and my preliminary judgment is that the feature is not yet general and is limited in some situations; for example, this patch covers only the commitlog's write path. So I suggest that the default value not be modified until the entire feature is comprehensive and stable, and that it then be changed in a future version.
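For readers less familiar with the two access modes being compared, here is a rough Java sketch of the difference on the write path. It is not the code from the patch: the class and method names (`SegmentWriteSketch`, `writeMmap`, `writeDirect`) are made up for illustration, and it assumes JDK 10 or later and a filesystem that accepts `ExtendedOpenOption.DIRECT` (some, such as tmpfs on older kernels, reject it).

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not Cassandra's CommitLog implementation.
final class SegmentWriteSketch {

    /** mmap path: the write lands in page-cache pages backing the mapping. */
    static void writeMmap(Path file, byte[] payload) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, payload.length);
            map.put(payload);
            map.force(); // ask the OS to flush the dirty pages to storage
        }
    }

    /**
     * Direct path: the write bypasses the page cache. The buffer address,
     * file position and transfer size must all be block-aligned, so the
     * payload is zero-padded up to a whole number of blocks.
     * Returns the number of bytes written (the padded size).
     */
    static int writeDirect(Path file, byte[] payload) throws IOException {
        int block = (int) Files.getFileStore(file.toAbsolutePath().getParent()).getBlockSize();
        int padded = ((payload.length + block - 1) / block) * block;
        // Over-allocate, then take a slice whose start address is block-aligned.
        ByteBuffer buf = ByteBuffer.allocateDirect(padded + block - 1).alignedSlice(block);
        buf.put(payload);
        buf.position(0);
        buf.limit(padded);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, ExtendedOpenOption.DIRECT)) {
            int written = 0;
            while (buf.hasRemaining()) {
                written += ch.write(buf);
            }
            return written;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("segment-sketch");
        byte[] payload = "example mutation".getBytes(StandardCharsets.UTF_8);
        writeMmap(dir.resolve("mmap.log"), payload);
        try {
            int n = writeDirect(dir.resolve("direct.log"), payload);
            System.out.println("direct write of " + n + " bytes");
        } catch (IOException | UnsupportedOperationException e) {
            System.out.println("direct I/O not available here: " + e);
        }
    }
}
```

Note that the sketch covers writes only, which matches the scope described above; reading a segment back is a separate path.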
+1. Would love to see this get traction.
Glad you brought up compaction here - I think there would be a significant benefit to moving compaction to direct i/o.
On 2023/10/16 16:14:28 Benedict wrote:
> I have some plans to (eventually) use the commit log as memtable payload storage (ie memtables would reference the commit log entries directly, storing only indexing info), and to back first level of sstables by reference to commit log entries. This will permit us to deliver not only much bigger memtables (cutting compaction throughput requirements by the ratio of size increase - so pretty dramatically), and faster flushing (so better behaviour during write bursts), but also a fairly cheap and simple way to support MVCC - which will be helpful for transaction throughput.
>
> There is also a new commit log (“journal”) coming with Accord, that the rest of C* may or may not transition to.
>
> I only say this because this makes the utility of direct IO for commit log suspect, as we will be reading from the files as a matter of course should this go ahead; and we may end up relying on a different commit log implementation before long anyway.
>
> This is obviously a big suggestion and is not guaranteed to transpire, and probably won’t within the next year or so, but it should perhaps form some minimal part of any calculus. If the patch is otherwise simple and beneficial I don’t have anything against it, and the use of direct IO could well be of benefit eg in compaction - and also in future if we manage to bring page management in-process. So laying foundations there could be of benefit, even if the commit log eventually does not use it.
>
> > On 16 Oct 2023, at 17:00, Jon Haddad <rustyrazorbl...@apache.org> wrote:
> >
> > I haven't looked at the patch, but at a high level, defaulting to direct I/O for commit logs makes a lot of sense to me.
> >
> >> On 2023/10/16 06:34:05 "Pawar, Amit" wrote:
> >> [Public]
> >>
> >> Hi,
> >>
> >> CommitLog uses mmap (memory-mapped) segments by default. A Direct-IO feature is proposed through a new PR[1] to improve CommitLog IO speed. Enabling this by default could be a useful way to address the IO bottleneck seen during peak load.
> >>
> >> I need your input on changing this default. Please advise.
> >>
> >>
> >> https://issues.apache.org/jira/browse/CASSANDRA-18464
> >>
> >> thanks,
> >> Amit Pawar
> >>
> >> [1] - https://github.com/apache/cassandra/pull/2777
> >>
>
--
you are the apple of my eye !