Although, given that we now have an in-process page cache[1], this may not be needed anymore? This is only for the data file, though. I think it's been years(?) since we showed it helped, so perhaps someone should show whether this is still working/helping in the real world.

[1] https://issues.apache.org/jira/browse/CASSANDRA-5863

On Tue, Oct 18, 2016 at 11:59 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:

> Specifically regarding the behavior in different kernels, from `man
> posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then
> this was interpreted literally as "zero bytes", rather than as meaning "all
> bytes through to the end of the file"."
>
> On Oct 18, 2016, at 8:57 AM, Michael Kjellman <mkjell...@internalcircle.com> wrote:
>
> Right, so in SSTableReader#GlobalTidy$tidy it does:
> // don't ideally want to dropPageCache for the file until all instances have been released
> CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0);
> CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0);
>
> It seems to me every time the reference is released on a new sstable we
> would immediately tidy() it and then call posix_fadvise with
> POSIX_FADV_DONTNEED with an offset of 0 and a length of 0 (which I'm
> thinking is doing so in respect to the API behavior in modern Linux kernel
> builds?). Am I reading things correctly here? Sorta hard as there are many
> different code paths the reference could have tidy() called.
>
> Why would we want to drop the segment we just write from the page cache --
> wouldn't that most likely be the most hot data, and even if it turned out
> not to be wouldn't it be better in this case to have kernel be smart at
> what it's best at?
>
> best,
> kjellman
>
> On Oct 18, 2016, at 8:50 AM, Jake Luciani <jak...@gmail.com> wrote:
>
> The main point is to avoid keeping things in the page cache that are no
> longer needed like compacted data that has been early opened elsewhere.
>
> On Oct 18, 2016 11:29 AM, "Michael Kjellman" <mkjell...@internalcircle.com> wrote:
>
> We use posix_fadvise in a bunch of places, and in stereotypical Cassandra
> fashion no comments were provided.
>
> There is a check the OS is Linux (okay, a start) but it turns out the
> behavior of providing a length of 0 to posix_fadvise changed in some 2.6
> kernels. We don't check the kernel version -- or even note it.
>
> What is the *expected* outcome of our use of posix_fadvise -- not what
> does it do or not do today -- but what problem was it added to solve and
> what's the expected behavior regardless of kernel versions.
>
> best,
> kjellman
>
> Sent from my iPhone

--
http://twitter.com/tjake
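
For reference, the kernel-version concern quoted above could be handled roughly like the sketch below. It is only an illustration, not what Cassandra does: the class `FadviseLength` and both method names are hypothetical, and it assumes the `os.version` system property carries the Linux kernel release string (for example `4.4.0-31-generic`). The idea is to pass an explicit file size as `len` on kernels older than 2.6.6, where the man page says a length of 0 was taken literally as zero bytes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: picks the `len` argument to use
// when advising the kernel about a whole file via posix_fadvise.
class FadviseLength {
    // Leading "major.minor[.patch]" of a kernel release such as "2.6.32-431.el6.x86_64".
    private static final Pattern KERNEL_VERSION =
            Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    // True if the kernel is 2.6.6 or newer, i.e. len == 0 means
    // "all bytes through to the end of the file" per `man posix_fadvise`.
    static boolean zeroLengthMeansToEndOfFile(String osVersion) {
        Matcher m = KERNEL_VERSION.matcher(osVersion);
        if (!m.find()) {
            return false; // unrecognised format: assume the old, literal behaviour
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        if (major != 2) {
            return major > 2;
        }
        if (minor != 6) {
            return minor > 6;
        }
        return patch >= 6;
    }

    // Returns 0 where the kernel treats 0 as "to end of file",
    // otherwise the explicit file size so the whole file is still covered.
    static long lengthForWholeFile(String osVersion, long fileSize) {
        return zeroLengthMeansToEndOfFile(osVersion) ? 0L : fileSize;
    }

    public static void main(String[] args) {
        // Assumption: on Linux, os.version is the kernel release string.
        String osVersion = System.getProperty("os.version", "");
        System.out.println("len to pass for a 1024-byte file: "
                + lengthForWholeFile(osVersion, 1024L));
    }
}
```

Passing the real file size unconditionally would sidestep the version check altogether; the branch is kept here only to make the documented change at 2.6.6 explicit.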