On Tue, 06 Mar 2007 12:10:49 +0000 P__draig Brady <[EMAIL PROTECTED]> wrote:
> Andrew Morton wrote: > > Yes. Let's flesh it out the backup program policy some more: > > > > - Unconditionally invalidate output files > > > > - on entry to read(), probe pagecache, record which pages in the range are > > present > > > > - on entry to next read(), shoot down those pages from the previous read > > which weren't in pagecache. > > > > - But we can do better! LRU the page's files up to a certain number of > > pages. > > > > - Once that point is exceeded, we need to reclaim some pages. Which > > ones? Well, we've been observing all reads, so we can record which pages > > were referenced once, and which ones were referenced multiple times so we > > can do arbitrarily complex page aging in there. > > > > - On close(), nuke all pages which weren't in core during open(), even if > > this app referenced them multiple times. > > > > - If the backup program decided to read its input files with mmap we're > > rather screwed. We can't intercept pagefaults so the best we can do is > > to restore the file's pagecache to its previous state on close(). > > > > Or if it's really a problem, get control in there somehow and > > periodically poll the pagecache occupancy via mincore(), use madvise() > > then fadvise() to trim it back. > > > > That all sounds reasonably doable. It'd be pretty complex to do it > > in-kernel but we could do it there too. Problem is if course that the > > above strategy is explicitly optimised for the backup program and if it's > > in-kernel it becomes applicable to all other workloads. > > I can see the above being possible, but I can't see the reason > for exposing that complexity to userspace. That's sophistication, not complexity. It doesn't have to do all that stuff to be effective. > If I'm the target > audience for that API then it's broken as I'd mess it up, > or would take too long to get it right. > > Can't we just fix the posix_fadvise() implementation to > only evict pages paged in by the current process. The kernel doesn't have that information. > Perhaps one could possibly just evict pages with _mapcount==0 ? That is the present fadvise(FADV_DONTNEED) behaviour. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/