On 25/11/13 10:38, Johannes Weiner wrote:
> The VM maintains cached filesystem pages on two types of lists.  One
> list holds the pages recently faulted into the cache, the other list
> holds pages that have been referenced repeatedly on that first list.
> The idea is to prefer reclaiming young pages over those that have
> shown to benefit from caching in the past.  We call the recently used
> list "inactive list" and the frequently used list "active list".
> 
> Currently, the VM aims for a 1:1 ratio between the lists, which is the
> "perfect" trade-off between the ability to *protect* frequently used
> pages and the ability to *detect* frequently used pages.  This means
> that working set changes bigger than half of cache memory go
> undetected and thrash indefinitely, whereas working sets bigger than
> half of cache memory are unprotected against used-once streams that
> don't even need caching.
> 
> Historically, every reclaim scan of the inactive list also took a
> smaller number of pages from the tail of the active list and moved
> them to the head of the inactive list.  This model gave established
> working sets more gracetime in the face of temporary use-once streams,
> but ultimately was not significantly better than a FIFO policy and
> still thrashed cache based on eviction speed, rather than actual
> demand for cache.
> 
> This patch solves one half of the problem by decoupling the ability to
> detect working set changes from the inactive list size.  By
> maintaining a history of recently evicted file pages it can detect
> frequently used pages with an arbitrarily small inactive list size,
> and subsequently apply pressure on the active list based on actual
> demand for cache, not just overall eviction speed.
> 
> Every zone maintains a counter that tracks inactive list aging speed.
> When a page is evicted, a snapshot of this counter is stored in the
> now-empty page cache radix tree slot.  On refault, the minimum access
> distance of the page can be assesed, to evaluate whether the page
> should be part of the active list or not.
> 
> This fixes the VM's blindness towards working set changes in excess of
> the inactive list.  And it's the foundation to further improve the
> protection ability and reduce the minimum inactive list size of 50%.
> 
> Signed-off-by: Johannes Weiner <han...@cmpxchg.org>
> ---

<snip>

> + *   fault ------------------------+
> + *                                 |
> + *              +--------------+   |            +-------------+
> + *   reclaim <- |   inactive   | <-+-- demotion |    active   | <--+
> + *              +--------------+                +-------------+    |
> + *                     |                                           |
> + *                     +-------------- promotion ------------------+
> + *
> + *
> + *           Access frequency and refault distance
> + *
> + * A workload is trashing when its pages are frequently used but they

"thrashing".

~Ryan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to