On Mon, Mar 18, 2013 at 07:58:27PM +0800, Wanpeng Li wrote:
> On Sun, Mar 17, 2013 at 01:04:13PM +0000, Mel Gorman wrote:
> >Historically, kswapd used to congestion_wait() at higher priorities if it
> >was not making forward progress. This made no sense as the failure to make
> >progress could be completely independent of IO. It was later replaced by
> >wait_iff_congested() and removed entirely by commit 258401a6 (mm: don't
> >wait on congested zones in balance_pgdat()) as it was duplicating logic
> >in shrink_inactive_list().
> >
> >This is problematic. If kswapd encounters many pages under writeback and
> >it continues to scan until it reaches the high watermark then it will
> >quickly skip over the pages under writeback and reclaim clean young
> >pages or push applications out to swap.
> >
> >The use of wait_iff_congested() is not suited to kswapd as it will only
> >stall if the underlying BDI is really congested or a direct reclaimer was
> >unable to write to the underlying BDI. kswapd bypasses the BDI congestion
> >as it sets PF_SWAPWRITE but even if this was taken into account then it
> >would cause direct reclaimers to stall on writeback which is not desirable.
> >
> >This patch sets a ZONE_WRITEBACK flag if direct reclaim or kswapd is
> >encountering too many pages under writeback. If this flag is set and
> >kswapd encounters a PageReclaim page under writeback then it'll assume
> >that the LRU lists are being recycled too quickly before IO can complete
> >and block waiting for some IO to complete.
> >
> >Signed-off-by: Mel Gorman <mgor...@suse.de>
> >---
> > include/linux/mmzone.h |  8 ++++++++
> > mm/vmscan.c            | 29 ++++++++++++++++++++++++-----
> > 2 files changed, 32 insertions(+), 5 deletions(-)
> >
> >diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >index edd6b98..c758fb7 100644
> >--- a/include/linux/mmzone.h
> >+++ b/include/linux/mmzone.h
> >@@ -498,6 +498,9 @@ typedef enum {
> >     ZONE_DIRTY,             /* reclaim scanning has recently found
> >                              * many dirty file pages
> >                              */
> >+    ZONE_WRITEBACK,         /* reclaim scanning has recently found
> >+                             * many pages under writeback
> >+                             */
> > } zone_flags_t;
> >
> > static inline void zone_set_flag(struct zone *zone, zone_flags_t flag)
> >@@ -525,6 +528,11 @@ static inline int zone_is_reclaim_dirty(const struct zone *zone)
> >     return test_bit(ZONE_DIRTY, &zone->flags);
> > }
> >
> >+static inline int zone_is_reclaim_writeback(const struct zone *zone)
> >+{
> >+    return test_bit(ZONE_WRITEBACK, &zone->flags);
> >+}
> >+
> > static inline int zone_is_reclaim_locked(const struct zone *zone)
> > {
> >     return test_bit(ZONE_RECLAIM_LOCKED, &zone->flags);
> >diff --git a/mm/vmscan.c b/mm/vmscan.c
> >index 493728b..7d5a932 100644
> >--- a/mm/vmscan.c
> >+++ b/mm/vmscan.c
> >@@ -725,6 +725,19 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >
> >         if (PageWriteback(page)) {
> >             /*
> >+             * If reclaim is encountering an excessive number of
> >+             * pages under writeback and this page is both under
>
> Should the comment be changed to "encountered an excessive number of
> pages under writeback or this page is both under writeback and PageReclaim"?
> See below:
>

I intended to check for PageReclaim as well but it got lost in a merge
error. Fixed now.

-- 
Mel Gorman
SUSE Labs
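
For readers following the thread, the condition described in the changelog
(with the PageReclaim test included, as the reply says was intended) can be
sketched as a small userspace model. This is only an illustration under
stated assumptions: model_zone, model_page, MODEL_ZONE_WRITEBACK and
kswapd_should_wait_on_writeback() are hypothetical stand-ins, not kernel
definitions, and the real hunk in mm/vmscan.c is truncated in the quote
above, so nothing here should be read as the patch's actual code.

```c
#include <stdbool.h>

/*
 * Userspace stand-ins for illustration only. These are NOT the kernel's
 * struct zone, struct page or zone_flags_t.
 */
enum { MODEL_ZONE_WRITEBACK = 1u << 0 };

struct model_zone {
    unsigned int flags;       /* models the ZONE_WRITEBACK zone flag */
};

struct model_page {
    bool under_writeback;     /* models PageWriteback(page) */
    bool reclaim_tagged;      /* models PageReclaim(page) */
};

/*
 * Returns true only when every condition named in the changelog holds:
 * the caller is kswapd, the zone has been flagged as seeing too many
 * pages under writeback, and this page is both under writeback and
 * tagged for reclaim. In that case the changelog says kswapd blocks
 * until some IO completes.
 */
static bool kswapd_should_wait_on_writeback(const struct model_zone *zone,
                                            const struct model_page *page,
                                            bool current_is_kswapd)
{
    if (!page->under_writeback)
        return false;
    if (!current_is_kswapd)
        return false;         /* direct reclaimers are not stalled here */
    if (!(zone->flags & MODEL_ZONE_WRITEBACK))
        return false;
    return page->reclaim_tagged;
}
```

In this model, dropping the reclaim_tagged test reproduces the merge error
mentioned in the reply: kswapd would then wait on any page under writeback
in a flagged zone, not only on pages that reclaim had already tagged.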