On 11/5/20 9:55 AM, Alex Shi wrote:
Currently, compaction takes the lru_lock and then does page isolation,
which works fine with pgdat->lru_lock, since any page isolation would
compete for the lru_lock. If we want to change to memcg lru_lock, we
have to isolate the page before getting lru_lock, so that isolation
blocks the page's memcg change, which relies on page isolation too. Then
we can safely use the per-memcg lru_lock later.

The new page isolation uses the previously introduced TestClearPageLRU() +
pgdat lru locking, which will be changed to memcg lru lock later.

Hugh Dickins <[email protected]> fixed the following bugs in an early
version of this patch:

Fix lots of crashes under compaction load: isolate_migratepages_block()
must clean up appropriately when rejecting a page, setting PageLRU again
if it had been cleared; and a put_page() after get_page_unless_zero()
cannot safely be done while holding locked_lruvec - it may turn out to
be the final put_page(), which will take an lruvec lock when PageLRU.
And move __isolate_lru_page_prepare back after get_page_unless_zero to
make trylock_page() safe:
trylock_page() is not safe to use at this time: its setting PG_locked
can race with the page being freed or allocated ("Bad page"), and can
also erase flags being set by one of those "sole owners" of a freshly
allocated page who use non-atomic __SetPageFlag().

Suggested-by: Johannes Weiner <[email protected]>
Signed-off-by: Alex Shi <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: [email protected]
Cc: [email protected]

Acked-by: Vlastimil Babka <[email protected]>

A question below:

@@ -979,10 +995,6 @@ static bool too_many_isolated(pg_data_t *pgdat)
                                        goto isolate_abort;
                        }
-                       /* Recheck PageLRU and PageCompound under lock */
-                       if (!PageLRU(page))
-                               goto isolate_fail;
-
                        /*
                         * Page become compound since the non-locked check,
                         * and it's on LRU. It can only be a THP so the order
@@ -990,16 +1002,13 @@ static bool too_many_isolated(pg_data_t *pgdat)
                         */
                        if (unlikely(PageCompound(page) && !cc->alloc_contig)) {
                                low_pfn += compound_nr(page) - 1;
-                               goto isolate_fail;
+                               SetPageLRU(page);
+                               goto isolate_fail_put;
                        }

IIUC the danger here is that khugepaged will collapse a THP. For that,
isolate_lru_page() has to succeed in __collapse_huge_page_isolate(). Under
the new scheme that shouldn't be possible, right? If that's correct, can we
remove this part?
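
To make the reasoning concrete, here is a minimal userspace sketch of the flag protocol. It is not kernel code. It assumes that every isolation path (compaction and isolate_lru_page() alike) gates on the same atomic test-and-clear of the LRU flag. The names page_model, PG_LRU_MODEL, test_clear_lru(), set_lru() and run_model() are hypothetical stand-ins for illustration only.

```c
/* Userspace model only: illustrates the flag protocol, not kernel code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PG_LRU_MODEL (1UL << 0)	/* hypothetical bit, not the kernel's PG_lru */

struct page_model {
	atomic_ulong flags;
};

/* Models TestClearPageLRU(): atomically clear the bit, report if it was set. */
static bool test_clear_lru(struct page_model *page)
{
	unsigned long old = atomic_fetch_and(&page->flags, ~PG_LRU_MODEL);

	return (old & PG_LRU_MODEL) != 0;
}

/* Models SetPageLRU(): hand the page back so it can be isolated again. */
static void set_lru(struct page_model *page)
{
	atomic_fetch_or(&page->flags, PG_LRU_MODEL);
}

struct race_result {
	bool first_isolator_won;	/* compaction-like caller */
	bool second_isolator_won;	/* khugepaged-like caller */
	bool retry_after_restore_won;	/* attempt after set_lru() */
};

static struct race_result run_model(void)
{
	struct page_model page;
	struct race_result r;

	atomic_init(&page.flags, PG_LRU_MODEL);
	assert(atomic_load(&page.flags) & PG_LRU_MODEL);

	/* The first caller clears the flag and now owns the isolation. */
	r.first_isolator_won = test_clear_lru(&page);

	/* A second caller sees the flag already clear and must back off. */
	r.second_isolator_won = test_clear_lru(&page);

	/* The first caller rejects the page and restores the flag. */
	set_lru(&page);

	/* Only now can another isolation attempt succeed. */
	r.retry_after_restore_won = test_clear_lru(&page);

	return r;
}
```

In this model a second isolator cannot succeed between the first caller's test-and-clear and its set_lru(), which is the window the question is about. Whether the real code has other paths that can make the page compound in that window is not something the sketch can show.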
