Oops... resending because I omitted everyone on CC.

2014-07-30 18:56 GMT+09:00 Vlastimil Babka <vba...@suse.cz>:
> On 07/30/2014 10:39 AM, Joonsoo Kim wrote:
>>
>> On Tue, Jul 29, 2014 at 05:34:37PM +0200, Vlastimil Babka wrote:
>>>
>>> Could do it in isolate_migratepages() for whole pageblocks only (as
>>> David's patch did), but that restricts the usefulness. Or maybe do
>>> it fine grained by calling isolate_migratepages_block() multiple
>>> times. But the overhead of multiple calls would probably suck even
>>> more for lower-order compactions. For CMA the added overhead is
>>> basically only checks for next_capture_pfn that will be always
>>> false, so predictable. And mostly just in branches where isolation
>>> is failing, which is not the CMA's "fast path" I guess?
>>
>>
>> You can do it fine grained with compact_control's migratepages list
>> or new private list. If some pages are isolated and added to this list,
>> you can check pfn of page on this list and determine appropriate capture
>> candidate page. This approach can give us more flexibility for
>> choosing capture candidate without adding more complexity to
>> common function. For example, you can choose capture candidate if
>> there are XX isolated pages in certain range.
>
>
> Hm I see. But the logic added by page capture was also a prerequisite for
> the "[RFC PATCH V4 15/15] mm, compaction: do not migrate pages when that
> cannot satisfy page fault allocation"
> http://marc.info/?l=linux-mm&m=140551859423716&w=2
>
> And that could be hardly done by a post-isolation inspection of the
> migratepages list. And I haven't given up on that idea yet :)

Okay. I haven't looked at that patch yet. I will look at it later :)

>
>>>> In __isolate_free_page(), we check zone_watermark_ok() with order 0.
>>>> But normal allocation logic would check zone_watermark_ok() with
>>>> requested
>>>> order. Your capture logic uses __isolate_free_page() and it would
>>>> affect compaction success rate significantly. And it means that
>>>> capture logic allocates high order page on page allocator
>>>> too aggressively compared to other component such as normal high order
>>>
>>>
>>> It's either that, or the extra lru drain that makes the difference.
>>> But the "aggressiveness" would in fact mean better accuracy.
>>> Watermark checking may be inaccurate. Especially when memory is
>>> close to the watermark and there is only a single high-order page
>>> that would satisfy the allocation.
>>
>>
>> If this "aggressiveness" means better accuracy, fixing general
>> function, watermark_ok() is better than adding capture logic.
>
>
> That's if fixing the function wouldn't add significant overhead to all the
> callers. And making it non-racy and not prone to per-cpu counter drifts
> would certainly do that :(
>
>
>> But, I guess that there is a reason that watermark_ok() is so
>> conservative. If page allocator aggressively provides high order page,
>> future atomic high order page request cannot succeed easily. For
>> preventing this situation, watermark_ok() should be conservative.
>
>
> I don't think it's intentionally conservative, just unreliable. It tests two
> things together:
>
> 1) are there enough free pages for the allocation wrt watermarks?
> 2) does it look like that there is a free page of the requested order?

I don't think that watermark_ok() is intended to check whether there is
a free page of the requested order. If we wanted to know that, we could
use a simpler way, something like below.

  X = (total number of free pages)
      - (number of free pages in blocks below the requested order)

If X is positive, we can conclude that there is at least one freepage of
the requested order, and this equation is easy to compute. But
watermark_ok() doesn't do that. Instead, it uses the mark value to
determine whether we can go further. I guess this means that the
allocation/reclaim logic wants to preserve a certain level of high-order
freepages according to system memory size, although I don't know exactly
what the reason is. So the "aggressiveness" of the capture logic here
could break what allocation/reclaim wants.

Thanks.
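
To make the contrast above concrete, here is a minimal userspace sketch in C
of the two checks. It is not kernel code: struct free_area_model,
total_free_pages(), has_free_block_of_order() and watermark_style_ok() are
hypothetical names for illustration, and MAX_ORDER = 11 is an assumed value.
The second check is only loosely modeled on the per-order loop in the
watermark check as I understand it; lowmem_reserve, alloc_flags and CMA
accounting are left out.

```c
#include <stdbool.h>

#define MAX_ORDER 11 /* assumption: buddy orders 0..10 */

/* Hypothetical model of a zone: nr_free[o] = number of free blocks of order o. */
struct free_area_model {
	unsigned long nr_free[MAX_ORDER];
};

/* Total number of free base pages across all orders. */
static unsigned long total_free_pages(const struct free_area_model *z)
{
	unsigned long total = 0;

	for (unsigned int o = 0; o < MAX_ORDER; o++)
		total += z->nr_free[o] << o;
	return total;
}

/*
 * The "X" computation from the mail:
 * X = total free pages - free pages sitting in blocks below 'order'.
 * X > 0 means at least one free block of 'order' or larger exists.
 */
static bool has_free_block_of_order(const struct free_area_model *z,
				    unsigned int order)
{
	unsigned long x;

	if (order >= MAX_ORDER)
		return false;

	x = total_free_pages(z);
	for (unsigned int o = 0; o < order; o++)
		x -= z->nr_free[o] << o;
	return x > 0;
}

/*
 * Simplified watermark-style check (an approximation, see the note above):
 * at each order below the requested one, the pages of that order are
 * discounted and the required reserve is halved.
 */
static bool watermark_style_ok(const struct free_area_model *z,
			       unsigned int order, unsigned long mark)
{
	long free_pages;
	long min = (long)mark;

	if (order >= MAX_ORDER)
		return false;

	free_pages = (long)total_free_pages(z) - ((1L << order) - 1);
	if (free_pages <= min)
		return false;

	for (unsigned int o = 0; o < order; o++) {
		free_pages -= (long)(z->nr_free[o] << o);
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}
```

With 100 free order-0 pages and a single free order-3 block, the X check says
an order-3 block exists, while the watermark-style check with mark = 64
refuses the request because the high-order remainder falls below the scaled
mark. That is the "preserve a certain level of high-order freepages"
behaviour I am guessing at above.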