On Sun, 18 Mar 2007, Andrew Morton wrote:
On Sun, 18 Mar 2007 11:35:36 +0000 (GMT) Mel Gorman <[EMAIL PROTECTED]> wrote:
But let me leap ahead of myself.
CONFIG_PAGE_GROUP_BY_MOBILITY
Why does this config item exist? It's not good to have some mysterious
knob which affects mm behaviour at compile time. We need to make up our
minds and stick with it.
The configuration item exists because there were concerns over the memory
footprint and cache line footprint. It was introduced to address that
concern and also so that it would be possible to compare the performance
behavior of anti-fragmentation. Your comment rang a bell though so I
searched the archives to see this comment from Andi Kleen;
===
If anything this should be a boot time option or perhaps sysctl, not a
config. In general CONFIGs that change runtime behaviour are evil - just
makes changing the option more painful, causes problems for distribution
users, doesn't make much sense, etc.etc.
Also #ifdef as a documentation device is a really really scary concept.
Yuck.
===
A sysctl would avoid any cache line footprint but not the memory overhead
because the freelists in struct zone as those freelists would still exist.
I could make the option depend on CONFIG_EMBEDDED for the zone overhead.
Would that make sense or would it be preferable to ditch the option
altogether?
I'll start looking at doing a sysctl so it can be disabled at runtime if
necessary. I strongly suspect that it cannot be enabled again once
disabled but I don't see that as a problem as such.
How much additional memory consumption are we expecting here?
Short answer, about 1.5KB on a 1GB system of which 1.3KB is statically
defined in the 3 struct zones on a 1 node x86 system.
Longer answer that I hopefully have not made any mistakes in - There is
the zone overhead which is statically sized and a runtime overhead which
depends on the amount of memory in the system. The additional zone
overhead is the overhead for additional freelists (larger struct
free_area) and is as follows;
(MIGRATE_TYPES-1) * sizeof(list_head) * (MAX_ORDER-1)
so, on 32 bit in general, thats
4 * 8 * 10 = 320 bytes per zone (would be 240 bytes if MIGRATE_RESERVE is
sufficient for higher order allocations
instead of MIGRATE_HIGHALLOC)
on x86 with DMA, Normal and HighMem, thats 1280 bytes. On a NUMA system,
it's 1280 bytes per node. On 64 bit, it would be double because of the
larger pointer size. At worst, I guess you are looking at 3KB per node.
The size of the bitmap for the flags depends on the amount of memory. On
FLATMEM and DISCONTIG, it's 3 bits per MAX_ORDER_NR_PAGES spanned by the
zones (note spanned, not present). On a 1GiB system without holes on an
x86 with MAX_ORDER of 11, that would be 96 bytes. The calculation is more
complex for sparsemem but will be at least 4 bytes per active section for
a pointer and another 4 bytes for most bitmaps. I have a patch that uses
the pointer itself when the bitmap is small enough to be contained in 4
bytes (which is usually is) so it could be just 4 bytes per active section
(4 bytes per 16MB on powerpc for example).
Whether it's runtime or compile-time, the optionality is not good.
The main situation where I think it would be desirable to disable
anti-fragmentation is for zones that are smaller than
MIGRATE_TYPES*MAX_ORDER_NR_PAGES and this is for performance reasons to
avoid regularly entering __rmqueue_fallback(). However, that situation
could be automatically detected and handled by having
allocflags_to_migratetype() always return MIGRATE_UNMOVABLE for small
zones and similar for get_pageblock_migratetype(). There would be no need
for a sysctl.
If a sysctl was available for cases where profiles showed a problem with
the bitmap for example, it would work by having
allocflags_to_migratetype() and get_pageblock_migratetype() always return
MIGRATE_UNMOVABLE so it would not be totally different code paths that are
being executed in the enabled/disabled cases.
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/