On Wed, 25 Apr 2007 00:42:14 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote:
> On Wed, 25 Apr 2007 12:19:46 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> > wrote: > > > Make zonelist policy selectable from sysctl. > > > > Assume 2 node NUMA, only node(0) has ZONE_DMA (ZONE_DMA32). > > > > In this case, default (node0's) zonelist order is > > > > Node(0)'s NORMAL -> Node(0)'s DMA -> Node(1)"s NORMAL. > > > > This means Node(0)'s DMA is used before Node(1)'s NORMAL. > > > > In some server, some application uses large memory allcation. > > This exhaust memory in the above order. > > Then....sometimes OOM_KILL will occur when 32bit device requires memory. > > > > This patch adds sysctl for rebuilding zonelist after boot and doesn't change > > default zonelist order. > > hm. Why don't we use that ordering all the time? Does the present ordering > have > any advantage? > I don't know ;) maybe some high-end NUMA hardware has IOMMU and zoning by memory address has no meaning. > > command: > > %echo 0 > /proc/sys/vm/better_locality > > Who could resist having better locality? ;) > how about changing this name to strict_zone_order and if strict_zone_order = 1 Node(0)'NORMAL -> Node(1)'Normal -> Node(0)'DMA if strict_zone_order = 0 Node(0)'NORMAL -> Node(0)'DMA -> Node(1)'NORMAL If someone thinks of better name, please teach me. > > extern int percpu_pagelist_fraction; > > extern int compat_log; > > +#ifdef CONFIG_NUMA > > +extern int sysctl_better_locality; > > +#endif > > The ifdef isn't needed here. If something went wrong, we'll find out at > link-time. > Okay. > > /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and > > GID */ > > static int maxolduid = 65535; > > @@ -845,6 +848,15 @@ static ctl_table vm_table[] = { > > .extra1 = &zero, > > .extra2 = &one_hundred, > > }, > > + { > > + .ctl_name = VM_BETTER_LOCALITY, > > Please don't add new sysctls: use CTL_UNNUMBERED here. > Oh, I didn't know about CTL_UNNUMBERED. looks useful. I'll try. > > +static void build_zonelists(pg_data_t *pgdat) > > +{ > > + if (sysctl_better_locality) { > > + build_zonelists_locality_aware(pgdat); > > + } else { > > + build_zonelists_zone_aware(pgdat); > > + } > > Remove all the braces please. Okay. > > > @@ -207,6 +207,7 @@ enum > > VM_PANIC_ON_OOM=33, /* panic at out-of-memory */ > > VM_VDSO_ENABLED=34, /* map VDSO into new processes? */ > > VM_MIN_SLAB=35, /* Percent pages ignored by zone reclaim */ > > + VM_BETTER_LOCALITY=36, /* create locality-preference zonelist */ > > This can go away. > Okay. I'll wait for other replies and post updated one tomorrow. Thank you, -Kame - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/