Re: [linux-next-20130422] Bug in SLAB?

2013-05-03 Thread Christoph Lameter
On Sat, 4 May 2013, Tetsuo Handa wrote: > Subject: slab: Return NULL for oversized allocations > (Date: Fri, 3 May 2013 15:43:18 +) > > and > > Subject: Fix bootstrap creation of kmalloc caches > (Date: Fri, 3 May 2013 18:04:18 +) > > on linux-next-20130426 made my kernel boots fin

Re: [linux-next-20130422] Bug in SLAB?

2013-05-06 Thread Christoph Lameter
On Sat, 4 May 2013, Tetsuo Handa wrote: > Christoph Lameter wrote: > > Ok could I see the kernel logs with the warnings? > Sure. These are exclusively from the module load. So the kernel seems to be clean of large kmalloc's ?

Re: [linux-next-20130422] Bug in SLAB?

2013-05-06 Thread Christoph Lameter
On Mon, 6 May 2013, Pekka Enberg wrote: > This doesn't seem to apply against slab/next branch. What tree did you > use to generate the patch? Slab/next from a couple of weeks ago.

Re: [linux-next-20130422] Bug in SLAB?

2013-05-07 Thread Christoph Lameter
On Tue, 7 May 2013, Tetsuo Handa wrote: > > These are exclusively from the module load. So the kernel seems to be > > clean of large kmalloc's ? > > > There are modules (e.g. TOMOYO) which do not check for KMALLOC_MAX_SIZE limit > and expect kmalloc() larger than KMALLOC_MAX_SIZE bytes to return N

Re: [linux-next-20130422] Bug in SLAB?

2013-05-07 Thread Christoph Lameter
On Mon, 6 May 2013, Pekka Enberg wrote: > On Fri, May 3, 2013 at 9:04 PM, Christoph Lameter wrote: > > - for (i = KMALLOC_SHIFT_LOW; i < KMALLOC_SHIFT_HIGH; i++) > > This didn't match what I had in my tree. I fixed it by hand but please > verify the end result:

Re: [linux-next-20130422] Bug in SLAB?

2013-05-07 Thread Christoph Lameter
On Tue, 7 May 2013, Pekka Enberg wrote: > On Tue, May 7, 2013 at 5:23 PM, Christoph Lameter wrote: > > Well this is because you did not take the patch that changed the way > > KMALLOC_SHIFT_HIGH is treated. > > Is that still needed? I only took the ones Tetsuo said were need

Re: [GIT PULL] SLAB changes for v3.10

2013-05-08 Thread Christoph Lameter
On Tue, 7 May 2013, Tony Lindgren wrote: > OK got it narrowed down to CONFIG_DEBUG_SPINLOCK=y causing the problem > with commit 8a965b3b. Ain't nothing like bisecting and booting and then > diffing .config files on top of that. > > > Without reverting 8a965b3b I'm getting: The patch (commit 8a965

Re: [GIT PULL] SLAB changes for v3.10

2013-05-08 Thread Christoph Lameter
> The 1.4.0 version in Ubuntu 13.04 is not good enough? qemu 1.4.0 reproduces the bug here on arm. And Chris Mason's patch fixes it.

Re: [GIT PULL] SLAB changes for v3.10

2013-05-08 Thread Christoph Lameter
specific sizes, it will only use those during very early bootstrap. The later creation of the array must skip those. You correctly moved the checks out of the if (!kmalloc_cacheS()) condition so that the caches are created properly. Acked-by: Christoph Lameter

Re: [PATCH 09/22] mm: page allocator: Allocate/free order-0 pages from a per-zone magazine

2013-05-08 Thread Christoph Lameter
On Wed, 8 May 2013, Mel Gorman wrote: > 1. IRQs do not have to be disabled to access the lists reducing IRQs >disabled times. The per cpu structure access also would not need to disable irq if the fast path would be using this_cpu ops. > 2. As the list is protected by a spinlock, it is not n

Re: [GIT PULL] SLAB changes for v3.10

2013-05-08 Thread Christoph Lameter
On Wed, 8 May 2013, Chris Mason wrote: > > You correctly moved the checks out of the if (!kmalloc_cacheS()) > > condition so that the caches are created properly. > > But if the ordering is required at all, why is it ok to create cache 2 > after cache 6 instead of after cache 7? The power of two

Re: [linux-next-20130422] Bug in SLAB?

2013-05-09 Thread Christoph Lameter
goes too far in initializing values in kmalloc_caches because it assumed that the size of the kmalloc array goes up to MAX_ORDER. However, the size of the kmalloc array for SLAB may be restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER. Reported-by: Tetsuo Handa Signed-off-by:

Re: [PATCH 09/22] mm: page allocator: Allocate/free order-0 pages from a per-zone magazine

2013-05-09 Thread Christoph Lameter
On Thu, 9 May 2013, Mel Gorman wrote: > > > > The per cpu structure access also would not need to disable irq if the > > fast path would be using this_cpu ops. > > > > How does this_cpu protect against preemption due to interrupt? this_read() > itself only disables preemption and it's explicitly

Re: [RFC PATCH 00/22] Per-cpu page allocator replacement prototype

2013-05-09 Thread Christoph Lameter
On Thu, 9 May 2013, Dave Hansen wrote: > BTW, I really like the 'magazine' name. It's not frequently used in > this kind of context and it conjures up a nice mental image whether it > be of stacks of periodicals or firearm ammunition clips. The term "magazine" was prominently used in the Bonwick

Re: [PATCH 09/22] mm: page allocator: Allocate/free order-0 pages from a per-zone magazine

2013-05-09 Thread Christoph Lameter
On Thu, 9 May 2013, Mel Gorman wrote: > > It would be useful if the allocator would hand out pages from the > > same physical area first. This would reduce fragmentation as well and > > since it is likely that numerous pages are allocated for some purpose > > (given that the page sizes of 4k a

Re: [linux-next-20130422] Bug in SLAB?

2013-05-10 Thread Christoph Lameter
array goes up to MAX_ORDER. However, the size of the kmalloc array for SLAB may be restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER. Reported-by: Tetsuo Handa Signed-off-by: Christoph Lameter Index: linux/mm/slab.c

Re: [PATCH 2/2] nohz: Add basic tracing

2013-04-23 Thread Christoph Lameter
On Mon, 22 Apr 2013, Frederic Weisbecker wrote: > It's not obvious to find out why the full dynticks subsystem > doesn't always stop the tick: whether this is due to kthreads, > posix timers, perf events, etc... > > These new tracepoints are here to help the user diagnose > the failures and test

Re: OOM-killer and strange RSS value in 3.9-rc7

2013-04-24 Thread Christoph Lameter
On Wed, 24 Apr 2013, Michal Hocko wrote: > [CCing SL.B people and linux-mm list] > > Just for quick summary. The reporter sees OOM situations with almost > whole memory filled with slab memory. This is a powerpc machine with 4G > RAM. Boot with "slub_debug" or enable slab debugging. > /proc/slab

Re: OOM-killer and strange RSS value in 3.9-rc7

2013-04-25 Thread Christoph Lameter
On Thu, 25 Apr 2013, Han Pingtian wrote: > I have enabled "slub_debug" and here is the > /sys/kernel/slab/kmalloc-512/alloc_calls contents: > > 50 .__alloc_workqueue_key+0x90/0x5d0 age=113630/116957/119419 pid=1-1730 > cpus=0,6-8,13,24,26,44,53,57,60,68 nodes=1 > 11 .__alloc_workqueue_k

Re: OOM-killer and strange RSS value in 3.9-rc7

2013-04-25 Thread Christoph Lameter
On Thu, 25 Apr 2013, Han Pingtian wrote: > > A dump of the other fields in /sys/kernel/slab/kmalloc*/* would also be > > useful. > > > I have dumpped all /sys/kernel/slab/kmalloc*/* in kmalloc.tar.xz and > will attach it to this mail. Ok that looks like a lot of objects were freed from slab pages

Re: OOM-killer and strange RSS value in 3.9-rc7

2013-04-26 Thread Christoph Lameter
On Fri, 26 Apr 2013, Han Pingtian wrote: > Could you give me some hints about how to verify them? All I can do is > adding two printk() statements to print the values in those two > functions: Ok that's good. nr->partial needs to be bigger than min_partial in order for frees to occur. So they do

Re: OOM-killer and strange RSS value in 3.9-rc7

2013-04-29 Thread Christoph Lameter
On Sat, 27 Apr 2013, Will Huck wrote: > Hi Christoph, > On 04/26/2013 01:17 AM, Christoph Lameter wrote: > > On Thu, 25 Apr 2013, Han Pingtian wrote: > > > > > I have enabled "slub_debug" and here is the > > > /sys/kernel/slab/kmalloc-

Re: OOM-killer and strange RSS value in 3.9-rc7

2013-04-29 Thread Christoph Lameter
On Sat, 27 Apr 2013, Han Pingtian wrote: > and it is called so many times that the boot cannot be finished. So > maybe the memory isn't freed even though __free_slab() gets called? Ok that suggests an issue with the page allocator then.

Re: [linux-next-20130422] Bug in SLAB?

2013-04-29 Thread Christoph Lameter
On Mon, 29 Apr 2013, Glauber Costa wrote: > >> causes no warning at compile time and returns NULL at runtime. But > >> > >> unsigned int size = 8 * 1024 * 1024; > >> kmalloc(size, GFP_KERNEL); > >> > >> causes compile time warning > >> > >> include/linux/slab_def.h:136: warning: array subscr

Re: [linux-next-20130422] Bug in SLAB?

2013-04-29 Thread Christoph Lameter
On Mon, 29 Apr 2013, Glauber Costa wrote: > On 04/29/2013 06:59 PM, Christoph Lameter wrote: > > The code in kmalloc_index() creates a BUG() and preferentially should > > create a compile time failure when a number that is too big is passed to it. > > > > What is MA

Re: [linux-next-20130422] Bug in SLAB?

2013-04-29 Thread Christoph Lameter
f.h:136: warning: array subscript is above array bounds > > and > > BUG: unable to handle kernel NULL pointer dereference at 0058 > IP: [] kmem_cache_alloc+0x26/0xb0 > > . > > Christoph Lameter wrote: > > What is MAX_ORDER on the architecture? > > In my

Re: [PATCH 2/3] mm, slub: count freed pages via rcu as this task's reclaimed_slab

2013-04-11 Thread Christoph Lameter
On Thu, 11 Apr 2013, Simon Jeons wrote: > It seems that I need to simplify my question. > All pages whose order >= 1 are compound pages? In the slub allocator that is true. One can request and free a series of contiguous pages that are not compound pages from the page allocator and a couple of subs

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-04-11 Thread Christoph Lameter
On Thu, 11 Apr 2013, Steven Rostedt wrote: > I was wondering if you made any more forward progress with the patch > yet. When it goes into mainline, I'd like to backport it to the -rt > stable trees, and will probably make it enabled by default when > PREEMPT_RT is enabled. Sorry I did not get a

Re: [PATCH 4/4] nohz: New option to force all CPUs in full dynticks range

2013-04-11 Thread Christoph Lameter
On Thu, 11 Apr 2013, Frederic Weisbecker wrote: > If there is no performance issue with that I'm all for it. Or have a CONFIG_LOWLATENCY that boots up a kernel with the proper configuration?

Re: [PATCH 4/4] nohz: New option to force all CPUs in full dynticks range

2013-04-12 Thread Christoph Lameter
On Thu, 11 Apr 2013, Frederic Weisbecker wrote: > It may be too general for a naming. But I don't mind just > selecting CONFIG_RCU_NOCBS_ALL unconditionally. It's easily > changed in the future if anybody complains. I like the general nature of that config option since it removes the need to con

Re: [PATCH documentation 1/2] nohz1: Add documentation.

2013-04-15 Thread Christoph Lameter
On Fri, 12 Apr 2013, Arjan van de Ven wrote: > but arguably, that's because of HRTIMERS more than NOHZ > (e.g. I bet we still turn off periodic even for nohz as long as hrtimers are > enabled) If we are able to only get rid of one timer tick on average with dynticks then I would think that is eno

Re: [PATCH 4/4] nohz: New option to force all CPUs in full dynticks range

2013-04-15 Thread Christoph Lameter
On Mon, 15 Apr 2013, Ingo Molnar wrote: > > Ok. But all these complicated things would go away if we had an option > > CONFIG_LOWLATENCY and then everything would just follow the best setup > > possible given the hardware. Would remove a lot of guesswork and a lot of > > knobs. > > In that sense C

Re: [PATCH documentation 1/2] nohz1: Add documentation.

2013-04-15 Thread Christoph Lameter
On Mon, 15 Apr 2013, Arjan van de Ven wrote: > to put the "cost" into perspective; programming a timer in one-shot mode > is some math on the cpu (to go from kernel time to hardware time), > which is a multiply and a shift (or a divide), and then actually > programming the hardware, which is at th

Re: OOPS in perf_mmap_close()

2013-05-23 Thread Christoph Lameter
On Thu, 23 May 2013, Peter Zijlstra wrote: > Right it doesn't. I think the easiest solution for now is to not copy the VMA > on fork(). Right. Pinned pages are not inherited. If a page is unpinned then that is going to happen for all address spaces that reference the page. > But I totally missed

Re: OOPS in perf_mmap_close()

2013-05-23 Thread Christoph Lameter
On Thu, 23 May 2013, Peter Zijlstra wrote: > The patch completely fails to explain how RLIMIT_LOCKED is supposed to > deal with pinned vs locked. Perf used to account its pages against > RLIMIT_LOCKED, with the patch it compares pinned against RLIMIT_LOCKED > but completely discards any possible l

Re: OOPS in perf_mmap_close()

2013-05-23 Thread Christoph Lameter
On Thu, 23 May 2013, Peter Zijlstra wrote: > I know all that, and its completely irrelevant to the discussion. What you said in the rest of the email seems to indicate that you still do not know that and I am repeating what I have said before here. > You now have double the amount of memory you

Re: [RFC][PATCH] mm: Fix RLIMIT_MEMLOCK

2013-05-24 Thread Christoph Lameter
On Fri, 24 May 2013, Peter Zijlstra wrote: > Patch bc3e53f682 ("mm: distinguish between mlocked and pinned pages") > broke RLIMIT_MEMLOCK. Nope the patch fixed a problem with double accounting. The problem that we seem to have is to define what mlocked and pinned mean and how this relates to RLI

Re: [RFC][PATCH] mm: Fix RLIMIT_MEMLOCK

2013-05-28 Thread Christoph Lameter
On Sat, 25 May 2013, KOSAKI Motohiro wrote: > If pinned and mlocked are totally difference intentionally, why IB uses > RLIMIT_MEMLOCK. Why don't IB uses IB specific limit and why only IB raise up > number of pinned pages and other gup users don't. > I can't guess IB folk's intent. True another l

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-05-28 Thread Christoph Lameter
On Tue, 28 May 2013, Steven Rostedt wrote: > Any progress on this patch? Got a new version here but have not gotten too much testing done yet.

Re: [RFC][PATCH] mm: Fix RLIMIT_MEMLOCK

2013-05-28 Thread Christoph Lameter
On Mon, 27 May 2013, Peter Zijlstra wrote: > Before your patch pinned was included in locked and thus RLIMIT_MEMLOCK > had a single resource counter. After your patch RLIMIT_MEMLOCK is > applied separately to both -- more or less. Before the patch the count was doubled since a single page was cou

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-05-28 Thread Christoph Lameter
indeterminism that is not wanted in certain contexts (like a realtime kernel). Make it configurable. Signed-off-by: Christoph Lameter Index: linux/include/linux/slub_def.h === --- linux.orig/include/linux/slub_def.h 2013-05-20 15:21

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-03-28 Thread Christoph Lameter
_node() assumed that page->inuse was undisturbed by acquire_slab(). Save the # of objects in page->lru.next in acquire_slab() and pass it to get_partial_node() that way. I have a vague memory that Joonsoo also ran into this issue awhile back. Signed-off-by: Christoph Lameter Index: linu

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-03-28 Thread Christoph Lameter
This patch requires the earlier bug fix. Subject: slub: Make cpu partial slab support configurable cpu partial support can introduce a level of indeterminism that is not wanted in certain contexts (like a realtime kernel). Make it configurable. Signed-off-by: Christoph Lameter Index: linux

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-04-01 Thread Christoph Lameter
l_node() that way. I have a vague memory that Joonsoo also ran into this issue awhile back. Signed-off-by: Christoph Lameter Index: linux/mm/slub.c === --- linux.orig/mm/slub.c2013-03-28 12:14:26.958358688 -0500 +++ linux/mm/

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-04-01 Thread Christoph Lameter
Make cpu partial slab support configurable V2 cpu partial support can introduce a level of indeterminism that is not wanted in certain contexts (like a realtime kernel). Make it configurable. Signed-off-by: Christoph Lameter Index: linux/include/linux/slub_def.h

Re: system death under oom - 3.7.9

2013-04-01 Thread Christoph Lameter
On Wed, 27 Mar 2013, Ilia Mirkin wrote: > The GPF happens at +160, which is in the argument setup for the > cmpxchg in slab_alloc_node. I think it's the call to > get_freepointer(). There was a similar bug report a while back, > https://lkml.org/lkml/2011/5/23/199, and the recommendation was to ru

Re: [GIT PULL] SLAB changes for v3.9-rc1

2013-03-06 Thread Christoph Lameter
On Tue, 5 Mar 2013, Linus Torvalds wrote: > On Mon, Mar 4, 2013 at 2:35 AM, Pekka Enberg wrote: > > > > It contains more of Christoph's SLAB unification work that reduce the > > differences between different slab allocators. The code has been sitting > > in linux-next without problems. > > > > If

Re: [RESEND PATCH 4/4 v3] mm: fix possible incorrect return value of move_pages() syscall

2012-08-01 Thread Christoph Lameter
On Wed, 1 Aug 2012, Michael Kerrisk wrote: > Is the patch below acceptable? (I've attached the complete page as well.) Yes looks good. > See you in San Diego (?), Yup. I will be there too.

Re: [PATCH] slub: use free_page instead of put_page for freeing kmalloc allocation

2012-08-02 Thread Christoph Lameter
verifies that the page is in a proper condition for freeing. Then it calls free_one_page(). __free_pages() decrements the refcount and then calls __free_pages_ok(). So we lose the checking and the dtor stuff with this patch. Guess that is ok? Acked-by: Christoph Lameter

Re: [RFC PATCH 05/23 V2] mm,migrate: use N_MEMORY instead N_HIGH_MEMORY

2012-08-02 Thread Christoph Lameter
On Thu, 2 Aug 2012, Lai Jiangshan wrote: > The code here needs to handle the nodes which have memory; we should > use N_MEMORY instead. Acked-by: Christoph Lameter

Re: [RFC PATCH 09/23 V2] vmstat: use N_MEMORY instead N_HIGH_MEMORY

2012-08-02 Thread Christoph Lameter
On Thu, 2 Aug 2012, Lai Jiangshan wrote: > The code here needs to handle the nodes which have memory; we should > use N_MEMORY instead. Acked-by: Christoph Lameter

Re: [PATCH 19/19] mm, numa: retry failed page migrations

2012-08-02 Thread Christoph Lameter
On Tue, 31 Jul 2012, Peter Zijlstra wrote: > Keep track of how many NUMA page migrations succeeded and > failed (in a way that wants retrying later) per process. It would be good if we could also somehow determine if that migration actually made sense? Were there enough accesses to the page so t

Re: [PATCH 1/2] slub: rename cpu_partial to max_cpu_object

2012-08-24 Thread Christoph Lameter
On Sat, 25 Aug 2012, Joonsoo Kim wrote: > cpu_partial of kmem_cache struct is a bit awkward. Acked-by: Christoph Lameter

Re: [PATCH 2/2] slub: correct the calculation of the number of cpu objects in get_partial_node

2012-08-24 Thread Christoph Lameter
On Sat, 25 Aug 2012, Joonsoo Kim wrote: > index d597530..c96e0e4 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -1538,6 +1538,7 @@ static void *get_partial_node(struct kmem_cache *s, > { > struct page *page, *page2; > void *object = NULL; > + int cpu_slab_objects = 0, pobjects =

Re: [PATCH 2/2] slub: correct the calculation of the number of cpu objects in get_partial_node

2012-08-24 Thread Christoph Lameter
On Sat, 25 Aug 2012, JoonSoo Kim wrote: > But, when using "cpu_partial_objects", I have a coding style problem. > > if (kmem_cache_debug(s) > || cpu_slab_objects + cpu_partial_objects > > s->max_cpu_object / 2)

Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: > +++ b/include/linux/gfp.h > @@ -35,6 +35,11 @@ struct vm_area_struct; > #else > #define ___GFP_NOTRACK 0 > #endif > +#ifdef CONFIG_MEMCG_KMEM > +#define ___GFP_KMEMCG 0x40u > +#else > +#define ___GFP_KMEMCG

Re: [PATCH v3 03/16] slab: Ignore the cflgs bit in cache creation

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: > No cache should ever pass that as a creation flag, since this bit is > used to mark an internal decision of the slab about object placement. We > can just ignore this bit if it happens to be passed (such as when > duplicating a cache in the kmem memcg pa

Re: [PATCH v3 04/16] provide a common place for initcall processing in kmem_cache

2012-09-18 Thread Christoph Lameter
common.c, while creating an empty > placeholder for the SLOB. Acked-by: Christoph Lameter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 08/16] slab: allow enable_cpu_cache to use preset values for its tunables

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: > SLAB allows us to tune a particular cache behavior with tunables. > When creating a new memcg cache copy, we'd like to preserve any tunables > the parent cache already had. Again the same is true for SLUB. Some generic way of preserving tuning parameter

Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: > index f2d760c..18de3f6 100644 > --- a/mm/slab.c > +++ b/mm/slab.c > @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc); > * Free an object which was previously allocated from this > * cache. > */ > -void kmem_cache_free(struct kmem_cache *cachep, void

Re: Taint kernel when we detect a corrupted slab.

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Dave Jones wrote: > It doesn't seem worth adding a new taint flag for this, so just re-use > the one from 'bad page' Acked-by: Christoph Lameter -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a messa

Re: [PATCH v3 13/16] slab: slab-specific propagation changes.

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: > When a parent cache does tune_cpucache, we need to propagate that to the > children as well. For that, we unfortunately need to tap into the slab core. One of the todo list items for the common stuff is to have actually a common kmem_cache structure. If

Re: [PATCH v3 15/16] memcg/sl[au]b: shrink dead caches

2012-09-18 Thread Christoph Lameter
Why doesn't slab need that too? It keeps a number of free pages on the per node lists until shrink is called.

Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag

2012-09-19 Thread Christoph Lameter
On Wed, 19 Sep 2012, Glauber Costa wrote: > On 09/18/2012 07:06 PM, Christoph Lameter wrote: > > On Tue, 18 Sep 2012, Glauber Costa wrote: > > > >> +++ b/include/linux/gfp.h > >> @@ -35,6 +35,11 @@ struct vm_area_struct; > >> #else > >>

Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-19 Thread Christoph Lameter
On Wed, 19 Sep 2012, Glauber Costa wrote: > > This is an extremely hot path of the kernel and you are adding significant > > processing. Check how the benchmarks are influenced by this change. > > virt_to_cache can be a bit expensive. > Would it be enough for you to have a separate code path for >

Re: [RFC PATCH] mm: introduce N_LRU_MEMORY to distinguish between normal and movable memory

2012-08-14 Thread Christoph Lameter
On Tue, 14 Aug 2012, Hanjun Guo wrote: > N_NORMAL_MEMORY means !LRU allocs possible. Ok. I am fine with that change. However this is a significant change that needs to be mentioned prominently in the changelog and there need to be some comments explaining the meaning of these flags clearly in the

Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Michal Hocko wrote: > > That is not what the kernel does, in general. We assume that if he wants > > that memory and we can serve it, we should. Also, not all kernel memory > > is unreclaimable. We can shrink the slabs, for instance. Ying Han > > claims she has patches for tha

Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Glauber Costa wrote: > On 08/15/2012 06:47 PM, Christoph Lameter wrote: > > On Wed, 15 Aug 2012, Michal Hocko wrote: > > > >>> That is not what the kernel does, in general. We assume that if he wants > >>> that memory and we can se

Re: [PATCH] slub: try to get cpu partial slab even if we get enough objects for cpu freelist

2012-08-15 Thread Christoph Lameter
On Thu, 16 Aug 2012, Joonsoo Kim wrote: > s->cpu_partial determine the maximum number of objects kept > in the per cpu partial lists of a processor. Currently, it is used for > not only per cpu partial list but also cpu freelist. Therefore > get_partial_node() doesn't work properly according to ou

Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Greg Thelen wrote: > > You can already shrink the reclaimable slabs (dentries / inodes) via > > calls to the subsystem specific shrinkers. Did Ying Han do anything to > > go beyond that? > > cc: Ying > > The Google shrinker patches enhance prune_dcache_sb() to limit dentry > p

Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Glauber Costa wrote: > Remember we copy over the metadata and create copies of the caches > per-memcg. Therefore, a dentry belongs to a memcg if it was allocated > from the slab pertaining to that memcg. The dentry could be used by other processes in the system though. F.e. d

Re: [PATCH] slub: try to get cpu partial slab even if we get enough objects for cpu freelist

2012-08-15 Thread Christoph Lameter
On Thu, 16 Aug 2012, JoonSoo Kim wrote: > > Maybe I do not understand you correctly. Could you explain this in some > > more detail? > > I assume that cpu slab and cpu partial slab are not same thing. > > In my definition, > cpu slab is in c->page, > cpu partial slab is in c->partial Correct. >

Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Ying Han wrote: > > How can you figure out which objects belong to which memcg? The ownership > > of dentries and inodes is a dubious concept already. > > I figured it out based on the kernel slab accounting. > obj->page->kmem_cache->memcg Well that is only the memcg which a

Re: [PATCH] slub: try to get cpu partial slab even if we get enough objects for cpu freelist

2012-08-16 Thread Christoph Lameter
On Thu, 16 Aug 2012, JoonSoo Kim wrote: > But, if you prefer that s->cpu_partial is for both cpu slab and cpu > partial slab, > get_partial_node() needs an another minor fix. > We should add number of objects in cpu slab when we refill cpu partial slab. > Following is my suggestion. > > @@ -1546,7

Re: SLUB: Support for statistics to help analyze allocator behavior

2008-02-05 Thread Christoph Lameter
On Tue, 5 Feb 2008, Pekka Enberg wrote: > > We could do that. Any idea how to display that kind of information in a > > meaningful way. Parameter conventions for slabinfo? > > We could just print out one total summary and one summary for each CPU (and > maybe show % of total allocations/frees.

Re: [PATCH] mmu notifiers #v5

2008-02-05 Thread Christoph Lameter
On Tue, 5 Feb 2008, Andrea Arcangeli wrote: > On Tue, Feb 05, 2008 at 10:17:41AM -0800, Christoph Lameter wrote: > > The other approach will not have any remote ptes at that point. Why would > > there be a coherency issue? > > It never happens that two threads writes to t

Re: [2.6.24 regression][BUGFIX] numactl --interleave=all doesn't works on memoryless node.

2008-02-05 Thread Christoph Lameter
On Tue, 5 Feb 2008, Lee Schermerhorn wrote: > mbind(2), on the other hand, just masks off any nodes in the > nodemask that are not included in the caller's mems_allowed. Ok so we temporarily adopt these semantics for set_mempolicy. > 1) modify contextualize_policy to just remove the non-allowed

Re: [stable] OOM-killer invoked but why ?

2008-02-05 Thread Christoph Lameter
On Tue, 5 Feb 2008, Greg KH wrote: > > > commit 96990a4ae979df9e235d01097d6175759331e88c > > > Author: Christoph Lameter <[EMAIL PROTECTED]> > > > Date: Mon Jan 14 00:55:14 2008 -0800 > > > > > > quicklists: On

Re: [PATCH] mmu notifiers #v5

2008-02-05 Thread Christoph Lameter
On Tue, 5 Feb 2008, Andrea Arcangeli wrote: > > You can avoid the page-pin and the pt lock completely by zapping the > > mappings at _start and then holding off new references until _end. > > "holding off new references until _end" = per-range mutex less scalable > and more expensive than the PT l

Re: [PATCH] mmu notifiers #v5

2008-02-05 Thread Christoph Lameter
On Wed, 6 Feb 2008, Andrea Arcangeli wrote: > > You can of course setup a 2M granularity lock to get the same granularity > > as the pte lock. That would even work for the cases where you have to page > > pin now. > > If you set a 2M granularity lock, the _start callback would need to > do: >

Re: {2.6.22.y} quicklists must keep even off node pages on the quicklists until the TLB flush has been completed.

2008-02-06 Thread Christoph Lameter
On Wed, 6 Feb 2008, Dhaval Giani wrote: > Is this one also supposed to be backported? Yes.

Re: {2.6.22.y} quicklists must keep even off node pages on the quicklists until the TLB flush has been completed.

2008-02-06 Thread Christoph Lameter
On Wed, 6 Feb 2008, Oliver Pinter wrote: > I use this, without errors ... but the machine is i386 desktop The fix for off node frees only applies to NUMA systems. !NUMA has no off node pages.

Re: SLUB: Support for statistics to help analyze allocator behavior

2008-02-06 Thread Christoph Lameter
On Wed, 6 Feb 2008, Andrew Morton wrote: > > @@ -1357,17 +1366,22 @@ static struct page *get_partial(struct k > > static void unfreeze_slab(struct kmem_cache *s, struct page *page, int > > tail) > > { > > struct kmem_cache_node *n = get_node(s, page_to_nid(page)); > > + struct kmem_cache_

Re: SLUB: statistics improvements

2008-02-06 Thread Christoph Lameter
SLUB: statistics improvements - Fix indentation in unfreeze_slab - FREE_SLAB/ALLOC_SLAB counters were slightly misplaced and counted even if the slab was kept because we were below the minimum of partial slabs. - Export per cpu statistics to user space (follow numa convention but change th

Re: SLUB: statistics improvements

2008-02-06 Thread Christoph Lameter
On Wed, 6 Feb 2008, Eric Dumazet wrote: > > + for_each_online_cpu(cpu) { > > + int x = get_cpu_slab(s, cpu)->stat[si]; > > unsigned int x = ... Ahh. Thanks.

Re: [patch 22/27] quicklist: Set tlb->need_flush if pages are remaining in quicklist 0

2008-02-06 Thread Christoph Lameter
Correct

Re: [PATCH 1/2] kmemcheck v3

2008-02-07 Thread Christoph Lameter
On Thu, 7 Feb 2008, Vegard Nossum wrote: > > > */ > > > +#define SLAB_NOTRACK 0x0040UL/* Don't track use of > > > uninitialized memory */ > > > > Ok new exception for tracking. > > New exception? Please explain. SLABs can be excepted from tracking? > > H... You seem to assum

Re: [PATCH 1/2] kmemcheck v3

2008-02-07 Thread Christoph Lameter
On Thu, 7 Feb 2008, Vegard Nossum wrote: > --- a/include/linux/slab.h > +++ b/include/linux/slab.h > @@ -28,6 +28,7 @@ > #define SLAB_DESTROY_BY_RCU 0x0008UL/* Defer freeing slabs to RCU > */ > #define SLAB_MEM_SPREAD 0x0010UL/* Spread some memory > over cpuset */ >

Re: [PATCH 1/2] kmemcheck v3

2008-02-07 Thread Christoph Lameter
On Fri, 8 Feb 2008, Vegard Nossum wrote: > The tracking that kmemcheck does is actually a byte-for-byte tracking > of whether memory has been initialized or not. Think of it as valgrind > for the kernel. We do this by "hiding" pages (marking them non-present > for the MMU) and taking the page faul

Re: parisc compile error

2008-02-07 Thread Christoph Lameter
On Thu, 7 Feb 2008, Kyle McMartin wrote: > yes, it's in my batch of fixes. So I do not have to worry about it?

[git pull] more SLUB updates for 2.6.25

2008-02-07 Thread Christoph Lameter
other patches had been removed. cmpxchg_local fastpath was stripped of support for CONFIG_PREEMPT since that uglified the code and did not seem to work right. We will be able to handle preempt much better in the future with some upcoming patches) Christoph Lameter (4): SLUB: Deal with annoying

Re: [PATCH 1/2] kmemcheck v3

2008-02-07 Thread Christoph Lameter
On Thu, 7 Feb 2008, Vegard Nossum wrote: > - DMA can be a problem since there's generally no way for kmemcheck to > determine when/if a chunk of memory is used for DMA. Ideally, DMA should be > allocated with untracked caches, but this requires annotation of the > drivers in question. There

Re: [git pull] more SLUB updates for 2.6.25

2008-02-08 Thread Christoph Lameter
On Fri, 8 Feb 2008, Eric Dumazet wrote: > And SLAB/SLUB allocators, even if only used from process context, want to > disable/re-enable interrupts... Not any more. The new fastpath does allow avoiding interrupt enable/disable and we will be hopefully able to increase the scope of that over

Re: [PATCH] mm/slub.c - Use print_hex_dump

2008-02-08 Thread Christoph Lameter
On Fri, 8 Feb 2008, Joe Perches wrote: > On Fri, 2008-02-08 at 10:07 -0800, Christoph Lameter wrote: > > On Fri, 8 Feb 2008, Joe Perches wrote: > > > Use the library function to dump memory > > Could you please compare the formatting of the output before and > > aft

Re: [PATCH] mm/slub.c - Use print_hex_dump

2008-02-08 Thread Christoph Lameter
On Fri, 8 Feb 2008, Joe Perches wrote: > Use the library function to dump memory Could you please compare the formatting of the output before and after? Last time we tried this we had issues because it became a bit ugly.

Re: [PATCH] mm/slub.c - Use print_hex_dump

2008-02-08 Thread Christoph Lameter
Argh. You need to cut that into small pieces so that it is easier to review. CCing Randy who was involved with prior art in this area.

[patch 1/6] mmu_notifier: Core code

2008-02-08 Thread Christoph Lameter
fix this issue by requiring such devices to subscribe to a notification chain that will allow them to work without pinning. This patch: Core portion Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]> --- Documentation/mmu_not

[patch 5/6] mmu_notifier: Support for drivers with reverse maps (f.e. for XPmem)

2008-02-08 Thread Christoph Lameter
reverse maps callbacks does not need to provide the invalidate_page() method that is called when locks are held. Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> --- include/linux/mmu_notifier.h | 65 +++ include/linux/page-flags.h | 11 +

[patch 0/6] MMU Notifiers V6

2008-02-08 Thread Christoph Lameter
This is a patchset implementing MMU notifier callbacks based on Andrea's earlier work. These are needed if Linux pages are referenced from something else than tracked by the rmaps of the kernel (an external MMU). MMU notifiers allow us to get rid of the page pinning for RDMA and various other purpo

[patch 3/6] mmu_notifier: invalidate_page callbacks

2008-02-08 Thread Christoph Lameter
callback may be omitted. PageLock and pte lock are held when either of the functions is called. Signed-off-by: Andrea Arcangeli <[EMAIL PROTECTED]> Signed-off-by: Robin Holt <[EMAIL PROTECTED]> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]> --- mm/rmap.c | 13 +++
