[PATCH 06/16] slab: put forward freeing slab management object

2013-08-22 Thread Joonsoo Kim
We don't need to free the slab management object in an RCU context, because from now on we don't manage this slab anymore. So put the freeing forward. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index b378f91..607a9b8 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1820,8 +1820,6

[PATCH 00/16] slab: overload struct slab over struct page to reduce memory usage

2013-08-22 Thread Joonsoo Kim
in the same cache line. It is not good for performance. We can do the same thing in an easier way, like a stack. This patchset implements it and removes the complex code for the above algorithm. This makes the slab code much cleaner. This patchset is based on v3.11-rc6, but tested on v3.10. Thanks

[PATCH 03/16] slab: remove colouroff in struct slab

2013-08-22 Thread Joonsoo Kim
Now there is no user of colouroff, so remove it. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 180f532..d9f81a0 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -219,7 +219,6 @@ struct slab { union { struct { struct list_head list

[PATCH 01/16] slab: correct pfmemalloc check

2013-08-22 Thread Joonsoo Kim
must return the 'struct page' of that object, not that of the first page, since SLAB doesn't use __GFP_COMP when CONFIG_MMU. To get the 'struct page' of the first page, we first get the slab and then obtain it via virt_to_head_page(slab->s_mem). Cc: Mel Gorman Signed-off-by: Joonsoo Kim

[PATCH 14/16] slab: use struct page for slab management

2013-08-22 Thread Joonsoo Kim
mechanical ones and there is no functional change. Signed-off-by: Joonsoo Kim diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index ace9a5f..66ee577 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -42,18 +42,22 @@ struct page { /* First double
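
The overlay idea behind the series, as a minimal C sketch (an illustration of the technique with field names taken from the patch descriptions, not the kernel's actual struct page): slab management data is laid over page fields that a slab page never uses, so the separate slab descriptor disappears.

    /* Illustrative only: unions let one struct serve both users. */
    struct fake_page {
        unsigned long flags;
        union {
            void *mapping;       /* pagecache: owning address_space */
            void *s_mem;         /* slab: address of the first object */
        };
        union {
            unsigned long index; /* pagecache: offset within mapping */
            void *freelist;      /* slab: array of free object indexes */
        };
        union {
            unsigned long counters;
            unsigned int active; /* slab: objects currently in use */
        };
    };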

[PATCH 02/16] slab: change return type of kmem_getpages() to struct page

2013-08-22 Thread Joonsoo Kim
b1f9 mm/slab.o * After: text 22074, data 23434, bss 4, dec 45512, hex b1c8, mm/slab.o. And this helps a following patch to remove struct slab's colouroff. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index d9eae39..180f532 100644 --- a/mm/s

[PATCH 16/16] slab: rename slab_bufctl to slab_freelist

2013-08-22 Thread Joonsoo Kim
Now, bufctl is not a proper name for this array, so rename it. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 6abc069..e8ec4c5 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -2538,7 +2538,7 @@ static struct freelist *alloc_slabmgmt(struct kmem_cache *cachep, return

[PATCH 15/16] slab: remove useless statement for checking pfmemalloc

2013-08-22 Thread Joonsoo Kim
Now, virt_to_page(page->s_mem) is the same as the page itself, because the slab uses this structure for management. So remove the useless statement. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index cf39309..6abc069 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -750,9 +750,7 @@ static str

[PATCH 12/16] slab: remove SLAB_LIMIT

2013-08-22 Thread Joonsoo Kim
It's useless now, so remove it. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 7216ebe..98257e4 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -163,8 +163,6 @@ */ static bool pfmemalloc_active __read_mostly; -#define SLAB_LIMIT (((unsigned int)(~0

[PATCH 04/16] slab: remove nodeid in struct slab

2013-08-22 Thread Joonsoo Kim
We can get nodeid using address translation, so this field is not useful. Therefore, remove it. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index d9f81a0..69dc25a 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -222,7 +222,6 @@ struct slab { void *s_mem

[PATCH 13/16] slab: replace free and inuse in struct slab with newly introduced active

2013-08-22 Thread Joonsoo Kim
Now, free in struct slab has the same meaning as inuse. So remove both and replace them with active. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 98257e4..9dcbb22 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -174,8 +174,7 @@ struct slab { struct { struct

[PATCH 10/16] slab: change the management method of free objects of the slab

2013-08-22 Thread Joonsoo Kim
is method: struct slab's free = 0; kmem_bufctl_t array: 6 3 7 2 5 4 0 1. To get free objects, we access this array with the following pattern: 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7. This may help cache-line footprint if the slab has many objects, and, in addition, this makes the code much
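
A minimal userspace model of the scheme (assuming an 8-object slab; illustrative, not the kernel code): the index array is consumed like a stack, from position active upward, so the array is touched sequentially.

    #define NR_OBJS 8

    static unsigned char freelist[NR_OBJS]; /* free object indexes */
    static unsigned int active;             /* objects currently in use */

    static void slab_init(void)
    {
        for (unsigned int i = 0; i < NR_OBJS; i++)
            freelist[i] = i;                /* read back 0 -> 1 -> ... -> 7 */
        active = 0;
    }

    static int obj_alloc(void)
    {
        if (active == NR_OBJS)
            return -1;                      /* slab is full */
        return freelist[active++];          /* pop the next free index */
    }

    static void obj_free(int idx)
    {
        freelist[--active] = (unsigned char)idx; /* push it back */
    }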

[PATCH 11/16] slab: remove kmem_bufctl_t

2013-08-22 Thread Joonsoo Kim
Now that we have changed the management method for the slab's free objects, there is no need for the special values BUFCTL_END, BUFCTL_FREE and BUFCTL_ACTIVE. So remove them. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 4551d57..7216ebe 100644 --- a/mm/slab.c +++ b/mm/slab.c

[PATCH 05/16] slab: remove cachep in struct slab_rcu

2013-08-22 Thread Joonsoo Kim
We can get cachep from the page in struct slab_rcu, so remove it. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 69dc25a..b378f91 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -204,7 +204,6 @@ typedef unsigned int kmem_bufctl_t; */ struct slab_rcu { struct rcu_head head

[PATCH 08/16] slab: use well-defined macro, virt_to_slab()

2013-08-22 Thread Joonsoo Kim
This is a trivial change; just use the well-defined macro. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 9e98ee0..ee03eba 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -2853,7 +2853,6 @@ static inline void verify_redzone_free(struct kmem_cache *cache, void *obj) static void

[PATCH 07/16] slab: overloading the RCU head over the LRU for RCU free

2013-08-22 Thread Joonsoo Kim
Signed-off-by: Joonsoo Kim diff --git a/include/linux/slab.h b/include/linux/slab.h index 0c62175..b8d19b1 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -51,7 +51,14 @@ * } * rcu_read_unlock(); * - * See also the comment on struct slab_rcu in mm/slab.c. + * This is useful if we need to

[PATCH 09/16] slab: use __GFP_COMP flag for allocating slab pages

2013-08-22 Thread Joonsoo Kim
If we use the 'struct page' of the first page as the 'struct slab', there is no advantage to not using __GFP_COMP. So use the __GFP_COMP flag in all cases. Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index ee03eba..855f481 100644 --- a/mm/slab.c +++ b/mm/slab.c
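
In kernel-style C the change amounts to something like the fragment below (a paraphrased sketch of the allocation path, not the literal patch). With __GFP_COMP the pages form a compound page, so virt_to_head_page() on any object address reaches the head page carrying the slab metadata.

    /* illustrative fragment from the slab page allocation path */
    flags |= __GFP_COMP;
    page = alloc_pages_node(nodeid, flags, cachep->gfporder);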

Re: [PATCH v2 12/20] mm, hugetlb: remove vma_has_reserves()

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 02:14:38PM +0530, Aneesh Kumar K.V wrote: > Joonsoo Kim writes: > > > vma_has_reserves() can be substituted by using return value of > > vma_needs_reservation(). If chg returned by vma_needs_reservation() > > is 0, it means that vma has reserves.

Re: [PATCH 02/10] sched: Factor out code to should_we_balance()

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 12:42:57PM +0200, Peter Zijlstra wrote: > > > > > > +redo: > > > > One behavioral change worth noting here is that in the redo case if a > > CPU has become idle we'll continue trying to load-balance in the > > !new-idle case. > > > > This could be unpleasant in the case wh

Re: [PATCH v2 12/20] mm, hugetlb: remove vma_has_reserves()

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 04:34:22PM +0530, Aneesh Kumar K.V wrote: > Joonsoo Kim writes: > > > On Thu, Aug 22, 2013 at 02:14:38PM +0530, Aneesh Kumar K.V wrote: > >> Joonsoo Kim writes: > >> > >> > vma_has_reserves() can be substituted by using return v

Re: [PATCH 00/16] slab: overload struct slab over struct page to reduce memory usage

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 04:47:25PM +, Christoph Lameter wrote: > On Thu, 22 Aug 2013, Joonsoo Kim wrote: > > > And this patchset change a management method of free objects of a slab. > > Current free objects management method of the slab is weird, because > > it touc

Re: [PATCH 02/16] slab: change return type of kmem_getpages() to struct page

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 05:49:43PM +, Christoph Lameter wrote: > On Thu, 22 Aug 2013, Joonsoo Kim wrote: > > > @@ -2042,7 +2042,7 @@ static void slab_destroy_debugcheck(struct kmem_cache > > *cachep, struct slab *slab > > */ > > static void slab_destroy(str

Re: [PATCH 04/16] slab: remove nodeid in struct slab

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 05:51:58PM +, Christoph Lameter wrote: > On Thu, 22 Aug 2013, Joonsoo Kim wrote: > > > @@ -1099,8 +1098,7 @@ static void drain_alien_cache(struct kmem_cache > > *cachep, > > > > static inline int cache_free_alien(struct

Re: [PATCH 05/16] slab: remove cachep in struct slab_rcu

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 05:53:00PM +, Christoph Lameter wrote: > On Thu, 22 Aug 2013, Joonsoo Kim wrote: > > > We can get cachep using page in struct slab_rcu, so remove it. > > Ok but this means that we need to touch struct page. Additional cacheline > in cache foo

Re: [PATCH 09/16] slab: use __GFP_COMP flag for allocating slab pages

2013-08-22 Thread Joonsoo Kim
On Thu, Aug 22, 2013 at 06:00:56PM +, Christoph Lameter wrote: > On Thu, 22 Aug 2013, Joonsoo Kim wrote: > > > If we use 'struct page' of first page as 'struct slab', there is no > > advantage not to use __GFP_COMP. So use __GFP_COMP flag for all the case

Re: [PATCH 05/16] slab: remove cachep in struct slab_rcu

2013-08-23 Thread JoonSoo Kim
2013/8/23 Christoph Lameter : > On Fri, 23 Aug 2013, Joonsoo Kim wrote: > >> On Thu, Aug 22, 2013 at 05:53:00PM +, Christoph Lameter wrote: >> > On Thu, 22 Aug 2013, Joonsoo Kim wrote: >> > >> > > We can get cachep using page in struct slab_rcu, so r

Re: [PATCH 05/16] slab: remove cachep in struct slab_rcu

2013-08-23 Thread JoonSoo Kim
2013/8/24 Christoph Lameter : > On Fri, 23 Aug 2013, JoonSoo Kim wrote: > >> I don't get it. This patch only affect to the rcu case, because it >> change the code >> which is in kmem_rcu_free(). It doesn't touch anything in standard case. > > In general th

[PATCH v3 0/3] optimization, clean-up about fair.c

2013-08-06 Thread Joonsoo Kim
patches, because I'm not sure they are right. Joonsoo Kim (3): sched: remove one division operation in find_busiest_queue() sched: factor out code to should_we_balance() sched: clean-up struct sd_lb_stat kernel/sched/fair.c | 326 +-- 1 file ch

[PATCH v3 3/3] sched: clean-up struct sd_lb_stat

2013-08-06 Thread Joonsoo Kim
Signed-off-by: Joonsoo Kim diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c6732d2..f8a9660 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4232,36 +4232,6 @@ static unsigned long task_h_load(struct task_struct *p) /** Helpers for find_busiest_

[PATCH v2 mmotm 3/3] swap: clean-up #ifdef in page_mapping()

2013-08-06 Thread Joonsoo Kim
PageSwapCache() is always false when !CONFIG_SWAP, so the compiler properly discards the related code. Therefore, we don't need the #ifdef explicitly. Acked-by: Johannes Weiner Signed-off-by: Joonsoo Kim diff --git a/include/linux/swap.h b/include/linux/swap.h index 24db914..c03c139 100644 --- a/in
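
The #ifdef removal leans on ordinary constant folding; a self-contained C sketch of the pattern (hypothetical predicate name, not the kernel's):

    #include <stdbool.h>

    #ifdef CONFIG_SWAP
    bool page_is_swapcache(const void *page);   /* the real test */
    #else
    static inline bool page_is_swapcache(const void *page)
    {
        (void)page;
        return false;                           /* constant when !CONFIG_SWAP */
    }
    #endif

    const char *mapping_kind(const void *page)
    {
        if (page_is_swapcache(page))  /* branch folded away when !CONFIG_SWAP */
            return "swapcache";
        return "file";
    }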

[PATCH v2 mmotm 1/3] mm, page_alloc: add unlikely macro to help compiler optimization

2013-08-06 Thread Joonsoo Kim
does right, and nobody re-evaluates whether gcc does the proper optimization after their changes; for example, it is not optimized properly on v3.10. So adding a compiler hint here is reasonable. Acked-by: Johannes Weiner Signed-off-by: Joonsoo Kim diff --git a/mm/page_alloc.c b/mm/page_alloc.c index f5c549c..04
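
For reference, a self-contained sketch of the hint (using gcc's builtin directly; the kernel wraps it as unlikely()):

    /* tell the compiler the condition is expected false, so the error
     * path is laid out away from the hot straight-line code */
    #define unlikely(x) __builtin_expect(!!(x), 0)

    int checked_inc(const int *p)
    {
        if (unlikely(p == 0))
            return -1;      /* cold error path */
        return *p + 1;      /* hot path */
    }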

[PATCH v3 2/3] sched: factor out code to should_we_balance()

2013-08-06 Thread Joonsoo Kim
354958aa7 kernel/sched/fair.o In addition, rename @balance to @should_balance in order to represent its purpose more clearly. Signed-off-by: Joonsoo Kim diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 52898dc..c6732d2 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c

[PATCH v2 mmotm 2/3] mm: move pgtable related functions to right place

2013-08-06 Thread Joonsoo Kim
pgtable-related functions are mostly in pgtable-generic.c, so move the remaining ones from memory.c to pgtable-generic.c. Signed-off-by: Joonsoo Kim diff --git a/mm/memory.c b/mm/memory.c index f2ab2a8..8fd4d42 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -374,30 +374,6 @@ void

[PATCH v3 1/3] sched: remove one division operation in find_busiest_queue()

2013-08-06 Thread Joonsoo Kim
Remove one division operation in find_busiest_queue(). Signed-off-by: Joonsoo Kim diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 9565645..52898dc 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4968,7 +4968,7 @@ static struct rq *find_busiest_queue(struct lb_env *env
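
The standard way to drop such a division, and the spirit of this change, is cross-multiplication: for positive powers, wl_i / power_i > wl_j / power_j exactly when wl_i * power_j > wl_j * power_i. A sketch with illustrative names (not the scheduler's code):

    /* pick the heavier of two runqueues without dividing */
    static int heavier(unsigned long wl_i, unsigned long power_i,
                       unsigned long wl_j, unsigned long power_j)
    {
        return wl_i * power_j > wl_j * power_i;
    }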

[PATCH] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-06 Thread Joonsoo Kim
If we use a division operation, we can compute a batch count closer to the ideal value. With this value, we can finish our job within MIGRATE_PCPTYPES iterations. In addition, batching to free more pages may help cache usage. Signed-off-by: Joonsoo Kim diff --git a/mm/page_alloc.c b/mm
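
A sketch of the arithmetic only (illustrative names, not the kernel's free_pcppages_bulk()): a round-up division spreads the requested count across the lists so one sweep over the MIGRATE_PCPTYPES lists suffices.

    #define MIGRATE_PCPTYPES 3

    /* pages to take from the current list so that to_free drains
     * within a single pass over the lists that remain */
    static unsigned int batch_for(unsigned int to_free, unsigned int lists_left)
    {
        return (to_free + lists_left - 1) / lists_left; /* round up */
    }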

[PATCH 4/4] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-06 Thread Joonsoo Kim
If we use a division operation, we can compute a batch count closer to the ideal value. With this value, we can finish our job within MIGRATE_PCPTYPES iterations. In addition, batching to free more pages may help cache usage. Signed-off-by: Joonsoo Kim diff --git a/mm/page_alloc.c b/mm

[PATCH 3/4] mm, rmap: minimize lock hold when unlink_anon_vmas

2013-08-06 Thread Joonsoo Kim
Currently, we free the avc objects while holding a lock. To minimize lock hold time, we move the avc objects to another list while holding the lock, then iterate over them and free the objects without holding the lock. This minimizes lock hold time. Signed-off-by: Joonsoo Kim diff --git a/mm
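
The technique in generic, self-contained C (a pthread sketch, not the rmap code itself): detach the whole list under the lock, free after dropping it, so the lock is held only for a pointer swap.

    #include <pthread.h>
    #include <stdlib.h>

    struct node { struct node *next; };

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static struct node *shared_list;

    void drain_and_free(void)
    {
        pthread_mutex_lock(&lock);
        struct node *local = shared_list;  /* move to a private list */
        shared_list = NULL;
        pthread_mutex_unlock(&lock);

        while (local) {                    /* free without the lock */
            struct node *next = local->next;
            free(local);
            local = next;
        }
    }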

[PATCH 1/4] mm, rmap: do easy-job first in anon_vma_fork

2013-08-06 Thread Joonsoo Kim
If we fail due to some erroneous situation, it is better to quit before doing the heavy work. So change the order of execution. Signed-off-by: Joonsoo Kim diff --git a/mm/rmap.c b/mm/rmap.c index a149e3a..c2f51cb 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -278,19 +278,19 @@ int anon_vma_fork(struct

[PATCH 2/4] mm, rmap: allocate anon_vma_chain before starting to link anon_vma_chain

2013-08-06 Thread Joonsoo Kim
If we allocate anon_vma_chain before starting to link, we can reduce the lock hold time. This patch implements it. Signed-off-by: Joonsoo Kim diff --git a/mm/rmap.c b/mm/rmap.c index c2f51cb..1603f64 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -240,18 +240,21 @@ int anon_vma_clone(struct
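
The same family of trick, sketched generically (pthreads standing in for the kernel locks; illustrative names): do the fallible allocation before taking the lock so the critical section only links.

    #include <pthread.h>
    #include <stdlib.h>

    struct chain { struct chain *next; };

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static struct chain *linked;

    int link_one(void)
    {
        struct chain *c = malloc(sizeof(*c)); /* allocate outside the lock */
        if (!c)
            return -1;

        pthread_mutex_lock(&lock);
        c->next = linked;                     /* short critical section */
        linked = c;
        pthread_mutex_unlock(&lock);
        return 0;
    }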

Re: [PATCH 4/4] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-06 Thread Joonsoo Kim
On Tue, Aug 06, 2013 at 05:43:40PM +0900, Joonsoo Kim wrote: > If we use a division operation, we can compute a batch count more closed > to ideal value. With this value, we can finish our job within > MIGRATE_PCPTYPES iteration. In addition, batching to free more pages > may be help

Re: [PATCH 1/4] mm, rmap: do easy-job first in anon_vma_fork

2013-08-07 Thread Joonsoo Kim
Hello, Johannes. On Tue, Aug 06, 2013 at 08:58:54AM -0400, Johannes Weiner wrote: > > if (anon_vma_clone(vma, pvma)) > > - return -ENOMEM; > > - > > - /* Then add our own anon_vma. */ > > - anon_vma = anon_vma_alloc(); > > - if (!anon_vma) > > - goto out_error; > > -

Re: [PATCH 2/4] mm, rmap: allocate anon_vma_chain before starting to link anon_vma_chain

2013-08-07 Thread Joonsoo Kim
On Wed, Aug 07, 2013 at 02:08:03AM -0400, Johannes Weiner wrote: > > > > list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) { > > struct anon_vma *anon_vma; > > > > - avc = anon_vma_chain_alloc(GFP_NOWAIT | __GFP_NOWARN); > > - if (unlikely(!av

Re: [PATCH 3/4] mm, rmap: minimize lock hold when unlink_anon_vmas

2013-08-07 Thread Joonsoo Kim
On Wed, Aug 07, 2013 at 02:11:38AM -0400, Johannes Weiner wrote: > On Tue, Aug 06, 2013 at 05:43:39PM +0900, Joonsoo Kim wrote: > > Currently, we free the avc objects with holding a lock. To minimize > > lock hold time, we just move the avc objects to another list > > with

Re: [PATCH 17/18] mm, hugetlb: retry if we fail to allocate a hugepage with use_reserve

2013-08-07 Thread Joonsoo Kim
On Tue, Aug 06, 2013 at 06:38:49PM -0700, Davidlohr Bueso wrote: > On Wed, 2013-08-07 at 11:03 +1000, David Gibson wrote: > > On Tue, Aug 06, 2013 at 05:18:44PM -0700, Davidlohr Bueso wrote: > > > On Mon, 2013-08-05 at 16:36 +0900, Joonsoo Kim wrote: > > > > >

Re: [PATCH 0/2] hugepage: optimize page fault path locking

2013-08-07 Thread Joonsoo Kim
On Tue, Aug 06, 2013 at 05:08:04PM -0700, Davidlohr Bueso wrote: > On Mon, 2013-07-29 at 15:18 +0900, Joonsoo Kim wrote: > > On Fri, Jul 26, 2013 at 07:27:23AM -0700, Davidlohr Bueso wrote: > > > This patchset attempts to reduce the amount of contention we i

Re: [PATCH] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-07 Thread JoonSoo Kim
Hello, Andrew. 2013/8/7 Andrew Morton : > On Tue, 6 Aug 2013 17:40:40 +0900 Joonsoo Kim wrote: > >> If we use a division operation, we can compute a batch count more closed >> to ideal value. With this value, we can finish our job within >> MIGRATE_PCPTYPES iteration.

Re: [BUG] hackbench locks up with perf in 3.11-rc1 and beyond

2013-08-07 Thread Joonsoo Kim
then crashed, I changed the test to be > just: > > perf stat -r 10 ./hackbench 50 > > And kicked off ktest.pl to do the bisect. It came up with this commit as > the culprit: > > commit 318df36e57c0ca9f2146660d41ff28e8650af423 > Author: Joonsoo Kim > Date: Wed Jun 19

Re: [PATCH] hugepage: allow parallelization of the hugepage fault path

2013-07-21 Thread Joonsoo Kim
On Fri, Jul 19, 2013 at 02:24:15PM -0700, Davidlohr Bueso wrote: > On Fri, 2013-07-19 at 17:14 +1000, David Gibson wrote: > > On Thu, Jul 18, 2013 at 05:42:35PM +0900, Joonsoo Kim wrote: > > > On Wed, Jul 17, 2013 at 12:50:25PM -0700, Davidlohr Bueso wrote: > >

[PATCH v2 01/10] mm, hugetlb: move up the code which checks availability of free huge page

2013-07-22 Thread Joonsoo Kim
At this point we are holding the hugetlb_lock, so hstate values can't be changed. If we don't have any usable free huge page at this point, we don't need to proceed with the processing. So move this code up. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e

[PATCH v2 09/10] mm, hugetlb: remove decrement_hugepage_resv_vma()

2013-07-22 Thread Joonsoo Kim
patch implements it. Reviewed-by: Wanpeng Li Reviewed-by: Aneesh Kumar K.V Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 87e73bd..2ea6afd 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -434,25 +434,6 @@ static int is_vma_resv_set(struct vm_area_struct *vma, unsigned

[PATCH v2 08/10] mm, hugetlb: add VM_NORESERVE check in vma_has_reserves()

2013-07-22 Thread Joonsoo Kim
space. Reviewed-by: Wanpeng Li Reviewed-by: Aneesh Kumar K.V Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 8a61638..87e73bd 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -464,6 +464,8 @@ void reset_vma_resv_huge_pages(struct vm_area_struct *vma) /* Returns true i

[PATCH v2 10/10] mm, hugetlb: decrement reserve count if VM_NORESERVE alloc page cache

2013-07-22 Thread Joonsoo Kim
if (q == MAP_FAILED) { fprintf(stderr, "mmap() failed: %s\n", strerror(errno)); } q[0] = 'c'; This patch solve this problem. Reviewed-by: Wanpeng Li Reviewed-by: Aneesh Kumar K.V Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.

[PATCH v2 06/10] mm, hugetlb: remove redundant list_empty check in gather_surplus_pages()

2013-07-22 Thread Joonsoo Kim
If the list is empty, list_for_each_entry_safe() doesn't do anything, so this check is redundant. Remove it. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 3ac0a6f..7ca8733 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1017,11 +1

[PATCH v2 07/10] mm, hugetlb: do not use a page in page cache for cow optimization

2013-07-22 Thread Joonsoo Kim
on to a page in the page cache, the problem disappears. Reviewed-by: Wanpeng Li Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 7ca8733..8a61638 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2508,7 +2508,6 @@ static int hugetlb_cow(struct mm_struct *mm, struct vm

[PATCH v2 05/10] mm, hugetlb: fix and clean-up node iteration code to alloc or free

2013-07-22 Thread Joonsoo Kim
e_mask_to_[alloc|free]" and fix and clean up the node iteration code for alloc and free. This makes the code more understandable. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 83edd17..3ac0a6f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -752,33 +752,

[PATCH v2 04/10] mm, hugetlb: clean-up alloc_huge_page()

2013-07-22 Thread Joonsoo Kim
We can unify some code for the successful-allocation path. This makes the code more readable. There is no functional difference. Reviewed-by: Aneesh Kumar K.V Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d21a33a..83edd17 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1146,12

[PATCH v2 03/10] mm, hugetlb: trivial commenting fix

2013-07-22 Thread Joonsoo Kim
The name of the mutex written in the comment is wrong. Fix it. Reviewed-by: Aneesh Kumar K.V Acked-by: Hillf Danton Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d87f70b..d21a33a 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -135,9 +135,9 @@ static inline struct

[PATCH v2 02/10] mm, hugetlb: remove err label in dequeue_huge_page_vma()

2013-07-22 Thread Joonsoo Kim
This label is not needed now, because there is no error handling except returning NULL. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index fc4988c..d87f70b 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -546,11 +546,11 @@ static struct page *dequeue_huge_page_vma(struct

[PATCH v2 00/10] mm, hugetlb: clean-up and possible bug fix

2013-07-22 Thread Joonsoo Kim
. Changes from v1: Split patch 1 into two patches to clarify its purpose. Remove useless indentation changes in 'clean-up alloc_huge_page()'. Fix new iteration code bug. Add reviewed-by or acked-by. Joonsoo Kim (10): mm, hugetlb: move up the code which checks availability of free hug

Re: [PATCH v2 02/10] mm, hugetlb: remove err label in dequeue_huge_page_vma()

2013-07-23 Thread Joonsoo Kim
On Mon, Jul 22, 2013 at 06:11:11PM +0200, Michal Hocko wrote: > On Mon 22-07-13 17:36:23, Joonsoo Kim wrote: > > This label is not needed now, because there is no error handling > > except returing NULL. > > > > Signed-off-by: Joonsoo Kim > > > > d

Re: [PATCH 3/9] mm, hugetlb: clean-up alloc_huge_page()

2013-07-23 Thread Joonsoo Kim
On Mon, Jul 22, 2013 at 04:51:50PM +0200, Michal Hocko wrote: > On Mon 15-07-13 18:52:41, Joonsoo Kim wrote: > > We can unify some codes for succeed allocation. > > This makes code more readable. > > There is no functional difference. > > "This patch unifies suc

Re: [PATCH v2 00/10] mm, hugetlb: clean-up and possible bug fix

2013-07-23 Thread Joonsoo Kim
On Mon, Jul 22, 2013 at 09:21:38PM +0530, Aneesh Kumar K.V wrote: > Joonsoo Kim writes: > > > First 6 patches are almost trivial clean-up patches. > > > > The others are for fixing three bugs. > > Perhaps, these problems are minor, because this codes are used > &

Re: [PATCH v2 07/10] mm, hugetlb: do not use a page in page cache for cow optimization

2013-07-24 Thread Joonsoo Kim
On Tue, Jul 23, 2013 at 01:45:50PM +0200, Michal Hocko wrote: > On Mon 22-07-13 17:36:28, Joonsoo Kim wrote: > > Currently, we use a page with mapped count 1 in page cache for cow > > optimization. If we find this condition, we don't allocate a new > > page and copy cont

Re: [PATCH v2 03/10] mm, hugetlb: trivial commenting fix

2013-07-24 Thread Joonsoo Kim
On Wed, Jul 24, 2013 at 09:00:41AM +0800, Wanpeng Li wrote: > On Mon, Jul 22, 2013 at 05:36:24PM +0900, Joonsoo Kim wrote: > >The name of the mutex written in comment is wrong. > >Fix it. > > > >Reviewed-by: Aneesh Kumar K.V > >Acked-by: Hillf Danton > >

[PATCH v2 00/20] mm, hugetlb: remove a hugetlb_instantiation_mutex

2013-08-09 Thread Joonsoo Kim
(Suggested by Naoya) [1] http://lwn.net/Articles/558863/ "[PATCH] mm/hugetlb: per-vma instantiation mutexes" [2] https://lkml.org/lkml/2013/7/22/96 "[PATCH v2 00/10] mm, hugetlb: clean-up and possible bug fix" Joonsoo Kim (20): mm, hugetlb: protect reser

[PATCH v2 19/20] mm, hugetlb: retry if failed to allocate and there is concurrent user

2013-08-09 Thread Joonsoo Kim
get a SIGBUS signal until there is no concurrent user, and so we can ensure that no one gets a SIGBUS if there are enough hugepages. Signed-off-by: Joonsoo Kim diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index e29e28f..981c539 100644 --- a/include/linux/hugetlb.h +++ b/include

[PATCH v2 17/20] mm, hugetlb: move up anon_vma_prepare()

2013-08-09 Thread Joonsoo Kim
If we fail with an already-allocated hugepage, we need some effort to recover properly. So it is better to avoid allocating a hugepage for as long as possible. Move up anon_vma_prepare(), which can fail in an OOM situation. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2372f75

[PATCH v2 20/20] mm, hugetlb: remove a hugetlb_instantiation_mutex

2013-08-09 Thread Joonsoo Kim
Now we have prepared the infrastructure needed to remove this awkward mutex, which serializes all faulting tasks, so remove it. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 0501fe5..f2c3a51 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2504,9 +2504,7

[PATCH v2 06/20] mm, hugetlb: return a reserved page to a reserved pool if failed

2013-08-09 Thread Joonsoo Kim
than we need, because the reserve count already decreased in dequeue_huge_page_vma(). This patch fixes this situation. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 6c8eec2..3f834f1 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -572,6 +572,7 @@ re

[PATCH v2 18/20] mm, hugetlb: clean-up error handling in hugetlb_cow()

2013-08-09 Thread Joonsoo Kim
The current code includes 'Caller expects lock to be held' in every error path. We can clean this up by doing the error handling in one place. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 7e9a651..8743e5c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2500

[PATCH v2 12/20] mm, hugetlb: remove vma_has_reserves()

2013-08-09 Thread Joonsoo Kim
same as vma_has_reserves(), so remove vma_has_reserves(). Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e6c0c77..22ceb04 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -473,39 +473,6 @@ void reset_vma_resv_huge_pages(struct vm_area_struct *vma)

[PATCH v2 14/20] mm, hugetlb: call vma_needs_reservation before entering alloc_huge_page()

2013-08-09 Thread Joonsoo Kim
. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 8dff972..bc666cf 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1110,13 +1110,11 @@ static void vma_commit_reservation(struct hstate *h, } static struct page *alloc_huge_page(struct vm_area_struct *vma

[PATCH v2 13/20] mm, hugetlb: unify chg and avoid_reserve to use_reserve

2013-08-09 Thread Joonsoo Kim
Currently, we have two variables to represent whether we can use a reserved page or not: chg and avoid_reserve. By aggregating these, we can have cleaner code. This makes no functional difference. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 22ceb04

[PATCH v2 15/20] mm, hugetlb: remove a check for return value of alloc_huge_page()

2013-08-09 Thread Joonsoo Kim
Now, alloc_huge_page() only returns -ENOSPC on failure, so we don't need to worry about other return values. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index bc666cf..24de2ca 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2544,7 +2544,6 @@ retry_avoidcopy: new

[PATCH v2 11/20] mm, hugetlb: make vma_resv_map() works for all mapping type

2013-08-09 Thread Joonsoo Kim
Until now, we got the resv_map in two ways according to the mapping type. This makes the code dirty and unreadable, so unify it. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 869c3e0..e6c0c77 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -421,13 +421,24 @@ void

[PATCH v2 16/20] mm, hugetlb: move down outside_reserve check

2013-08-09 Thread Joonsoo Kim
Just move down the outside_reserve check and don't check vma_needs_reservation() when outside_reserve is true. It is a slightly optimized implementation, and it makes the code more readable. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 24de2ca..2372f75 100644 --- a/mm/huge

[PATCH v2 08/20] mm, hugetlb: region manipulation functions take resv_map rather list_head

2013-08-09 Thread Joonsoo Kim
To change the protection method for region tracking to a fine-grained one, we pass the resv_map, instead of the list_head, to the region manipulation functions. This doesn't introduce any functional change; it just prepares for the next step. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c

[PATCH v2 10/20] mm, hugetlb: remove resv_map_put()

2013-08-09 Thread Joonsoo Kim
In a following patch, I change vma_resv_map() to return a resv_map in all cases. This patch prepares for it by removing resv_map_put(), which doesn't work properly with that change because it works only for HPAGE_RESV_OWNER's resv_map, not for all resv_maps. Signed-off-by: Joonsoo Kim

[PATCH v2 07/20] mm, hugetlb: unify region structure handling

2013-08-09 Thread Joonsoo Kim
re to a fine-grained lock, and this difference hinders it. So, before changing it, unify the region structure handling. Signed-off-by: Joonsoo Kim diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index a3f868a..9bf2c4a 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -366,7 +3

[PATCH v2 01/20] mm, hugetlb: protect reserved pages when soft offlining a hugepage

2013-08-09 Thread Joonsoo Kim
Don't use the reserve pool when soft offlining a hugepage. Check that we have free pages outside the reserve pool before we dequeue the huge page. Otherwise, we can steal another task's reserved page. Reviewed-by: Aneesh Kumar Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/huget

[PATCH v2 09/20] mm, hugetlb: protect region tracking via newly introduced resv_map lock

2013-08-09 Thread Joonsoo Kim
ure, so it can be modified by two processes concurrently. To solve this, introduce a lock in resv_map and make the region manipulation functions grab it before they do the actual work. This makes region tracking safe. Signed-off-by: Joonsoo Kim diff --git a/include/linux/hugetlb.h b/include/linux

[PATCH v2 05/20] mm, hugetlb: grab a page_table_lock after page_cache_release

2013-08-09 Thread Joonsoo Kim
We don't need to hold the page_table_lock when we release a page, so defer grabbing it. Reviewed-by: Naoya Horiguchi Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index c017c52..6c8eec2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2627,10 +26

[PATCH v2 03/20] mm, hugetlb: fix subpool accounting handling

2013-08-09 Thread Joonsoo Kim
If we allocate a hugepage with avoid_reserve, we don't dequeue a reserved one. So we should check the subpool counter when avoid_reserve is set. This patch implements it. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 12b6581..ea1ae0a 100644 --- a/mm/hugetlb.c +++ b/mm/huge

[PATCH v2 02/20] mm, hugetlb: change variable name reservations to resv

2013-08-09 Thread Joonsoo Kim
'reservations' is a long name for a variable, and we use 'resv_map' to represent 'struct resv_map' elsewhere. To reduce confusion and unreadability, change it. Reviewed-by: Aneesh Kumar Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index

[PATCH v2 04/20] mm, hugetlb: remove useless check about mapping type

2013-08-09 Thread Joonsoo Kim
is_vma_resv_set(vma, HPAGE_RESV_OWNER) implies that this mapping is private, so we don't need to check whether this mapping is shared or not. This patch is just a clean-up. Signed-off-by: Joonsoo Kim diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ea1ae0a..c017c52 100644 ---

Re: [PATCH 17/18] mm, hugetlb: retry if we fail to allocate a hugepage with use_reserve

2013-08-09 Thread Joonsoo Kim
> I once attempted an approach involving an atomic counter of the number > of "in flight" hugepages, only retrying when it's non zero. Working > out a safe ordering for all the updates to get all the cases right > made my brain melt though, and I never got it working. I sent v2 few seconds before

Re: [PATCH v2 09/20] mm, hugetlb: protect region tracking via newly introduced resv_map lock

2013-08-13 Thread Joonsoo Kim
> > @@ -202,15 +199,27 @@ static long region_chg(struct resv_map *resv, long f, > > long t) > > * Subtle, allocate a new region at the position but make it zero > > * size such that we can guarantee to record the reservation. */ > > if (&rg->link == head || t < rg->from) { > > -

[PATCH] ARM: mm: clean-up in order to reduce to call kmap_high_get()

2013-03-03 Thread Joonsoo Kim
. Signed-off-by: Joonsoo Kim diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index c7e3759..b7711be 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -822,16 +822,16 @@ static void dma_cache_maint_page(struct page *page, unsigned long offset

[RFC PATCH] ARM: mm: disable kmap_high_get() for SMP

2013-03-03 Thread Joonsoo Kim
With SMP, enabling kmap_high_get() makes users of kmap_atomic() sequentially ordered, because kmap_high_get() uses the global kmap_lock(). This is not a welcome situation, so turn off this optimization for SMP. Cc: Nicolas Pitre Signed-off-by: Joonsoo Kim diff --git a/arch/arm/include/asm

[PATCH 2/3] mm, slub: count freed pages via rcu as this task's reclaimed_slab

2013-04-08 Thread Joonsoo Kim
de to count these pages for this task's reclaimed_slab. Cc: Christoph Lameter Cc: Pekka Enberg Cc: Matt Mackall Signed-off-by: Joonsoo Kim diff --git a/mm/slub.c b/mm/slub.c index 4aec537..16fd2d5 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1409,8 +1409,6 @@ static void __free_slab(struct

[PATCH 3/3] mm, slab: count freed pages via rcu as this task's reclaimed_slab

2013-04-08 Thread Joonsoo Kim
de to count these pages for this task's reclaimed_slab. Cc: Christoph Lameter Cc: Pekka Enberg Cc: Matt Mackall Signed-off-by: Joonsoo Kim diff --git a/mm/slab.c b/mm/slab.c index 856e4a1..4d94bcb 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -1934,8 +1934,6 @@ static void kmem_freepa

[PATCH 1/3] mm, vmscan: count accidental reclaimed pages failed to put into lru

2013-04-08 Thread Joonsoo Kim
In shrink_(in)active_list(), we can fail to put pages back onto the lru, and these pages are reclaimed accidentally. Currently, these pages are not counted toward sc->nr_reclaimed, but with this information we can stop reclaiming earlier, and so reduce the overhead of reclaim. Signed-off-by: Joonsoo Kim diff --

Re: [PATCH 08/10] mm: vmscan: Have kswapd shrink slab only once per priority

2013-04-08 Thread Joonsoo Kim
Hello, Mel. Sorry for too late question. On Sun, Mar 17, 2013 at 01:04:14PM +, Mel Gorman wrote: > If kswaps fails to make progress but continues to shrink slab then it'll > either discard all of slab or consume CPU uselessly scanning shrinkers. > This patch causes kswapd to only call the shri

Re: [PATCH 08/10] mm: vmscan: Have kswapd shrink slab only once per priority

2013-04-09 Thread Joonsoo Kim
Hello, Mel. On Tue, Apr 09, 2013 at 12:13:59PM +0100, Mel Gorman wrote: > On Tue, Apr 09, 2013 at 03:53:25PM +0900, Joonsoo Kim wrote: > > Hello, Mel. > > Sorry for too late question. > > > > No need to apologise at all. > > > On Sun, Mar 17, 2013 a

Re: [PATCH 08/10] mm: vmscan: Have kswapd shrink slab only once per priority

2013-04-09 Thread Joonsoo Kim
Hello, Dave. On Wed, Apr 10, 2013 at 11:07:34AM +1000, Dave Chinner wrote: > On Tue, Apr 09, 2013 at 12:13:59PM +0100, Mel Gorman wrote: > > On Tue, Apr 09, 2013 at 03:53:25PM +0900, Joonsoo Kim wrote: > > > > > I think that outside of zone loop is better

Re: [PATCH 2/3] mm, slub: count freed pages via rcu as this task's reclaimed_slab

2013-04-09 Thread Joonsoo Kim
Hello, Christoph. On Tue, Apr 09, 2013 at 02:28:06PM +, Christoph Lameter wrote: > On Tue, 9 Apr 2013, Joonsoo Kim wrote: > > > Currently, freed pages via rcu is not counted for reclaimed_slab, because > > it is freed in rcu context, not current task context. But, this fre

Re: [PATCH 1/3] mm, vmscan: count accidental reclaimed pages failed to put into lru

2013-04-09 Thread Joonsoo Kim
Hello, Minchan. On Tue, Apr 09, 2013 at 02:55:14PM +0900, Minchan Kim wrote: > Hello Joonsoo, > > On Tue, Apr 09, 2013 at 10:21:16AM +0900, Joonsoo Kim wrote: > > In shrink_(in)active_list(), we can fail to put into lru, and these pages > > are reclaimed accidentally. Curre

Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

2013-04-10 Thread Joonsoo Kim
On Wed, Apr 10, 2013 at 09:31:10AM +0300, Pekka Enberg wrote: > On Mon, Apr 8, 2013 at 3:32 PM, Steven Rostedt wrote: > >> > Index: linux/mm/slub.c > >> > === > >> > --- linux.orig/mm/slub.c2013-03-28 12:14:26.958358688 -0500 > >>

Re: [PATCH 2/3] mm, slub: count freed pages via rcu as this task's reclaimed_slab

2013-04-10 Thread JoonSoo Kim
2013/4/10 Christoph Lameter : > On Wed, 10 Apr 2013, Joonsoo Kim wrote: > >> Hello, Christoph. >> >> On Tue, Apr 09, 2013 at 02:28:06PM +, Christoph Lameter wrote: >> > On Tue, 9 Apr 2013, Joonsoo Kim wrote: >> > >> > > Currently, freed pa

[PATCH for-3.10] workqueue: correct handling of the pool spin_lock

2013-04-30 Thread Joonsoo Kim
When mutex_trylock() fails, we release the pool spin_lock and do mutex_lock(). After that, we should re-grab the pool spin_lock, but the re-grab is missing in the current code. So correct it. Cc: Lai Jiangshan Signed-off-by: Joonsoo Kim diff --git a/kernel/workqueue.c b/kernel/workqueue.c index
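
The locking pattern, sketched with pthread primitives (illustrative, not the workqueue code): after sleeping on the mutex, reacquire the spinlock, because the caller still expects it held on return.

    #include <pthread.h>

    static pthread_spinlock_t pool_lock;
    static pthread_mutex_t assoc_mutex = PTHREAD_MUTEX_INITIALIZER;

    __attribute__((constructor))
    static void init_locks(void)
    {
        pthread_spin_init(&pool_lock, PTHREAD_PROCESS_PRIVATE);
    }

    void lock_pool_and_mutex(void)
    {
        pthread_spin_lock(&pool_lock);
        if (pthread_mutex_trylock(&assoc_mutex) != 0) {
            pthread_spin_unlock(&pool_lock); /* must not sleep under a spinlock */
            pthread_mutex_lock(&assoc_mutex);
            pthread_spin_lock(&pool_lock);   /* the re-grab the fix adds */
        }
    }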
