On Mon, 15 Jun 2015 17:52:07 +0200 Jesper Dangaard Brouer <bro...@redhat.com> 
wrote:

> From: Christoph Lameter <c...@linux.com>
> 
> [NOTICE: Already in AKPM's quilt-queue]
> 
> First piece: acceleration of retrieval of per cpu objects
> 
> If we are allocating lots of objects then it is advantageous to disable
> interrupts and avoid the this_cpu_cmpxchg() operation to get these objects
> faster.
> 
> Note that we cannot do the fast operation if debugging is enabled, because
> we would have to add extra code to do all the debugging checks.  And it
> would not be fast anyway.
> 
> Note also that the requirement of having interrupts disabled
> avoids having to do processor flag operations.
> 
> Allocate as many objects as possible in the fast way and then fall back to
> the generic implementation for the rest of the objects.
> 
> ...
>
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2759,7 +2759,32 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
>  bool kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
>                                                               void **p)
>  {
> -     return kmem_cache_alloc_bulk(s, flags, size, p);
> +     if (!kmem_cache_debug(s)) {
> +             struct kmem_cache_cpu *c;
> +
> +             /* Drain objects in the per cpu slab */
> +             local_irq_disable();
> +             c = this_cpu_ptr(s->cpu_slab);
> +
> +             while (size) {
> +                     void *object = c->freelist;
> +
> +                     if (!object)
> +                             break;
> +
> +                     c->freelist = get_freepointer(s, object);
> +                     *p++ = object;
> +                     size--;
> +
> +                     if (unlikely(flags & __GFP_ZERO))
> +                             memset(object, 0, s->object_size);
> +             }
> +             c->tid = next_tid(c->tid);
> +
> +             local_irq_enable();

It might be worth adding

                if (!size)
                        return true;

here, to avoid the pointless call to __kmem_cache_alloc_bulk().

Whether that is worthwhile depends on how often this loop satisfies the
entire request.  Do you know what the typical success rate is?

> +     }
> +
> +     return __kmem_cache_alloc_bulk(s, flags, size, p);
>  }
>  EXPORT_SYMBOL(kmem_cache_alloc_bulk);
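
To make the suggested early return concrete, here is a minimal userspace
sketch of the drain loop, not the kernel code.  All toy_* names are
hypothetical; the freelist is modelled by storing the next-free pointer in
the first bytes of each free object, and interrupt disabling and the tid
update are left out.  toy_slow_alloc_bulk() only stands in for
__kmem_cache_alloc_bulk() and counts how often it is entered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-cpu slab state used by the patch. */
struct toy_cache {
	void *freelist;           /* head of the free-object list */
	size_t object_size;       /* bytes per object, a multiple of sizeof(void *) */
	unsigned long slow_calls; /* number of times the slow path was entered */
};

/*
 * Stand-in for __kmem_cache_alloc_bulk().  This toy has no backing
 * allocator, so it only records the call and reports failure.
 */
static bool toy_slow_alloc_bulk(struct toy_cache *c, size_t size, void **p)
{
	(void)size;
	(void)p;
	c->slow_calls++;
	return false;
}

/* Drain the freelist first, then fall back only if objects are still owed. */
static bool toy_alloc_bulk(struct toy_cache *c, bool zero, size_t size, void **p)
{
	while (size) {
		void *object = c->freelist;

		if (!object)
			break;

		/* The next-free pointer lives at the start of the object. */
		c->freelist = *(void **)object;
		*p++ = object;
		size--;

		if (zero)
			memset(object, 0, c->object_size);
	}

	/* The early return under discussion: skip the slow path when done. */
	if (!size)
		return true;

	return toy_slow_alloc_bulk(c, size, p);
}

/* Thread n objects from buf onto the freelist, lowest address first. */
static void toy_fill(struct toy_cache *c, void *buf, size_t n)
{
	char *base = buf;

	c->freelist = NULL;
	for (size_t i = n; i-- > 0; ) {
		void *obj = base + i * c->object_size;

		*(void **)obj = c->freelist;
		c->freelist = obj;
	}
}
```

With the check in place, a request that the freelist can satisfy in full
never enters the fallback.  How much that saves in the real allocator
depends on the hit rate asked about above.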

