On Sun, May 04, 2014 at 04:48:24PM +0530, akash.g...@intel.com wrote:
> From: Akash Goel <akash.g...@intel.com>
> 
> This patch could help reduce the time 'struct_mutex' is kept locked
> in either the exec-buffer path or the page-fault handling path, as
> the backing pages are now requested from the shmem layer without
> holding 'struct_mutex'.

I'm not keen on having an extra pass inside execbuffer for what should
be a relatively rare case, but from the discussion, adding a flag to
create2 (I915_CREATE_POPULATE) should cover the use case nicely (being
able to pay the shmemfs allocation+zero price without holding
struct_mutex).
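
Roughly, on the uapi side, something like this (just a sketch; the flag
name and the create2 struct layout here are assumptions, since create2
is still an unmerged proposal):

/* Hypothetical flag for a create2-style ioctl: populate the shmemfs
 * backing store at creation time, i.e. before the object is ever used
 * and before struct_mutex is needed.
 */
#define I915_CREATE_POPULATE (1 << 0)

struct drm_i915_gem_create2 {
	__u64 size;	/* in: object size in bytes */
	__u32 flags;	/* in: e.g. I915_CREATE_POPULATE */
	__u32 handle;	/* out: handle of the new object */
};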

> +void
> +i915_gem_object_shmem_preallocate(struct drm_i915_gem_object *obj)

This is unlocked. We have to be loud and clear about that:

static void __i915_gem_object_shmemfs_populate(obj)

> +{
> +     int page_count, i;
> +     struct address_space *mapping;
> +     struct page *page;
> +     gfp_t gfp;
> +
> +     if (obj->pages)
> +             return;

if (READ_ONCE(obj->pages)) return;

> +     if (obj->madv != I915_MADV_WILLNEED) {
> +             DRM_ERROR("Attempting to preallocate a purgeable object\n");

Ideally if (READ_ONCE(obj->madv)...

However, that requires a patch to move madv to unsigned obj->flags.

A user controllable *ERROR*? No, just make it debug.
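
Putting those together, the unlocked entry checks would look roughly
like this (a sketch only; READ_ONCE(obj->madv) assumes madv has first
been moved out of its bitfield into obj->flags or similar):

static void __i915_gem_object_shmemfs_populate(struct drm_i915_gem_object *obj)
{
	/* Unlocked peek: if backing pages already exist there is nothing
	 * to prefault, and racing against a real allocation is harmless.
	 */
	if (READ_ONCE(obj->pages))
		return;

	/* Userspace may mark the object purgeable at any time, so this is
	 * only a hint - worth a debug message, not an error.
	 */
	if (READ_ONCE(obj->madv) != I915_MADV_WILLNEED) {
		DRM_DEBUG_DRIVER("not populating a purgeable object\n");
		return;
	}

	/* ... continue with the shmem_read_mapping_page_gfp() loop ... */
}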

> +             return;
> +     }
> +
> +     if (obj->base.filp) {
> +             int ret;
> +             struct inode *inode = file_inode(obj->base.filp);
> +             struct shmem_inode_info *info = SHMEM_I(inode);
> +             if (!inode)
> +                     return;

Assume it exists. If it doesn't, it is an internal bug, so we should
just GPF out of here.

> +             /* The alloced field stores how many data pages are allocated
> +              * to the file. If already shmem space has been allocated for
> +              * the object, then we can simply return */
> +             spin_lock(&info->lock);
> +             ret = info->alloced;
> +             spin_unlock(&info->lock);
> +             if (ret > 0) {
> +                     DRM_DEBUG_DRIVER("Already shmem space alloced for obj %p, %d pages\n",
> +                                      obj, ret);
> +                     return;
> +             }

I'm not convinced this is of much use though. If the page is already
allocated, shmem_read_mapping_page_gfp() should be quick. And given the
suggestion that we move this to create, it is clear that this won't be
an issue.

> +     } else
> +             return;
> +
> +     BUG_ON(obj->pages_pin_count);

Requires struct_mutex; ignore.

> +
> +     /* Assert that the object is not currently in any GPU domain. As it
> +      * wasn't in the GTT, there shouldn't be any way it could have been in
> +      * a GPU cache
> +      */
> +     BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> +     BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);

Requires struct_mutex; ignore.

> +     trace_i915_gem_obj_prealloc_start(obj, obj->base.size);

You only need to pass obj to the tracer; it can work out the size
itself.
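
Something along the lines of the existing object tracepoints (a sketch;
the field layout here is illustrative, not the actual definition):

TRACE_EVENT(i915_gem_obj_prealloc_start,
	    TP_PROTO(struct drm_i915_gem_object *obj),
	    TP_ARGS(obj),

	    TP_STRUCT__entry(
			     __field(struct drm_i915_gem_object *, obj)
			     __field(u32, size)
			     ),

	    TP_fast_assign(
			   __entry->obj = obj;
			   /* derive the size once here, not at every call site */
			   __entry->size = obj->base.size;
			   ),

	    TP_printk("obj=%p, size=%u", __entry->obj, __entry->size)
);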

> +     page_count = obj->base.size / PAGE_SIZE;
> +
> +     /* Get the list of pages out of our struct file
> +      * Fail silently without starting the shrinker
> +      */
> +     mapping = file_inode(obj->base.filp)->i_mapping;
> +     gfp = mapping_gfp_mask(mapping);
> +     gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
> +     gfp &= ~(__GFP_IO | __GFP_WAIT);

Interesting. Disabling the shrinker/oom for these pages - we will hit
them later, of course. But for the purposes of a quick preallocation,
this seems fine.

> +     for (i = 0; i < page_count; i++) {
> +             page = shmem_read_mapping_page_gfp(mapping, i, gfp);
> +             if (IS_ERR(page)) {
> +                     DRM_DEBUG_DRIVER("Failure for obj(%p), size(%x) at page(%d)\n",
> +                                      obj, obj->base.size, i);
> +                     break;
> +             }
> +             /* Decrement the extra ref count on the returned page,
> +                otherwise when 'get_pages_gtt' will be called later on
> +                in the regular path, it will also increment the ref count,
> +                which will disturb the ref count management */
> +             page_cache_release(page);
> +     }
> +
> +     trace_i915_gem_obj_prealloc_end(obj);
> +}
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre