i915: Clearing buffer objects via blitter engine

Tvrtko Ursulin Thu, 02 Jul 2015 02:31:16 -0700


On 07/01/2015 05:30 PM, Chris Wilson wrote:

On Wed, Jul 01, 2015 at 03:54:55PM +0100, Tvrtko Ursulin wrote:

+static int i915_gem_exec_flush_object(struct drm_i915_gem_object *obj,
+                                     struct intel_engine_cs *ring,
+                                     struct intel_context *ctx,
+                                     struct drm_i915_gem_request **req)
+{
+       int ret;
+
+       ret = i915_gem_object_sync(obj, ring, req);
+       if (ret)
+               return ret;
+
+       if (obj->base.write_domain & I915_GEM_DOMAIN_CPU) {
+               if (i915_gem_clflush_object(obj, false))
+                       i915_gem_chipset_flush(obj->base.dev);
+               obj->base.write_domain &= ~I915_GEM_DOMAIN_CPU;
+       }
+       if (obj->base.write_domain & I915_GEM_DOMAIN_GTT) {
+               wmb();
+               obj->base.write_domain &= ~I915_GEM_DOMAIN_GTT;
+       }


All this could be replaced with i915_gem_object_set_to_gtt_domain, no?


No. Technically this is i915_gem_execbuffer_move_to_gpu().

Aha.. I see now what my confusion was. It doesn't help that i915_gem_execbuffer_move_to_gpu and execlist_move_to_gpu are implemented at different places logically.

It would be nice to extract the loop body and call it something like i915_gem_execbuffer_move_vma_to_gpu; it would avoid at least three instances of the same code.
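
A minimal userspace sketch of that factoring, under stated assumptions: the per-object write-domain flush is pulled into one helper that every call site could share. The struct, the flag values and the name mock_move_obj_to_gpu are hypothetical stand-ins, not the driver's API; the comments mark where the real code calls i915_gem_clflush_object() and wmb().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the I915_GEM_DOMAIN_* flags. */
#define MOCK_DOMAIN_CPU 0x1u
#define MOCK_DOMAIN_GTT 0x40u

/* Hypothetical, reduced stand-in for drm_i915_gem_object. */
struct mock_obj {
	uint32_t write_domain;
	bool needs_clflush;	/* what i915_gem_clflush_object() would report */
};

/*
 * Shared loop body: flush any pending CPU/GTT writes of one object.
 * Returns true when the caller must follow up with a chipset flush.
 */
static bool mock_move_obj_to_gpu(struct mock_obj *obj)
{
	bool chipset_flush = false;

	if (obj->write_domain & MOCK_DOMAIN_CPU) {
		/* real code: i915_gem_clflush_object(obj, false) */
		chipset_flush = obj->needs_clflush;
		obj->write_domain &= ~MOCK_DOMAIN_CPU;
	}
	if (obj->write_domain & MOCK_DOMAIN_GTT) {
		/* real code: wmb() */
		obj->write_domain &= ~MOCK_DOMAIN_GTT;
	}
	return chipset_flush;
}
```

Each caller would then loop over its objects, OR the return values together, and issue a single chipset flush at the end if any of them asked for one.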

+
+       return i915.enable_execlists ?
+                       logical_ring_invalidate_all_caches(*req) :
+                       intel_ring_invalidate_all_caches(*req);


And this is done on actual submission for you by the lower levels so
you don't need to call it directly.


What submission? We don't build a batch, we are building a raw request
to do the operation from the ring.


I was confused as to where execlist_move_to_gpu is in the stack.

+       lockdep_assert_held(&dev->struct_mutex);


I think there was some guidance that lockdep_assert_held is
compiled out when lockdep is not in the kernel and that WARN_ON is
preferred. In this case that would probably be WARN_ON_ONCE and
return error.


Hah, this predates that and I still disagree.

Predates or not is not relevant. :) It is not a clear-cut situation, I agree. Maybe we need our own amalgamation of WARN_ON_ONCE and lockdep_assert_held, but I think we either check for these things or not, or have a really good assurance of test coverage with lockdep enabled during QA.
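
To make the "amalgamation" idea concrete, here is a rough userspace sketch, not a proposal for the actual macro. Everything in it is hypothetical: mock_mutex stands in for a lock that can report its state, and mock_warn_unlocked mimics what a combination of WARN_ON_ONCE and lockdep_assert_held might do, that is, warn once and let the caller return an error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a mutex that can report whether it is held. */
struct mock_mutex {
	bool locked;
};

/* Number of warnings printed so far, kept only so the sketch is observable. */
static int mock_warn_count;

/*
 * Returns true when the lock is NOT held, so the caller can bail out.
 * Warns only the first time for a given call site (tracked by *warned_once).
 */
static bool mock_warn_unlocked(const struct mock_mutex *m, bool *warned_once,
			       const char *func)
{
	if (m->locked)
		return false;

	if (!*warned_once) {
		*warned_once = true;
		mock_warn_count++;
		fprintf(stderr, "WARN: lock not held in %s\n", func);
	}
	return true;
}

/* Hypothetical caller: checks the lock, warns once, returns an error. */
static int mock_clear_object(const struct mock_mutex *struct_mutex)
{
	static bool warned;

	if (mock_warn_unlocked(struct_mutex, &warned, __func__))
		return -EINVAL;

	/* ... the operation proper would go here ... */
	return 0;
}
```

The point of the shape is that the check is always compiled in and the failure is handled, while the log is not flooded if the same call site misbehaves repeatedly.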

+       ring = &dev_priv->ring[HAS_BLT(dev) ? BCS : RCS];
+       ctx = i915_gem_context_get(file_priv, DEFAULT_CONTEXT_HANDLE);
+       /* Allocate a request for this operation nice and early. */
+       ret = i915_gem_request_alloc(ring, ctx, &req);
+       if (ret)
+               return ret;
+
+       if (ctx->ppgtt)
+               vm = &ctx->ppgtt->base;
+       else
+               vm = &dev_priv->gtt.base;
+
+       if (i915.enable_execlists && !ctx->engine[ring->id].state) {
+               ret = intel_lr_context_deferred_create(ctx, ring);


i915_gem_context_get above and this call are very similar to what
i915_gem_validate_context does. It seems to me it would be better to
call the latter function here.


No, the intel_lrc API is absolute garbage and needs to be taken out the
back and shot. Until that is done, I wouldn't bother continuing to try
and use the interface at all.

All that needs to happen here is:

req = i915_gem_request_alloc(ring, ring->default_context);

and for the request/lrc to go off and dtrt.

Well.. In the meantime, why duplicate it when i915_gem_validate_context does i915_gem_context_get and deferred create if needed? I don't see the downside. Snippet from above becomes:


  ring = &dev_priv->ring[HAS_BLT(dev) ? BCS : RCS];
  ctx = i915_gem_validate_context(dev, file, ring,
                                DEFAULT_CONTEXT_HANDLE);
  ... handle error...
  /* Allocate a request for this operation nice and early. */
  ret = i915_gem_request_alloc(ring, ctx, &req);

Why would this code have to know about deferred create?

+       }
+
+       ringbuf = ctx->engine[ring->id].ringbuf;
+
+       ret = i915_gem_object_pin(obj, vm, PAGE_SIZE, 0);
+       if (ret)
+               return ret;
+
+       if (obj->tiling_mode && INTEL_INFO(dev)->gen <= 3) {
+               ret = i915_gem_object_put_fence(obj);
+               if (ret)
+                       goto unpin;
+       }


Why is this needed?


Because it's a requirement of the op being used on those gen. Later gen
can keep the fence.

Could it be called unconditionally and still work?

At least I would recommend a comment explaining it.

It is ugly to sprinkle platform knowledge to the callers - I think I saw call sites which call i915_gem_object_put_fence unconditionally, so why would that not work here?

+       if (i915.enable_execlists) {
+               if (dev_priv->info.gen >= 8) {
+                       ret = intel_logical_ring_begin(req, 8);
+                       if (ret)
+                               goto unpin;
+
+                       intel_logical_ring_emit(ringbuf, GEN8_COLOR_BLT_CMD |
+                                                        BLT_WRITE_RGBA |
+                                                        (7-2));
+                       intel_logical_ring_emit(ringbuf, BPP_32 |
+                                                        ROP_FILL_COPY |
+                                                        PAGE_SIZE);
+                       intel_logical_ring_emit(ringbuf, 0);
+                       intel_logical_ring_emit(ringbuf,
+                                               obj->base.size >> PAGE_SHIFT
+                                               << 16 | PAGE_SIZE / 4);
+                       intel_logical_ring_emit(ringbuf,
+                                               i915_gem_obj_offset(obj, vm));
+                       intel_logical_ring_emit(ringbuf, 0);
+                       intel_logical_ring_emit(ringbuf, 0);
+                       intel_logical_ring_emit(ringbuf, MI_NOOP);
+
+                       intel_logical_ring_advance(ringbuf);
+               } else {
+                       DRM_ERROR("Execlists not supported for gen %d\n",
+                                 dev_priv->info.gen);
+                       ret = -EINVAL;


I would put a WARN_ON_ONCE here, or even just return -EINVAL. If the
driver is so messed up in general that execlists are enabled < gen8
I think there is no point logging errors about it from here. Would
also save you one indentation level.


I would just rewrite this to have a logical interface to the rings. Oh
wait, I did.

That is out of my jurisdiction, but I think my comment on the above is not an unreasonable one, since it indicates total driver confusion and could/should be handled somewhere else.
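
For illustration only, the early-return shape suggested for the execlists check could look roughly like the userspace sketch below. The emitter functions and mock_emit_clear are hypothetical placeholders, and the sketch assumes a legacy ring path follows the execlists branch; the impossible configuration is rejected up front, without logging, which removes one level of nesting.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical emitters; each returns 0 on success. */
static int mock_emit_execlists_blt(void) { return 0; }
static int mock_emit_legacy_blt(void) { return 0; }

static int mock_emit_clear(bool enable_execlists, int gen)
{
	/* Execlists below gen8 means the driver is already confused: just fail. */
	if (enable_execlists && gen < 8)
		return -EINVAL;

	if (enable_execlists)
		return mock_emit_execlists_blt();

	return mock_emit_legacy_blt();
}
```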


Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
