Re: [Mesa-dev] [PATCH 6/6] anv: Do relocations in userspace before execbuf ioctl

Jason Ekstrand Wed, 02 Nov 2016 11:43:17 -0700

On Wed, Nov 2, 2016 at 9:43 AM, Chris Wilson <ch...@chris-wilson.co.uk>
wrote:


> On Tue, Nov 01, 2016 at 03:35:13PM -0700, Jason Ekstrand wrote:
> > +static void
> > +write_reloc(const struct anv_device *device, void *p, uint64_t v)
> > +{
> > +   unsigned reloc_size = 0;
> > +   if (device->info.gen >= 8) {
> > +      /* From the Broadwell PRM Vol. 2a, MI_LOAD_REGISTER_MEM::MemoryAd
> dress:
> > +       *
> > +       *    "This field specifies the address of the memory location
> where the
> > +       *    register value specified in the DWord above will read from.
> The
> > +       *    address specifies the DWord location of the data. Range =
> > +       *    GraphicsVirtualAddress[63:2] for a DWord register
> GraphicsAddress
> > +       *    [63:48] are ignored by the HW and assumed to be in correct
> > +       *    canonical form [63:48] == [47]."
> > +       */
>
> const int shift = 63 - 47; /* != 8 :) */
>

Drp...  I'll use the const int. :-)


> > +      reloc_size = sizeof(uint64_t);
> > +      *(uint64_t *)p = (((int64_t)v) << 8) >> 8;
> > +   } else {
> > +      reloc_size = sizeof(uint32_t);
> > +      *(uint32_t *)p = v;
> > +   }
> > +
> > +   if (!device->info.has_llc)
> > +      anv_clflush_range(p, reloc_size);
> > +}
> > +
> > +static void
> > +anv_reloc_list_apply(struct anv_reloc_list *list,
> > +                     struct anv_device *device, struct anv_bo *bo)
> > +{
> > +   for (size_t i = 0; i < list->num_relocs; i++) {
> > +      void *p = bo->map + list->relocs[i].offset;
> > +
> > +      struct anv_bo *target_bo = list->reloc_bos[i];
> > +      write_reloc(device, p, target_bo->offset + list->relocs[i].delta);
> > +      list->relocs[i].presumed_offset = bo->offset;
> > +   }
> > +}
> > +
> > +/**
> > + * This function applies the relocation for a command buffer and writes
> the
> > + * actual addresses into the buffers as per what we were told by the
> kernel on
> > + * the previous execbuf2 call.  This should be safe to do because, for
> each
> > + * relocated address, we have two cases:
> > + *
> > + *  1) The target BO is inactive (as seen by the kernel).  In this
> case, it is
> > + *     not in use by the GPU so updating the address is 100% ok.  It
> won't be
> > + *     in-use by the GPU (from our context) again until the next
> execbuf2
> > + *     happens.  If the kernel decides to move it in the next execbuf2,
> it
> > + *     will have to do the relocations itself, but that's ok because it
> should
> > + *     have all of the information needed to do so.
> > + *
> > + *  2) The target BO is active (as seen by the kernel).  In this case,
> it
> > + *     hasn't moved since the last execbuffer2 call because GTT
> shuffling
> > + *     *only* happens inside the execbuffer2 ioctl.  Since the target BO
> > + *     hasn't moved, our anv_bo::offset exactly matches the BO's GTT
> address
> > + *     and the relocated value we are writing into the BO will be the
> same as
> > + *     the value that is already there.
> > + *
> > + *     There is also a possibility that the target BO is active but the
> exact
> > + *     RENDER_SURFACE_STATE object we are writing the relocation into
> isn't in
> > + *     use.  In this case, the address currently in the
> RENDER_SURFACE_STATE
> > + *     may be stale but it's still safe to write the relocation because
> that
> > + *     particular RENDER_SURFACE_STATE object isn't in-use by the GPU
> and
> > + *     won't be until the next execbuf2 call.
> > + *
> > + * By doing relocations on the CPU, we can tell the kernel that it
> doesn't
> > + * need to bother.  We want to do this because the surface state buffer
> is
> > + * used by every command buffer so, if the kernel does the relocations,
> it
> > + * will always be busy and the kernel will always stall.  This is also
> > + * probably the fastest mechanism for doing relocations since the
> kernel would
> > + * have to make a full copy of all the relocations lists.
> > + */
> > +static void
> > +relocate_cmd_buffer(struct anv_cmd_buffer *cmd_buffer)
> > +{
> > +   for (uint32_t i = 0; i < cmd_buffer->execbuf2.bo_count; i++) {
> > +      if (cmd_buffer->execbuf2.bos[i]->offset == (uint64_t)-1)
> > +         return;
> > +   }
>
> Note you still have to make sure the reloc.presumed_offset matches the
> value used to compute the contents of the buffer. Old relocations will
> retain their value from the previous pass, new allocations will have
> presumed_offset = 0 and hopefully the content will be delta. (Otherwise
> you run into trouble if the kernel places the buffer at 0, it will then
> skip the relocations pointing to it.)
>

So, I *thought* we were doing this correctly, but I think the current code
is a bit on the busted side.  I'm going to be sending out a v3 that has
some general relocation fixes as well as more comments (along the line of
what you said above)


> That deserves a comment (if only to contrast the two paths).
>
> > +   anv_reloc_list_apply(&cmd_buffer->surface_relocs,
> > +                        cmd_buffer->device,
> > +                        &cmd_buffer->device->surface_s
> tate_block_pool.bo);
> > +
> > +   struct anv_batch_bo **bbo;
> > +   u_vector_foreach(bbo, &cmd_buffer->seen_bbos) {
> > +      anv_reloc_list_apply(&(*bbo)->relocs,
> > +                           cmd_buffer->device, &(*bbo)->bo);
> > +   }
> > +
> > +   for (uint32_t i = 0; i < cmd_buffer->execbuf2.bo_count; i++) {
> > +      struct anv_bo *bo = cmd_buffer->execbuf2.bos[i];
> > +
> > +      cmd_buffer->execbuf2.objects[i].offset = bo->offset;
>
> /* bo->offset = last execobj.offset */
>
> *reloc.offset = bo->offset + delta;
> reloc.presumed_offset = bo->offset;
> execobj.offset = bo->offset;
>
> Looks good.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
>

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 6/6] anv: Do relocations in userspace before execbuf ioctl

Reply via email to