On Mon, Apr 10, 2023 at 3:40 PM Sharma, Shashank
<shashank.sha...@amd.com> wrote:
>
> [AMD Official Use Only - General]
>
> Hello Bas,
>
> This is not the correct interpretation of the code, the USERQ_IOCTL has both 
> the OPs (create and destroy), but th euser has to exclusively call  it.
>
> Please see the sample test program in the existing libDRM series 
> (userq_test.c, it specifically calls amdgpu_free_userq, which does the 
> destroy_OP
>
> for the IOCTL.

In the presence of crashes the kernel should always be able to clean
this up no? Otherwise there is a resource leak?

>
> - Shashank
>
> -----Original Message-----
> From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>
> Sent: 10 April 2023 11:26
> To: Sharma, Shashank <shashank.sha...@amd.com>
> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
> <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>; 
> Koenig, Christian <christian.koe...@amd.com>; Yadav, Arvind 
> <arvind.ya...@amd.com>
> Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues
>
> Hi Shashank,
>
> I think I found the issue: I wasn't destroying the user queue in my program 
> and the kernel doesn't clean up any remaining user queues in the postclose 
> hook. I think we need something like
> https://github.com/BNieuwenhuizen/linux/commit/e90c8d1185da7353c12837973ceddf55ccc85d29
> ?
>
> While running things multiple times now works, I still have problems doing 
> multiple submissions from the same queue. Looking forward to the updated 
> test/sample
>
> Thanks,
> Bas
>
> On Mon, Apr 10, 2023 at 9:32 AM Sharma, Shashank <shashank.sha...@amd.com> 
> wrote:
> >
> > [AMD Official Use Only - General]
> >
> > Hello Bas,
> > Thanks for trying this out.
> >
> > This could be due to the doorbell as you mentioned, Usermode queue uses 
> > doorbell manager internally.
> > This week, we are planning to publis the latest libDRM sample code which 
> > uses a doorbell object (instead of the doorbell hack IOCTL), adapting to 
> > that should fix your problem in my opinion.
> > We have tested this full stack (libDRM test + Usermode queue + doorbell 
> > manager) for 500+ consecutive runs, and it worked well for us.
> >
> > You can use this integrated kernel stack (1+2) from my gitlab to build
> > your kernel:
> > https://gitlab.freedesktop.org/contactshashanksharma/userq-amdgpu/-/tr
> > ee/integrated-db-and-uq-v3 Please stay tuned for updated libDRM
> > changes with doorbell objects.
> >
> > Regards
> > Shashank
> > -----Original Message-----
> > From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>
> > Sent: 10 April 2023 02:37
> > To: Sharma, Shashank <shashank.sha...@amd.com>
> > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> > <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>;
> > Koenig, Christian <christian.koe...@amd.com>
> > Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues
> >
> > Hi Shashank,
> >
> > I tried writing a program to experiment with usermode queues and I found 
> > some weird behavior: The first run of the program works as expected, while 
> > subsequent runs don't seem to do anything (and I allocate everything in 
> > GTT, so it should be zero initialized consistently). Is this a known issue?
> >
> > The linked libdrm code for the uapi still does a doorbell ioctl so it could 
> > very well be that I do the doorbell wrong (especially since the ioctl 
> > implementation was never shared AFAICT?), but it seems like the kernel 
> > submissions (i.e. write wptr in dwords to the wptr va and to the doorbell). 
> > Is it possible to update the test in libdrm?
> >
> > Code: https://gitlab.freedesktop.org/bnieuwenhuizen/usermode-queue
> >
> > Thanks,
> > Bas
> >
> > On Wed, Mar 29, 2023 at 6:05 PM Shashank Sharma <shashank.sha...@amd.com> 
> > wrote:
> > >
> > > This patch series introduces AMDGPU usermode queues for gfx workloads.
> > > Usermode queues is a method of GPU workload submission into the
> > > graphics hardware without any interaction with kernel/DRM schedulers.
> > > In this method, a userspace graphics application can create its own
> > > workqueue and submit it directly in the GPU HW.
> > >
> > > The general idea of how this is supposed to work:
> > > - The application creates the following GPU objetcs:
> > >   - A queue object to hold the workload packets.
> > >   - A read pointer object.
> > >   - A write pointer object.
> > >   - A doorbell page.
> > > - The application picks a 32-bit offset in the doorbell page for this 
> > > queue.
> > > - The application uses the usermode_queue_create IOCTL introduced in
> > >   this patch, by passing the the GPU addresses of these objects (read
> > >   ptr, write ptr, queue base address and 32-bit doorbell offset from
> > >   the doorbell page)
> > > - The kernel creates the queue and maps it in the HW.
> > > - The application can start submitting the data in the queue as soon as
> > >   the kernel IOCTL returns.
> > > - After filling the workload data in the queue, the app must write the
> > >   number of dwords added in the queue into the doorbell offset, and the
> > >   GPU will start fetching the data.
> > >
> > > libDRM changes for this series and a sample DRM test program can be
> > > found in the MESA merge request here:
> > > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/287
> > >
> > > This patch series depends on the doorbell-manager changes, which are
> > > being reviewed here:
> > > https://patchwork.freedesktop.org/series/115802/
> > >
> > > Alex Deucher (1):
> > >   drm/amdgpu: UAPI for user queue management
> > >
> > > Arvind Yadav (2):
> > >   drm/amdgpu: add new parameters in v11_struct
> > >   drm/amdgpu: map wptr BO into GART
> > >
> > > Shashank Sharma (6):
> > >   drm/amdgpu: add usermode queue base code
> > >   drm/amdgpu: add new IOCTL for usermode queue
> > >   drm/amdgpu: create GFX-gen11 MQD for userqueue
> > >   drm/amdgpu: create context space for usermode queue
> > >   drm/amdgpu: map usermode queue into MES
> > >   drm/amdgpu: generate doorbell index for userqueue
> > >
> > >  drivers/gpu/drm/amd/amdgpu/Makefile           |   3 +
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  10 +-
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c       |   6 +
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 298
> > > ++++++++++++++++++  .../drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c |
> > > ++++++++++++++++++ 230 ++++++++++++++
> > >  .../gpu/drm/amd/include/amdgpu_userqueue.h    |  66 ++++
> > >  drivers/gpu/drm/amd/include/v11_structs.h     |  16 +-
> > >  include/uapi/drm/amdgpu_drm.h                 |  55 ++++
> > >  9 files changed, 677 insertions(+), 9 deletions(-)  create mode
> > > 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> > >  create mode 100644
> > > drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c
> > >  create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h
> > >
> > > --
> > > 2.40.0
> > >

Reply via email to