On Mon, Apr 10, 2023 at 4:01 PM Shashank Sharma <shashank.sha...@amd.com> wrote:
>
>
> On 10/04/2023 15:46, Bas Nieuwenhuizen wrote:
> > On Mon, Apr 10, 2023 at 3:40 PM Sharma, Shashank
> > <shashank.sha...@amd.com> wrote:
> >> [AMD Official Use Only - General]
> >>
> >> Hello Bas,
> >>
> >> This is not the correct interpretation of the code, the USERQ_IOCTL has 
> >> both the OPs (create and destroy), but th euser has to exclusively call  
> >> it.
> >>
> >> Please see the sample test program in the existing libDRM series 
> >> (userq_test.c, it specifically calls amdgpu_free_userq, which does the 
> >> destroy_OP
> >>
> >> for the IOCTL.
> > In the presence of crashes the kernel should always be able to clean
> > this up no? Otherwise there is a resource leak?
>
> The crash handling is the same as any of the existing GPU resource which
> are allocated and freed with IOCTL_OPs.

Most of those are handled in the when the DRM fd gets closed (i.e.
when the process exits):

- buffers through drm_gem_release()
- mappings in amdgpu_vm_fini
- contexts in amdgpu_ctx_mgr_fini

etc.

Why would we do things differently for userspace queues? It doesn't
look complicated looking at the above patch (which does seem to work).


>
> To be honest a crash handling can be very elaborated and complex one,
> and hence only can be done at the driver unload IMO, which doesn't help
> at that stage,
>
> coz anyways driver will re-allocate the resources on next load.
>
> - Shashank
>
> >
> >> - Shashank
> >>
> >> -----Original Message-----
> >> From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>
> >> Sent: 10 April 2023 11:26
> >> To: Sharma, Shashank <shashank.sha...@amd.com>
> >> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
> >> <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>; 
> >> Koenig, Christian <christian.koe...@amd.com>; Yadav, Arvind 
> >> <arvind.ya...@amd.com>
> >> Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues
> >>
> >> Hi Shashank,
> >>
> >> I think I found the issue: I wasn't destroying the user queue in my 
> >> program and the kernel doesn't clean up any remaining user queues in the 
> >> postclose hook. I think we need something like
> >> https://github.com/BNieuwenhuizen/linux/commit/e90c8d1185da7353c12837973ceddf55ccc85d29
> >> ?
> >>
> >> While running things multiple times now works, I still have problems doing 
> >> multiple submissions from the same queue. Looking forward to the updated 
> >> test/sample
> >>
> >> Thanks,
> >> Bas
> >>
> >> On Mon, Apr 10, 2023 at 9:32 AM Sharma, Shashank <shashank.sha...@amd.com> 
> >> wrote:
> >>> [AMD Official Use Only - General]
> >>>
> >>> Hello Bas,
> >>> Thanks for trying this out.
> >>>
> >>> This could be due to the doorbell as you mentioned, Usermode queue uses 
> >>> doorbell manager internally.
> >>> This week, we are planning to publis the latest libDRM sample code which 
> >>> uses a doorbell object (instead of the doorbell hack IOCTL), adapting to 
> >>> that should fix your problem in my opinion.
> >>> We have tested this full stack (libDRM test + Usermode queue + doorbell 
> >>> manager) for 500+ consecutive runs, and it worked well for us.
> >>>
> >>> You can use this integrated kernel stack (1+2) from my gitlab to build
> >>> your kernel:
> >>> https://gitlab.freedesktop.org/contactshashanksharma/userq-amdgpu/-/tr
> >>> ee/integrated-db-and-uq-v3 Please stay tuned for updated libDRM
> >>> changes with doorbell objects.
> >>>
> >>> Regards
> >>> Shashank
> >>> -----Original Message-----
> >>> From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>
> >>> Sent: 10 April 2023 02:37
> >>> To: Sharma, Shashank <shashank.sha...@amd.com>
> >>> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander
> >>> <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>;
> >>> Koenig, Christian <christian.koe...@amd.com>
> >>> Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues
> >>>
> >>> Hi Shashank,
> >>>
> >>> I tried writing a program to experiment with usermode queues and I found 
> >>> some weird behavior: The first run of the program works as expected, 
> >>> while subsequent runs don't seem to do anything (and I allocate 
> >>> everything in GTT, so it should be zero initialized consistently). Is 
> >>> this a known issue?
> >>>
> >>> The linked libdrm code for the uapi still does a doorbell ioctl so it 
> >>> could very well be that I do the doorbell wrong (especially since the 
> >>> ioctl implementation was never shared AFAICT?), but it seems like the 
> >>> kernel submissions (i.e. write wptr in dwords to the wptr va and to the 
> >>> doorbell). Is it possible to update the test in libdrm?
> >>>
> >>> Code: https://gitlab.freedesktop.org/bnieuwenhuizen/usermode-queue
> >>>
> >>> Thanks,
> >>> Bas
> >>>
> >>> On Wed, Mar 29, 2023 at 6:05 PM Shashank Sharma <shashank.sha...@amd.com> 
> >>> wrote:
> >>>> This patch series introduces AMDGPU usermode queues for gfx workloads.
> >>>> Usermode queues is a method of GPU workload submission into the
> >>>> graphics hardware without any interaction with kernel/DRM schedulers.
> >>>> In this method, a userspace graphics application can create its own
> >>>> workqueue and submit it directly in the GPU HW.
> >>>>
> >>>> The general idea of how this is supposed to work:
> >>>> - The application creates the following GPU objetcs:
> >>>>    - A queue object to hold the workload packets.
> >>>>    - A read pointer object.
> >>>>    - A write pointer object.
> >>>>    - A doorbell page.
> >>>> - The application picks a 32-bit offset in the doorbell page for this 
> >>>> queue.
> >>>> - The application uses the usermode_queue_create IOCTL introduced in
> >>>>    this patch, by passing the the GPU addresses of these objects (read
> >>>>    ptr, write ptr, queue base address and 32-bit doorbell offset from
> >>>>    the doorbell page)
> >>>> - The kernel creates the queue and maps it in the HW.
> >>>> - The application can start submitting the data in the queue as soon as
> >>>>    the kernel IOCTL returns.
> >>>> - After filling the workload data in the queue, the app must write the
> >>>>    number of dwords added in the queue into the doorbell offset, and the
> >>>>    GPU will start fetching the data.
> >>>>
> >>>> libDRM changes for this series and a sample DRM test program can be
> >>>> found in the MESA merge request here:
> >>>> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/287
> >>>>
> >>>> This patch series depends on the doorbell-manager changes, which are
> >>>> being reviewed here:
> >>>> https://patchwork.freedesktop.org/series/115802/
> >>>>
> >>>> Alex Deucher (1):
> >>>>    drm/amdgpu: UAPI for user queue management
> >>>>
> >>>> Arvind Yadav (2):
> >>>>    drm/amdgpu: add new parameters in v11_struct
> >>>>    drm/amdgpu: map wptr BO into GART
> >>>>
> >>>> Shashank Sharma (6):
> >>>>    drm/amdgpu: add usermode queue base code
> >>>>    drm/amdgpu: add new IOCTL for usermode queue
> >>>>    drm/amdgpu: create GFX-gen11 MQD for userqueue
> >>>>    drm/amdgpu: create context space for usermode queue
> >>>>    drm/amdgpu: map usermode queue into MES
> >>>>    drm/amdgpu: generate doorbell index for userqueue
> >>>>
> >>>>   drivers/gpu/drm/amd/amdgpu/Makefile           |   3 +
> >>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  10 +-
> >>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +
> >>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c       |   6 +
> >>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 298
> >>>> ++++++++++++++++++  .../drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c |
> >>>> ++++++++++++++++++ 230 ++++++++++++++
> >>>>   .../gpu/drm/amd/include/amdgpu_userqueue.h    |  66 ++++
> >>>>   drivers/gpu/drm/amd/include/v11_structs.h     |  16 +-
> >>>>   include/uapi/drm/amdgpu_drm.h                 |  55 ++++
> >>>>   9 files changed, 677 insertions(+), 9 deletions(-)  create mode
> >>>> 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
> >>>>   create mode 100644
> >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c
> >>>>   create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h
> >>>>
> >>>> --
> >>>> 2.40.0
> >>>>

Reply via email to