On Mon, Apr 10, 2023 at 4:01 PM Shashank Sharma <shashank.sha...@amd.com> wrote: > > > On 10/04/2023 15:46, Bas Nieuwenhuizen wrote: > > On Mon, Apr 10, 2023 at 3:40 PM Sharma, Shashank > > <shashank.sha...@amd.com> wrote: > >> [AMD Official Use Only - General] > >> > >> Hello Bas, > >> > >> This is not the correct interpretation of the code, the USERQ_IOCTL has > >> both the OPs (create and destroy), but th euser has to exclusively call > >> it. > >> > >> Please see the sample test program in the existing libDRM series > >> (userq_test.c, it specifically calls amdgpu_free_userq, which does the > >> destroy_OP > >> > >> for the IOCTL. > > In the presence of crashes the kernel should always be able to clean > > this up no? Otherwise there is a resource leak? > > The crash handling is the same as any of the existing GPU resource which > are allocated and freed with IOCTL_OPs.
Most of those are handled in the when the DRM fd gets closed (i.e. when the process exits): - buffers through drm_gem_release() - mappings in amdgpu_vm_fini - contexts in amdgpu_ctx_mgr_fini etc. Why would we do things differently for userspace queues? It doesn't look complicated looking at the above patch (which does seem to work). > > To be honest a crash handling can be very elaborated and complex one, > and hence only can be done at the driver unload IMO, which doesn't help > at that stage, > > coz anyways driver will re-allocate the resources on next load. > > - Shashank > > > > >> - Shashank > >> > >> -----Original Message----- > >> From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> > >> Sent: 10 April 2023 11:26 > >> To: Sharma, Shashank <shashank.sha...@amd.com> > >> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > >> <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>; > >> Koenig, Christian <christian.koe...@amd.com>; Yadav, Arvind > >> <arvind.ya...@amd.com> > >> Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues > >> > >> Hi Shashank, > >> > >> I think I found the issue: I wasn't destroying the user queue in my > >> program and the kernel doesn't clean up any remaining user queues in the > >> postclose hook. I think we need something like > >> https://github.com/BNieuwenhuizen/linux/commit/e90c8d1185da7353c12837973ceddf55ccc85d29 > >> ? > >> > >> While running things multiple times now works, I still have problems doing > >> multiple submissions from the same queue. Looking forward to the updated > >> test/sample > >> > >> Thanks, > >> Bas > >> > >> On Mon, Apr 10, 2023 at 9:32 AM Sharma, Shashank <shashank.sha...@amd.com> > >> wrote: > >>> [AMD Official Use Only - General] > >>> > >>> Hello Bas, > >>> Thanks for trying this out. > >>> > >>> This could be due to the doorbell as you mentioned, Usermode queue uses > >>> doorbell manager internally. > >>> This week, we are planning to publis the latest libDRM sample code which > >>> uses a doorbell object (instead of the doorbell hack IOCTL), adapting to > >>> that should fix your problem in my opinion. > >>> We have tested this full stack (libDRM test + Usermode queue + doorbell > >>> manager) for 500+ consecutive runs, and it worked well for us. > >>> > >>> You can use this integrated kernel stack (1+2) from my gitlab to build > >>> your kernel: > >>> https://gitlab.freedesktop.org/contactshashanksharma/userq-amdgpu/-/tr > >>> ee/integrated-db-and-uq-v3 Please stay tuned for updated libDRM > >>> changes with doorbell objects. > >>> > >>> Regards > >>> Shashank > >>> -----Original Message----- > >>> From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> > >>> Sent: 10 April 2023 02:37 > >>> To: Sharma, Shashank <shashank.sha...@amd.com> > >>> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > >>> <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>; > >>> Koenig, Christian <christian.koe...@amd.com> > >>> Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues > >>> > >>> Hi Shashank, > >>> > >>> I tried writing a program to experiment with usermode queues and I found > >>> some weird behavior: The first run of the program works as expected, > >>> while subsequent runs don't seem to do anything (and I allocate > >>> everything in GTT, so it should be zero initialized consistently). Is > >>> this a known issue? > >>> > >>> The linked libdrm code for the uapi still does a doorbell ioctl so it > >>> could very well be that I do the doorbell wrong (especially since the > >>> ioctl implementation was never shared AFAICT?), but it seems like the > >>> kernel submissions (i.e. write wptr in dwords to the wptr va and to the > >>> doorbell). Is it possible to update the test in libdrm? > >>> > >>> Code: https://gitlab.freedesktop.org/bnieuwenhuizen/usermode-queue > >>> > >>> Thanks, > >>> Bas > >>> > >>> On Wed, Mar 29, 2023 at 6:05 PM Shashank Sharma <shashank.sha...@amd.com> > >>> wrote: > >>>> This patch series introduces AMDGPU usermode queues for gfx workloads. > >>>> Usermode queues is a method of GPU workload submission into the > >>>> graphics hardware without any interaction with kernel/DRM schedulers. > >>>> In this method, a userspace graphics application can create its own > >>>> workqueue and submit it directly in the GPU HW. > >>>> > >>>> The general idea of how this is supposed to work: > >>>> - The application creates the following GPU objetcs: > >>>> - A queue object to hold the workload packets. > >>>> - A read pointer object. > >>>> - A write pointer object. > >>>> - A doorbell page. > >>>> - The application picks a 32-bit offset in the doorbell page for this > >>>> queue. > >>>> - The application uses the usermode_queue_create IOCTL introduced in > >>>> this patch, by passing the the GPU addresses of these objects (read > >>>> ptr, write ptr, queue base address and 32-bit doorbell offset from > >>>> the doorbell page) > >>>> - The kernel creates the queue and maps it in the HW. > >>>> - The application can start submitting the data in the queue as soon as > >>>> the kernel IOCTL returns. > >>>> - After filling the workload data in the queue, the app must write the > >>>> number of dwords added in the queue into the doorbell offset, and the > >>>> GPU will start fetching the data. > >>>> > >>>> libDRM changes for this series and a sample DRM test program can be > >>>> found in the MESA merge request here: > >>>> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/287 > >>>> > >>>> This patch series depends on the doorbell-manager changes, which are > >>>> being reviewed here: > >>>> https://patchwork.freedesktop.org/series/115802/ > >>>> > >>>> Alex Deucher (1): > >>>> drm/amdgpu: UAPI for user queue management > >>>> > >>>> Arvind Yadav (2): > >>>> drm/amdgpu: add new parameters in v11_struct > >>>> drm/amdgpu: map wptr BO into GART > >>>> > >>>> Shashank Sharma (6): > >>>> drm/amdgpu: add usermode queue base code > >>>> drm/amdgpu: add new IOCTL for usermode queue > >>>> drm/amdgpu: create GFX-gen11 MQD for userqueue > >>>> drm/amdgpu: create context space for usermode queue > >>>> drm/amdgpu: map usermode queue into MES > >>>> drm/amdgpu: generate doorbell index for userqueue > >>>> > >>>> drivers/gpu/drm/amd/amdgpu/Makefile | 3 + > >>>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 10 +- > >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 + > >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 + > >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 298 > >>>> ++++++++++++++++++ .../drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c | > >>>> ++++++++++++++++++ 230 ++++++++++++++ > >>>> .../gpu/drm/amd/include/amdgpu_userqueue.h | 66 ++++ > >>>> drivers/gpu/drm/amd/include/v11_structs.h | 16 +- > >>>> include/uapi/drm/amdgpu_drm.h | 55 ++++ > >>>> 9 files changed, 677 insertions(+), 9 deletions(-) create mode > >>>> 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c > >>>> create mode 100644 > >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c > >>>> create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h > >>>> > >>>> -- > >>>> 2.40.0 > >>>>