On Mon, Apr 10, 2023 at 3:40 PM Sharma, Shashank <shashank.sha...@amd.com> wrote: > > [AMD Official Use Only - General] > > Hello Bas, > > This is not the correct interpretation of the code, the USERQ_IOCTL has both > the OPs (create and destroy), but th euser has to exclusively call it. > > Please see the sample test program in the existing libDRM series > (userq_test.c, it specifically calls amdgpu_free_userq, which does the > destroy_OP > > for the IOCTL.
In the presence of crashes the kernel should always be able to clean this up no? Otherwise there is a resource leak? > > - Shashank > > -----Original Message----- > From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> > Sent: 10 April 2023 11:26 > To: Sharma, Shashank <shashank.sha...@amd.com> > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>; > Koenig, Christian <christian.koe...@amd.com>; Yadav, Arvind > <arvind.ya...@amd.com> > Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues > > Hi Shashank, > > I think I found the issue: I wasn't destroying the user queue in my program > and the kernel doesn't clean up any remaining user queues in the postclose > hook. I think we need something like > https://github.com/BNieuwenhuizen/linux/commit/e90c8d1185da7353c12837973ceddf55ccc85d29 > ? > > While running things multiple times now works, I still have problems doing > multiple submissions from the same queue. Looking forward to the updated > test/sample > > Thanks, > Bas > > On Mon, Apr 10, 2023 at 9:32 AM Sharma, Shashank <shashank.sha...@amd.com> > wrote: > > > > [AMD Official Use Only - General] > > > > Hello Bas, > > Thanks for trying this out. > > > > This could be due to the doorbell as you mentioned, Usermode queue uses > > doorbell manager internally. > > This week, we are planning to publis the latest libDRM sample code which > > uses a doorbell object (instead of the doorbell hack IOCTL), adapting to > > that should fix your problem in my opinion. > > We have tested this full stack (libDRM test + Usermode queue + doorbell > > manager) for 500+ consecutive runs, and it worked well for us. > > > > You can use this integrated kernel stack (1+2) from my gitlab to build > > your kernel: > > https://gitlab.freedesktop.org/contactshashanksharma/userq-amdgpu/-/tr > > ee/integrated-db-and-uq-v3 Please stay tuned for updated libDRM > > changes with doorbell objects. > > > > Regards > > Shashank > > -----Original Message----- > > From: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> > > Sent: 10 April 2023 02:37 > > To: Sharma, Shashank <shashank.sha...@amd.com> > > Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander > > <alexander.deuc...@amd.com>; Kuehling, Felix <felix.kuehl...@amd.com>; > > Koenig, Christian <christian.koe...@amd.com> > > Subject: Re: [PATCH v3 0/9] AMDGPU Usermode queues > > > > Hi Shashank, > > > > I tried writing a program to experiment with usermode queues and I found > > some weird behavior: The first run of the program works as expected, while > > subsequent runs don't seem to do anything (and I allocate everything in > > GTT, so it should be zero initialized consistently). Is this a known issue? > > > > The linked libdrm code for the uapi still does a doorbell ioctl so it could > > very well be that I do the doorbell wrong (especially since the ioctl > > implementation was never shared AFAICT?), but it seems like the kernel > > submissions (i.e. write wptr in dwords to the wptr va and to the doorbell). > > Is it possible to update the test in libdrm? > > > > Code: https://gitlab.freedesktop.org/bnieuwenhuizen/usermode-queue > > > > Thanks, > > Bas > > > > On Wed, Mar 29, 2023 at 6:05 PM Shashank Sharma <shashank.sha...@amd.com> > > wrote: > > > > > > This patch series introduces AMDGPU usermode queues for gfx workloads. > > > Usermode queues is a method of GPU workload submission into the > > > graphics hardware without any interaction with kernel/DRM schedulers. > > > In this method, a userspace graphics application can create its own > > > workqueue and submit it directly in the GPU HW. > > > > > > The general idea of how this is supposed to work: > > > - The application creates the following GPU objetcs: > > > - A queue object to hold the workload packets. > > > - A read pointer object. > > > - A write pointer object. > > > - A doorbell page. > > > - The application picks a 32-bit offset in the doorbell page for this > > > queue. > > > - The application uses the usermode_queue_create IOCTL introduced in > > > this patch, by passing the the GPU addresses of these objects (read > > > ptr, write ptr, queue base address and 32-bit doorbell offset from > > > the doorbell page) > > > - The kernel creates the queue and maps it in the HW. > > > - The application can start submitting the data in the queue as soon as > > > the kernel IOCTL returns. > > > - After filling the workload data in the queue, the app must write the > > > number of dwords added in the queue into the doorbell offset, and the > > > GPU will start fetching the data. > > > > > > libDRM changes for this series and a sample DRM test program can be > > > found in the MESA merge request here: > > > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/287 > > > > > > This patch series depends on the doorbell-manager changes, which are > > > being reviewed here: > > > https://patchwork.freedesktop.org/series/115802/ > > > > > > Alex Deucher (1): > > > drm/amdgpu: UAPI for user queue management > > > > > > Arvind Yadav (2): > > > drm/amdgpu: add new parameters in v11_struct > > > drm/amdgpu: map wptr BO into GART > > > > > > Shashank Sharma (6): > > > drm/amdgpu: add usermode queue base code > > > drm/amdgpu: add new IOCTL for usermode queue > > > drm/amdgpu: create GFX-gen11 MQD for userqueue > > > drm/amdgpu: create context space for usermode queue > > > drm/amdgpu: map usermode queue into MES > > > drm/amdgpu: generate doorbell index for userqueue > > > > > > drivers/gpu/drm/amd/amdgpu/Makefile | 3 + > > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 10 +- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 + > > > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 + > > > drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 298 > > > ++++++++++++++++++ .../drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c | > > > ++++++++++++++++++ 230 ++++++++++++++ > > > .../gpu/drm/amd/include/amdgpu_userqueue.h | 66 ++++ > > > drivers/gpu/drm/amd/include/v11_structs.h | 16 +- > > > include/uapi/drm/amdgpu_drm.h | 55 ++++ > > > 9 files changed, 677 insertions(+), 9 deletions(-) create mode > > > 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c > > > create mode 100644 > > > drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue_gfx_v11.c > > > create mode 100644 drivers/gpu/drm/amd/include/amdgpu_userqueue.h > > > > > > -- > > > 2.40.0 > > >