[PATCH V2 09/10] drm/amdgpu/userq: add helpers to start/stop scheduling

2025-04-11 Thread Alex Deucher
This will be used to stop/start user queue scheduling for example when switching between kernel and user queues when enforce isolation is enabled. v2: use idx Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c

[PATCH 07/10] drm/amdgpu: don't swallow errors in amdgpu_userqueue_resume_all()

2025-04-11 Thread Alex Deucher
Since we loop through the queues, accumulate the errors with |= rather than overwriting them. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd
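
The pattern this commit message describes can be sketched in plain C. This is a minimal illustration under stated assumptions, not the driver code: `fake_queue_resume()`, the failing queue index, and both `resume_all_*` helpers are hypothetical stand-ins. Only the idea of OR-ing per-queue results together comes from the message above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-queue resume call: returns 0 on success
 * and a negative errno-style value on failure. Queue 1 is made to fail. */
static int fake_queue_resume(size_t idx)
{
	return idx == 1 ? -5 : 0;
}

/* Swallowing variant: a later success overwrites an earlier failure. */
static int resume_all_overwrite(size_t nqueues)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nqueues; i++)
		ret = fake_queue_resume(i);
	return ret;
}

/* Accumulating variant: OR the results so any failure stays visible. */
static int resume_all_accumulate(size_t nqueues)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nqueues; i++)
		ret |= fake_queue_resume(i);
	return ret;
}
```

Note that an OR of several different error codes is only guaranteed to be non-zero; it is a "something failed" indicator rather than a specific error value.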

[PATCH 10/10] drm/amdgpu/userq: integrate with enforce isolation

2025-04-11 Thread Alex Deucher
Enforce isolation serializes access to the GFX IP. User queues are isolated in the MES scheduler, but we still need to serialize between kernel queues and user queues. For enforce isolation, group KGD user queues with KFD user queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu

[PATCH 02/10] drm/amdgpu/userq: rework front end call sequence

2025-04-11 Thread Alex Deucher
Split out the queue map from the mqd create call and split out the queue unmap from the mqd destroy call. This separates the queue setup and teardown from the actual enablement in the firmware. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu
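
The call sequence described here (create, then map; unmap, then destroy) can be sketched as a small state model. Everything below is an assumption made for illustration: `struct demo_queue` and the `demo_*` functions are hypothetical and do not correspond to the amdgpu callbacks, whose signatures are not shown in this snippet.

```c
#include <stdbool.h>

/* Hypothetical queue state; field names are illustrative only. */
struct demo_queue {
	bool mqd_created;	/* software-side queue descriptor exists */
	bool mapped;		/* firmware has been told to schedule the queue */
};

/* Setup only: creating the descriptor no longer implies mapping. */
static int demo_mqd_create(struct demo_queue *q)
{
	q->mqd_created = true;
	return 0;
}

/* Enablement is a separate step and requires a created descriptor. */
static int demo_map(struct demo_queue *q)
{
	if (!q->mqd_created)
		return -1;
	q->mapped = true;
	return 0;
}

/* Disablement leaves the descriptor intact so the queue can be remapped. */
static int demo_unmap(struct demo_queue *q)
{
	q->mapped = false;
	return 0;
}

/* Teardown only: refuses to run while the queue is still mapped. */
static int demo_mqd_destroy(struct demo_queue *q)
{
	if (q->mapped)
		return -1;
	q->mqd_created = false;
	return 0;
}
```

Keeping map and unmap as separate steps is what lets a caller unmap and remap a queue without destroying and recreating its descriptor.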

[PATCH 01/10] drm/amdgpu/userq: rename suspend/resume callbacks

2025-04-11 Thread Alex Deucher
Rename to map and unmap to better align with what is happening at the firmware level and remove the extra level of indirection in the MES userq code. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 10 ++-- drivers/gpu/drm/amd/amdgpu

[PATCH 05/10] drm/amdgpu/userq: add suspend and resume helpers

2025-04-11 Thread Alex Deucher
Add helpers to unmap and map user queues on suspend and resume. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 39 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 3 ++ 2 files changed, 42 insertions(+) diff --git

[PATCH 06/10] drm/amdgpu/userq: handle system suspend and resume

2025-04-11 Thread Alex Deucher
Unmap user queues on suspend and map them on resume. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu

[PATCH 08/10] drm/amdgpu/userq: track the xcp_id associated with the queue

2025-04-11 Thread Alex Deucher
Track this to align with KFD for enforce isolation handling. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h index

[PATCH 03/10] drm/amdgpu/userq: move some code around

2025-04-11 Thread Alex Deucher
Move some userq fence handling code into amdgpu_userq_fence.c. This matches the other code in that file. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 26 +++ .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h | 1

[PATCH 04/10] drm/amdgpu/userq: properly clean up userq fence driver on failure

2025-04-11 Thread Alex Deucher
If userq creation fails, we need to properly unwind and free the user queue fence driver. v2: free idr as well (Sunil) Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu

Re: [PATCH 07/13] drm/amdgpu/userq: enable support for queue priorities

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 1:18 PM Khatri, Sunil wrote: > > A small comment otherwise it looks great. > Reviewed-by: Sunil Khatri > > On 4/11/2025 12:23 AM, Alex Deucher wrote: > > Enable users to create queues at different priority levels. > > The highest level

Re: [PATCH v2 2/2] drm/amdgpu: Enhance Cleaner Shader Handling in GFX v9.0 Architecture v2

2025-04-11 Thread Alex Deucher
ferent IP versions, using > PACKET3_RUN_CLEANER_SHADER_9_0 for the versions 9.0.1, 9.1.0, > 9.2.1, 9.2.2, 9.3.0, and 9.4.0, while retaining > PACKET3_RUN_CLEANER_SHADER for version 9.4.2. > > v2: Simplify logic (Alex). > > Cc: Christian König > Cc: Alex Deucher > Signed-off-b

Re: [PATCH 1/2] drm/amdgpu: Add PACKET3_RUN_CLEANER_SHADER_9_0 for Cleaner Shader execution

2025-04-11 Thread Alex Deucher
do not > interfere with subsequent workloads. > > Cc: Christian König > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/soc15d.h | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/d

Re: [PATCH 9/9] drm/amdgpu/userq: integrate with enforce isolation

2025-04-11 Thread Alex Deucher
use they are no longer KFD specific and then the change to add the new function calls for userqs. Alex > > Looks good in general if the above understanding is correct. Some one > with better understanding of isolation should review. > Acked-by: Sunil Khatri > > On 4/10/202

Re: [PATCH 8/9] drm/amdgpu/userq: add helpers to start/stop scheduling

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 1:26 PM Khatri, Sunil wrote: > > > On 4/11/2025 10:22 PM, Alex Deucher wrote: > > On Fri, Apr 11, 2025 at 12:17 PM Khatri, Sunil wrote: > >> > >> On 4/11/2025 7:42 PM, Alex Deucher wrote: > >>> This will be used to stop/start

Re: [PATCH 8/9] drm/amdgpu/userq: add helpers to start/stop scheduling

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 12:17 PM Khatri, Sunil wrote: > > > On 4/11/2025 7:42 PM, Alex Deucher wrote: > > This will be used to stop/start user queue scheduling for > > example when switching between kernel and user queues when > > enforce isolation is enabled. > >

Re: [PATCH 2/2] drm/amdgpu: Enhance Cleaner Shader Handling in GFX v9.0 Architecture

2025-04-11 Thread Alex Deucher
ferent IP versions, using > PACKET3_RUN_CLEANER_SHADER_9_0 for the versions 9.0.1, 9.1.0, > 9.2.1, 9.2.2, 9.3.0, and 9.4.0, while retaining > PACKET3_RUN_CLEANER_SHADER for version 9.4.2. > > Cc: Christian König > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam > Sugges

Re: [v5 2/6] drm/amdgpu: Register the new sdma function pointers for each sdma IP version that needs them

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 4:37 AM jesse.zh...@amd.com wrote: > > From: "jesse.zh...@amd.com" > > Register stop/start/soft_reset queue functions for SDMA IP versions > v4.4.2, v5.0 and v5.2. > > Suggested-by: Alex Deucher > Signed-off-by: Jesse Zhang Might want

Re: [v5 6/6] drm/amdgpu:remove old sdma reset callback mechanism

2025-04-11 Thread Alex Deucher
ngine_reset_funcs` function, which was > used to register the callbacks. >- Removed the `sdma_v4_4_2_engine_reset_funcs` structure, which contained > the pre-reset and post-reset callback functions. > > Signed-off-by: Jesse Zhang > Reviewed-by: Alex Deucher Reviewed-by: Alex Deu

Re: [v5 5/6] drm/amdgpu: Implement SDMA soft reset directly for v5.x

2025-04-11 Thread Alex Deucher
cing the previous call to `amdgpu_dpm_reset_sdma`. > > Suggested-by: Alex Deucher > Signed-off-by: Jesse Zhang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 38 +++- > 1 file changed, 37 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/

Re: [v5 4/6] drm/amdgpu: optimize queue reset and stop logic

2025-04-11 Thread Alex Deucher
" last. > E.g. longest lines first and short lasts. (Christian) > > Signed-off-by: Jesse Zhang > Acked-by: Alex Deucher Might want to split this per IP? Either way: Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 31 --

Re: [v5 3/6] drm/amdgpu: switch amdgpu_sdma_reset_engine to use the new sdma function pointers

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 4:30 AM jesse.zh...@amd.com wrote: > > From: "jesse.zh...@amd.com" > > Replace old callback mechanism with direct calls to stop/start functions. > > Suggested-by: Alex Deucher > Signed-off-by: Jesse Zhang Reviewed-by: Alex Deucher >

Re: [v5 1/6] drm/amdgpu: Add the new sdma function pointers for amdgpu_sdma.h

2025-04-11 Thread Alex Deucher
to use ring pointer > instead of device/instance (Christian) > v3: move stop_queue/start_queue to struct amdgpu_sdma_instance and rename > them. (Alex) > v4: rework the ordering a bit (Alex) > > Suggested-by: Alex Deucher > Signed-off-by: Jesse Zhang Reviewed-by:

[PATCH 6/9] drm/amdgpu/userq: handle system suspend and resume

2025-04-11 Thread Alex Deucher
Unmap user queues on suspend and map them on resume. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu/gfx: replace a comma with a semicolon

2025-04-11 Thread Alex Deucher
Not technically wrong, but I think a semicolon was intended here. Fixes: 6cc6e61788f7 ("drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v3") Cc: Christian König Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +- 1 file changed, 1 inser

[PATCH 2/9] drm/amdgpu/userq: rework front end call sequence

2025-04-11 Thread Alex Deucher
Split out the queue map from the mqd create call and split out the queue unmap from the mqd destroy call. This separates the queue setup and teardown from the actual enablement in the firmware. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu

Re: [PATCH v1 3/3] drm/amdgpu: update the error logging for more information

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 9:05 AM Sunil Khatri wrote: > > add process and pid information in the userqueue error > logging to make it more useful in resolving the error > by logs. > > Sample log: > [ 42.444297] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]] *ERROR* Timed > out waiting for fence

[PATCH V2 4/9] drm/amdgpu/userq: properly clean up userq fence driver on failure

2025-04-11 Thread Alex Deucher
If userq creation fails, we need to properly unwind and free the user queue fence driver. v2: free idr as well (Sunil) Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu

[PATCH 5/9] drm/amdgpu/userq: add suspend and resume helpers

2025-04-11 Thread Alex Deucher
Add helpers to unmap and map user queues on suspend and resume. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 39 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 3 ++ 2 files changed, 42 insertions(+) diff --git a/drivers/gpu/drm/amd

[PATCH 7/9] drm/amdgpu: don't swallow errors in amdgpu_userqueue_resume_all()

2025-04-11 Thread Alex Deucher
Since we loop through the queues, accumulate the errors with |= rather than overwriting them. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c index

[PATCH 9/9] drm/amdgpu/userq: integrate with enforce isolation

2025-04-11 Thread Alex Deucher
Enforce isolation serializes access to the GFX IP. User queues are isolated in the MES scheduler, but we still need to serialize between kernel queues and user queues. For enforce isolation, group KGD user queues with KFD user queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu

[PATCH 8/9] drm/amdgpu/userq: add helpers to start/stop scheduling

2025-04-11 Thread Alex Deucher
This will be used to stop/start user queue scheduling for example when switching between kernel and user queues when enforce isolation is enabled. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 68

[PATCH 1/9] drm/amdgpu/userq: rename suspend/resume callbacks

2025-04-11 Thread Alex Deucher
Rename to map and unmap to better align with what is happening at the firmware level and remove the extra level of indirection in the MES userq code. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 10 ++-- drivers/gpu/drm/amd/amdgpu

[PATCH 3/9] drm/amdgpu/userq: move some code around

2025-04-11 Thread Alex Deucher
Move some userq fence handling code into amdgpu_userq_fence.c. This matches the other code in that file. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 26 +++ .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h | 1

Re: [PATCH 1/2] drm/amdgpu: Use generic hdp flush function

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 8:42 AM Lijo Lazar wrote: > > Except HDP v5.2 all use a common logic for HDP flush. Use a generic > function. HDP v5.2 forces NO_KIQ logic, revisit it later. > > Signed-off-by: Lijo Lazar Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amd

Re: [PATCH] drm/amd/amdgpu: Fix out of bounds warning in amdgpu_hw_ip_info

2025-04-11 Thread Alex Deucher
r_submission. > > The mismatch caused UBSAN to report an array bounds violation since > it was accessing the GFX ring array with SDMA instance indices. > > Fix the commit: a245daf3d7a143fb2df(drm/amdgpu: cleanup HW_IP query). Fixes: 4310acd4464b ("drm/amdgpu: add ring flag for

Re: [PATCH] drm/amdgpu: Clear overflow for SRIOV

2025-04-11 Thread Alex Deucher
On Fri, Apr 11, 2025 at 4:07 AM Emily Deng wrote: > > For VF, it doesn't have the permission to clear overflow, clear the bit > by reset. > > Signed-off-by: Emily Deng > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 10 -- > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 + > drivers/gpu/d

[PATCH] drm/amdgpu: fix no_user_submission check for SDMA

2025-04-11 Thread Alex Deucher
Copy paste typo. Use the flag from the sdma structure. Fixes: 4310acd4464b ("drm/amdgpu: add ring flag for no user submissions") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/g

Re: [PATCH 2/2] drm/amdgpu: Use the right function for hdp flush

2025-04-11 Thread Alex Deucher
off-by: Lijo Lazar Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 8 > drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 12 ++-- > drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 6 +++--- > drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++-- > dr

Re: [v4 1/7] drm/amd/amdgpu: Simplify SDMA reset mechanism by removing dynamic callbacks

2025-04-10 Thread Alex Deucher
On Tue, Apr 8, 2025 at 4:47 AM jesse.zh...@amd.com wrote: > > Since KFD no longer registers its own callbacks for SDMA resets, and only KGD > uses the reset mechanism, > we can simplify the SDMA reset flow by directly calling the ring's > `stop_queue` and `start_queue` functions. > This patch re

Re: [PATCH v2] drm/amdgpu: make mes_userq_unmap as int from void

2025-04-10 Thread Alex Deucher
On Wed, Apr 2, 2025 at 8:11 AM Sunil Khatri wrote: > > mes_userq_unmap could fail due to MES fw being unable to > unmap the queue, and the return value is not > to be ignored but handled at the first step itself. > > Also queue->queue_active set to false in this function > but only when the queue is re

Re: [PATCH] drm/amdgpu: fix warning of drm_mm_clean

2025-04-10 Thread Alex Deucher
Guo Yin > > At least from my point that patch seems to make a lot of sense, so feel free > to add Reviewed-by: Christian König . > > But I would at least give Alex a chance to take a look and double check. Looks correct. Acked-by: Alex Deucher > > Regards, > Christia

Re: [v3 7/7] drm/amd/amdgpu: Remove deprecated SDMA reset callback mechanism

2025-04-10 Thread Alex Deucher
ngine_reset_funcs` function, which was > used to register the callbacks. >- Removed the `sdma_v4_4_2_engine_reset_funcs` structure, which contained > the pre-reset and post-reset callback functions. > > Signed-off-by: Jesse Zhang Reviewed-by: Alex Deucher > --- > driver

Re: [v3 1/7] drm/amd/amdgpu: Simplify SDMA reset mechanism by removing dynamic callbacks

2025-04-10 Thread Alex Deucher
On Wed, Apr 2, 2025 at 5:28 AM jesse.zh...@amd.com wrote: > > Since KFD no longer registers its own callbacks for SDMA resets, and only KGD > uses the reset mechanism, > we can simplify the SDMA reset flow by directly calling the ring's > `stop_queue` and `start_queue` functions. > This patch re

[PATCH] drm/amd/display/dml2: use vzalloc rather than kzalloc

2025-04-10 Thread Alex Deucher
The structures are large and they do not require contiguous memory so use vzalloc. Fixes: 70839da63605 ("drm/amd/display: Add new DCN401 sources") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4126 Cc: Aurabindo Pillai Signed-off-by: Alex Deucher --- .../gpu/drm/amd/displ

Re: [PATCH] drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5.2/11.5.3 GPUs

2025-04-10 Thread Alex Deucher
earing GPU > memory between processes and maintains a consistent GPU state across KGD > and KFD workloads. > > Cc: Mario Sopena-Novales > Cc: Christian König > Cc: Alex Deucher > Signed-off-by: Srinivasan Shanmugam Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/a

[PATCH 09/13] drm/amdgpu: add tmz queue parameter to mqd props

2025-04-10 Thread Alex Deucher
Use this to track whether we want TMZ for queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index b818ad63dc84f..364a65524cfdb 100644

[PATCH 11/13] drm/amdgpu/gfx12: add support for TMZ queues to mqd_init

2025-04-10 Thread Alex Deucher
Set up TMZ for queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c index 2474006b1a340..da67f27d65a33 100644 --- a/drivers/gpu/drm

[PATCH 10/13] drm/amdgpu/gfx11: add support for TMZ queues to mqd_init

2025-04-10 Thread Alex Deucher
Set up TMZ for queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index 91d29f482c3ca..b204d0e6e816d 100644 --- a/drivers/gpu/drm

[PATCH 05/13] drm/amdgpu/user: add priority to user queue structure

2025-04-10 Thread Alex Deucher
So we can track this when we create user queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h index fd0542f60433b

[PATCH 12/13] drm/amdgpu/userq/mes: pass the secure flag to mqd init

2025-04-10 Thread Alex Deucher
So that we initialize the MQD as a secure queue. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c index f406a9a29bda5

[PATCH 13/13] drm/amdgpu/userq: enable support for secure queues

2025-04-10 Thread Alex Deucher
Enable users to create secure GFX/compute queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu

[PATCH 07/13] drm/amdgpu/userq: enable support for queue priorities

2025-04-10 Thread Alex Deucher
Enable users to create queues at different priority levels. The highest level is restricted to drm master. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 26 ++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd

[PATCH 06/13] drm/amdgpu/userq/mes: handle user queue priority

2025-04-10 Thread Alex Deucher
Handle the queue priority set by the user. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c

[PATCH 08/13] drm/amdgpu/userq: add UAPI for setting up secure queues

2025-04-10 Thread Alex Deucher
If the queue needs to access TMZ surfaces, it must be set up as secure. Signed-off-by: Alex Deucher --- include/uapi/drm/amdgpu_drm.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index 8719754c777b4..0ca4b3b961eb3 100644

[PATCH 04/13] drm/amdgpu/mes12: add conversion for priority levels

2025-04-10 Thread Alex Deucher
Convert driver priority levels to MES11 priority levels. At the moment they are the same, but they may not always be. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm

[PATCH 01/13] drm/amdgpu: convert userq UAPI _pad to flags

2025-04-10 Thread Alex Deucher
Reuse the _pad field for flags. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 4 ++-- include/uapi/drm/amdgpu_drm.h | 5 - 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b

[PATCH 03/13] drm/amdgpu/mes11: add conversion for priority levels

2025-04-10 Thread Alex Deucher
Convert driver priority levels to MES11 priority levels. At the moment they are the same, but they may not always be. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm
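
An explicit conversion of this kind can be sketched as a `switch` between two enums. The enum names, their values, and `demo_convert_priority()` below are all hypothetical and chosen for illustration; the real driver and MES definitions are not shown in this snippet. The point is only that a named mapping keeps working if the two numbering schemes ever diverge, which a direct cast would not.

```c
/* Hypothetical driver-side priority levels (illustrative names). */
enum demo_drv_priority {
	DEMO_DRV_PRIO_LOW,
	DEMO_DRV_PRIO_NORMAL,
	DEMO_DRV_PRIO_MEDIUM,
	DEMO_DRV_PRIO_HIGH,
};

/* Hypothetical firmware-side levels; currently the same numbering. */
enum demo_fw_priority {
	DEMO_FW_PRIO_LOW,
	DEMO_FW_PRIO_NORMAL,
	DEMO_FW_PRIO_MEDIUM,
	DEMO_FW_PRIO_HIGH,
};

/* Map each driver level to a firmware level by name, not by value. */
static enum demo_fw_priority demo_convert_priority(enum demo_drv_priority prio)
{
	switch (prio) {
	case DEMO_DRV_PRIO_LOW:
		return DEMO_FW_PRIO_LOW;
	case DEMO_DRV_PRIO_NORMAL:
		return DEMO_FW_PRIO_NORMAL;
	case DEMO_DRV_PRIO_MEDIUM:
		return DEMO_FW_PRIO_MEDIUM;
	case DEMO_DRV_PRIO_HIGH:
		return DEMO_FW_PRIO_HIGH;
	default:
		/* Unknown input falls back to normal in this sketch. */
		return DEMO_FW_PRIO_NORMAL;
	}
}
```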

[PATCH 02/13] drm/amdgpu/userq: add UAPI for setting queue priority

2025-04-10 Thread Alex Deucher
compositors) (maps to MES AMD_PRIORITY_LEVEL_HIGH) Signed-off-by: Alex Deucher --- include/uapi/drm/amdgpu_drm.h | 8 1 file changed, 8 insertions(+) diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index 1a451907184cc..8719754c777b4 100644 --- a/include/uapi/drm

[PATCH 3/9] drm/amdgpu/userq: move some code around

2025-04-10 Thread Alex Deucher
Move some userq fence handling code into amdgpu_userq_fence.c. This matches the other code in that file. Signed-off-by: Alex Deucher --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 26 +++ .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.h | 1 + drivers/gpu/drm/amd/amdgpu

[PATCH 9/9] drm/amdgpu/userq: integrate with enforce isolation

2025-04-10 Thread Alex Deucher
Enforce isolation serializes access to the GFX IP. User queues are isolated in the MES scheduler, but we still need to serialize between kernel queues and user queues. For enforce isolation, group KGD user queues with KFD user queues. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu

[PATCH 1/9] drm/amdgpu/userq: rename suspend/resume callbacks

2025-04-10 Thread Alex Deucher
Rename to map and unmap to better align with what is happening at the firmware level and remove the extra level of indirection in the MES userq code. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 10 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 8

[PATCH] drm/amdgpu/userq/mes: remove unused header

2025-04-10 Thread Alex Deucher
This is unused so remove it. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c index abd32415d7343..e3c3fc160b799 100644 --- a

[PATCH 5/9] drm/amdgpu/userq: add suspend and resume helpers

2025-04-10 Thread Alex Deucher
Add helpers to unmap and map user queues on suspend and resume. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 39 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 3 ++ 2 files changed, 42 insertions(+) diff --git a/drivers/gpu/drm/amd

[PATCH 4/9] drm/amdgpu/userq: properly clean up userq fence driver on failure

2025-04-10 Thread Alex Deucher
If userq creation fails, we need to properly unwind and free the user queue fence driver. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd

[PATCH 2/9] drm/amdgpu/userq: rework front end call sequence

2025-04-10 Thread Alex Deucher
Split out the queue map from the mqd create call and split out the queue unmap from the mqd destroy call. This separates the queue setup and teardown from the actual enablement in the firmware. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 17

[PATCH 8/9] drm/amdgpu/userq: add helpers to start/stop scheduling

2025-04-10 Thread Alex Deucher
This will be used to stop/start user queue scheduling for example when switching between kernel and user queues when enforce isolation is enabled. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 66

[PATCH 7/9] drm/amdgpu: don't swallow errors in amdgpu_userqueue_resume_all()

2025-04-10 Thread Alex Deucher
Since we loop through the queues, accumulate the errors with |= rather than overwriting them. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c index

[PATCH 6/9] drm/amdgpu/userq: handle system suspend and resume

2025-04-10 Thread Alex Deucher
Unmap user queues on suspend and map them on resume. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu

Re: [PATCH] drm/amdgpu: still cleanup sid.h

2025-04-10 Thread Alex Deucher
Applied. Thanks! Alex On Mon, Apr 7, 2025 at 1:52 AM Alexandre Demers wrote: > > The defines, shifts and masks are already available in dce_6_0_d.h, > dce_6_0_sh_mask.h. > > Signed-off-by: Alexandre Demers > --- > drivers/gpu/drm/amd/amdgpu/si.c | 26 +- > drivers/gpu

Re: [PATCH 0/6] Introduce a generic function to get the CSB buffer

2025-04-10 Thread Alex Deucher
On Mon, Apr 7, 2025 at 4:15 PM Rodrigo Siqueira wrote: > > On 04/07, Alex Deucher wrote: > > On Sun, Apr 6, 2025 at 7:07 PM Rodrigo Siqueira wrote: > > > > > > This patchset was inspired and made on top of the below series: > > > > > > https://

Re: [PATCH] Documentation/amdgpu: Add Ryzen AI 350 series processors

2025-04-10 Thread Alex Deucher
On Thu, Apr 10, 2025 at 1:33 PM Mario Limonciello wrote: > > These have been announced so add them to the table. > > Link: > https://www.amd.com/en/products/processors/laptop/ryzen/ai-300-series/amd-ryzen-ai-7-350.html > Signed-off-by: Mario Limonciello Acke

Re: [PATCH 1/2 v3] drm/amdgpu: Add fw minimum version check for usermode queue

2025-04-10 Thread Alex Deucher
ons directly. > v3: Firmware version checks only for Navi3x(by Alex). > > Cc: Alex Deucher > Cc: Christian Koenig > Cc: Shashank Sharma > Cc: Sunil Khatri > Signed-off-by: Arvind Yadav Acked-by: Alex Deucher For some reason I haven't gotten any of the 2/2 patches for a

Re: [PATCH 1/2 v2] drm/amdgpu: Add fw minimum version check for usermode queue

2025-04-10 Thread Alex Deucher
On Thu, Apr 10, 2025 at 11:56 AM Yadav, Arvind wrote: > > > On 4/10/2025 8:50 PM, Alex Deucher wrote: > > On Thu, Apr 10, 2025 at 10:57 AM Arvind Yadav wrote: > >> This patch is load usermode queue based on FW support for gfx11. > >> CP Ucode FW version: [PFP =

Re: [PATCH 1/2 v2] drm/amdgpu: Add fw minimum version check for usermode queue

2025-04-10 Thread Alex Deucher
s directly. > > Cc: Alex Deucher > Cc: Christian Koenig > Cc: Shashank Sharma > Cc: Sunil Khatri > Signed-off-by: Arvind Yadav > --- > drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 10 -- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/driv

Re: [PATCH 2/2] drm/amdgpu: Add fw minimum version check for usermode queue

2025-04-10 Thread Alex Deucher
On Thu, Apr 10, 2025 at 7:48 AM Arvind Yadav wrote: > > This patch is load usermode queue based on FW support for gfx12. > CP Ucode FW Version: [PFP = 2840, ME = 2780, MEC = 2600, MES = 123] > > Cc: Alex Deucher > Cc: Christian Koenig > Cc: Shashank Sharma > Cc: Sunil

Re: [v3 3/7] drm/amdgpu: Optimize SDMA v5.0 queue reset and stop logic

2025-04-09 Thread Alex Deucher
ops and checks by directly using the `ring->me` field > to identify the SDMA instance. >- Reused the `sdma_v5_0_gfx_stop` function to stop the queue, reducing code > duplication. > > Signed-off-by: Jesse Zhang Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/am

Re: [PATCH] drm/amd/display: Add htmldocs description for fused_io interface

2025-04-09 Thread Alex Deucher
Acked-by: Alex Deucher On Wed, Apr 9, 2025 at 1:06 PM wrote: > > From: Roman Li > > [Why] > htmldocs build warning: "Function parameter or struct member 'fused_io' > not described in 'amdgpu_display_manager'". > > [How] > Add missing des

[pull] amdgpu, amdkfd drm-fixes-6.15

2025-04-09 Thread Alex Deucher
possible division by 0 in fan handling amdkfd: - Queue reset fixes Alex Deucher (5): drm/amdgpu/mes11: optimize MES pipe FW version fetching drm/amdgpu/pm: add workload profile pause helper drm/amdgpu/pm/swsmu: implement

Re: [PATCH 3/3] drm/amdgpu: adjust enforce_isolation handling

2025-04-09 Thread Alex Deucher
On Wed, Apr 9, 2025 at 10:36 AM SRINIVASAN SHANMUGAM wrote: > > > On 4/8/2025 9:30 PM, Alex Deucher wrote: > > Switch from a bool to an enum and allow more options > > for enforce isolation. There are now 3 modes of operation: > > - Disabled (0) > > - Enabled

Re: [PATCH 1/2] drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v3

2025-04-08 Thread Alex Deucher
On Tue, Apr 8, 2025 at 11:30 AM Christian König wrote: > > Otherwise triggering sysfs multiple times without other submissions in > between only runs the shader once. > > v2: add some comment > v3: re-add missing cast > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_g

Re: [PATCH] drm/amd: Forbid suspending into non-default suspend states

2025-04-08 Thread Alex Deucher
.org/drm/amd/-/issues/4093 > Signed-off-by: Mario Limonciello Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 14 +- > 2 files changed, 14 insertions(+), 1 deletion(-) > > diff --

Re: [PATCH] drm/amdgpu: remove the duplicated mes queue active state setting

2025-04-08 Thread Alex Deucher
On Fri, Mar 28, 2025 at 7:52 AM Prike Liang wrote: > > The MES queue deactivation and active status are already set in > mes_userq_unmap|map(), so the caller needn't set the queue_active > bit again. > > Signed-off-by: Prike Liang Acked-by: Alex Deucher > --- >

[PATCH 1/3] drm/amdgpu/mes11: use the device value for enforce isolation

2025-04-08 Thread Alex Deucher
Use the local setting rather than the global parameter. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 06b51867c9aac

[PATCH 3/3] drm/amdgpu: adjust enforce_isolation handling

2025-04-08 Thread Alex Deucher
. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 11 +- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 16 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 22 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 12 +++--- drivers/gpu/drm/amd

[PATCH 2/3] drm/amdgpu/mes12: use the device value for enforce isolation

2025-04-08 Thread Alex Deucher
Use the local setting rather than the global parameter. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c index 8892858cfd9ae

Re: [PATCH] drm/amd/pm/smu11: Prevent division by zero

2025-04-08 Thread Alex Deucher
Oh, sorry, I've picked it up now. Thanks! Alex On Tue, Apr 8, 2025 at 4:16 AM Denis Arefev wrote: > > > --- > > drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > > b/dr

Re: [v3 5/7] drm/amdgpu: Optimize SDMA v5.2 queue reset and stop logic

2025-04-08 Thread Alex Deucher
- Removed redundant loops and checks by directly using the `ring->me` > field > to identify the SDMA instance. > - Reused the `sdma_v5_2_gfx_stop` function to stop the queue, > reducing code > duplication. > > Signed-off-by: Jesse Zhang Ac

Re: [v3 2/7] drm/amd/amdgpu: Implement SDMA soft reset directly for sdma v5

2025-04-07 Thread Alex Deucher
tly manipulates the `GRBM_SOFT_RESET` > register to reset the specified SDMA instance. > > 2. **Integration into `amdgpu_sdma_reset_engine`**: >- The `amdgpu_sdma_soft_reset` function is called during the SDMA reset > process, replacing the previous call to `amdgpu_dpm_reset_sdm

Re: [PATCH 2/2] drm/amdgpu/dma_buf: fix page_link check

2025-04-07 Thread Alex Deucher
Applied. Thanks! On Mon, Apr 7, 2025 at 10:42 AM Christian König wrote: > > Am 07.04.25 um 16:18 schrieb Matthew Auld: > > The page_link lower bits of the first sg could contain something like > > SG_END, if we are mapping a single VRAM page or contiguous blob which > > fits into one sg entry. R

Re: [PATCH 0/2] drm/amdgpu: typos and standardization

2025-04-07 Thread Alex Deucher
Applied. Thanks! On Fri, Apr 4, 2025 at 1:48 AM Alexandre Demers wrote: > > Typos were found in DCE, where hpd should have been used. > > DCE6/8: standardize the "interrupt" vs "irq" usage in function names > with DCE10/11. > > Alexandre Demers (2): > drm/amdgpu: fix typos in DCEs > drm/amdg

Re: [PATCH 0/2] drm/amdgpu: better complete DCE6 and GMC6

2025-04-07 Thread Alex Deucher
Applied. Thanks! On Fri, Apr 4, 2025 at 1:43 AM Alexandre Demers wrote: > > First patch moves some DCE files around so they are distributed as are > other DCE files > > Second patch implements gmc_v6_0_set_clockgating_state(), which was mostly > there, but commented out. A few tweaks were needed

Re: [PATCH 0/6] Introduce a generic function to get the CSB buffer

2025-04-07 Thread Alex Deucher
On Sun, Apr 6, 2025 at 7:07 PM Rodrigo Siqueira wrote: > > This patchset was inspired and made on top of the below series: > > https://lore.kernel.org/amd-gfx/20250319162225.3775315-1-alexander.deuc...@amd.com/ > > In the above series, there is a bug fix in many functions named > gfx_vX_0_get_csb_

Re: [PATCH 1/2] Documentation: update KIQ documentation

2025-04-07 Thread Alex Deucher
Ping on this series? Alex On Wed, Mar 26, 2025 at 1:52 PM Alex Deucher wrote: > > KIQ is replaced with MES on GFX 11 and newer. > > Signed-off-by: Alex Deucher > --- > Documentation/gpu/amdgpu/driver-core.rst | 3 ++- > 1 file changed, 2 insertions(+), 1 deletio

Re: [PATCH 2/2] drm/amdgpu/mes12: optimize MES pipe FW version fetching

2025-04-07 Thread Alex Deucher
Ping? Alex On Fri, Mar 28, 2025 at 9:09 AM Alex Deucher wrote: > > Don't fetch it again if we already have it. It seems the > registers don't reliably have the value at resume in some > cases. > > Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block sup

Re: [PATCH 1/5] drm/amdgpu/gfx9: dump full CP packet header FIFOs

2025-04-07 Thread Alex Deucher
On Mon, Apr 7, 2025 at 9:27 AM Khatri, Sunil wrote: > > > On 4/7/2025 6:26 PM, Alex Deucher wrote: > > On Mon, Apr 7, 2025 at 6:14 AM Khatri, Sunil wrote: > > On 3/25/2025 1:18 AM, Alex Deucher wrote: > > ping on this series? > > Alex > > On Thu, Mar 20,

Re: [PATCH 1/5] drm/amdgpu/gfx9: dump full CP packet header FIFOs

2025-04-07 Thread Alex Deucher
On Mon, Apr 7, 2025 at 6:14 AM Khatri, Sunil wrote: > > > On 3/25/2025 1:18 AM, Alex Deucher wrote: > > ping on this series? > > Alex > > On Thu, Mar 20, 2025 at 12:57 PM Alex Deucher > wrote: > > In dev core dump, dump the full header fifo for > each que
