[PATCH] drm/amd/pm: Use cached data for min/max clocks

2025-07-11 Thread Lijo Lazar
If dpm tables are already populated on SMU v13.0.6 SOCs, use the cached data. Otherwise, fetch values from firmware. Signed-off-by: Lijo Lazar --- .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 38 +-- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/d

[PATCH 32/33] drm/amdgpu/vcn2.5: implement ring reset

2025-07-11 Thread Alex Deucher
Use the new helpers to handle engine resets for VCN. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 25 + 1 file changed, 25 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c

[PATCH 31/33] drm/amdgpu/vcn2: implement ring reset

2025-07-11 Thread Alex Deucher
Use the new helpers to handle engine resets for VCN. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c | 26 ++ 1 file changed, 26 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c

[PATCH 26/33] drm/amdgpu/vcn4: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_

[PATCH 33/33] drm/amdgpu/vcn3: implement ring reset

2025-07-11 Thread Alex Deucher
Use the new helpers to handle engine resets for VCN. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.

[PATCH 30/33] drm/amdgpu/vcn: add a helper framework for engine resets

2025-07-11 Thread Alex Deucher
With engine resets we reset all queues on the engine rather than just a single queue. Add a framework to handle this similar to SDMA. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 79 + d

[PATCH 29/33] drm/amdgpu/vcn5: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v

[PATCH 27/33] drm/amdgpu/vcn4.0.3: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c b/drivers/gpu/drm/

[PATCH 23/33] drm/amdgpu/jpeg4.0.5: add queue reset

2025-07-11 Thread Alex Deucher
Add queue reset support for jpeg 4.0.5. Use the new helpers to re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_5.c | 23 ++- 1 file changed, 2

[PATCH 25/33] drm/amdgpu/jpeg5.0.1: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c b/drivers/gpu/d

[PATCH 28/33] drm/amdgpu/vcn4.0.5: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v

[PATCH 24/33] drm/amdgpu/jpeg5: add queue reset

2025-07-11 Thread Alex Deucher
Add queue reset support for jpeg 5.0.0. Use the new helpers to re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_0.c | 23 ++- 1 file changed, 2

[PATCH 20/33] drm/amdgpu/jpeg3: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v3_

[PATCH 21/33] drm/amdgpu/jpeg4: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_

[PATCH 19/33] drm/amdgpu/jpeg2.5: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_

[PATCH 18/33] drm/amdgpu/jpeg2: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Tested-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v2_

[PATCH 17/33] drm/amdgpu/sdma7: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c b/drivers/gpu/drm/amd/amdgp

[PATCH 16/33] drm/amdgpu/sdma6: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c b/drivers/gpu/drm/amd/amdgp

[PATCH 11/33] drm/amdgpu/gfx10: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Drop the soft_recovery callbacks as the queue reset replaces it. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 35 +++--- 1 file changed, 4 insertions(+), 31 deletion

[PATCH 15/33] drm/amdgpu/sdma5.2: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c b/drivers/gpu/drm/amd/amdg

[PATCH 12/33] drm/amdgpu/gfx11: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Drop the soft_recovery callbacks as the queue reset replaces it. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 35 +++--- 1 file changed, 4 insertions(+), 31 deletion

[PATCH 22/33] drm/amdgpu/jpeg4.0.3: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Sathishkumar S Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c b/drivers/gpu/d

[PATCH 08/33] drm/amdgpu: track ring state associated with a fence

2025-07-11 Thread Alex Deucher
We need to know the wptr and sequence number associated with a fence so that we can re-emit the unprocessed state after a ring reset. Pre-allocate storage space for the ring buffer contents and add helpers to save off and re-emit the unprocessed state so that it can be re-emitted after the queue i

[PATCH 13/33] drm/amdgpu/gfx12: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Drop the soft_recovery callbacks as the queue reset replaces it. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 35 +++--- 1 file changed, 4 insertions(+), 31 deletion

[PATCH 09/33] drm/amdgpu/gfx9: re-emit unprocessed state on kcq reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amd

[PATCH 14/33] drm/amdgpu/sdma5: re-emit unprocessed state on ring reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Reviewed-by: Jesse Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c b/drivers/gpu/drm/amd/amdg

[PATCH 05/33] drm/amdgpu/vcn: don't enable per queue resets on SR-IOV

2025-07-11 Thread Alex Deucher
Power control is only available in bare metal. SR-IOV will need a different method. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c | 3 ++- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c | 3 ++- drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c | 3 ++- 3 files changed, 6 insertions(+)

[PATCH 10/33] drm/amdgpu/gfx9.4.3: re-emit unprocessed state on kcq reset

2025-07-11 Thread Alex Deucher
Re-emit the unprocessed state after resetting the queue. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c index e

[PATCH 07/33] drm/amdgpu: clean up GC reset functions

2025-07-11 Thread Alex Deucher
Make them consistent and use the reset flags. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 14 +- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 19 --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 18 +++--- drivers/gpu/drm/amd/amdgpu

[PATCH 04/33] drm/amdgpu/jpeg4: add additional ring reset error checking

2025-07-11 Thread Alex Deucher
Start and stop can fail, so add checks. Fixes: 74894ffc7d0c ("drm/amdgpu: Add ring reset callback for JPEG4_0_0") Signed-off-by: Alex Deucher Cc: Sathishkumar S --- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm

[PATCH 03/33] drm/amdgpu/jpeg3: add additional ring reset error checking

2025-07-11 Thread Alex Deucher
Start and stop can fail, so add checks. Fixes: 03399d0bff25 ("drm/amdgpu: Add ring reset callback for JPEG3_0_0") Signed-off-by: Alex Deucher Cc: Sathishkumar S --- drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm

[PATCH 06/33] drm/amdgpu: clean up jpeg reset functions

2025-07-11 Thread Alex Deucher
Make them consistent and use the reset flags. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 6 +- drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 6 +- drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 6 +- drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 7 --- 4 files chang

[PATCH 02/33] drm/amdgpu/jpeg2: add additional ring reset error checking

2025-07-11 Thread Alex Deucher
Start and stop can fail, so add checks. Fixes: 500c04d2a708 ("drm/amdgpu: Add ring reset callback for JPEG2_0_0") Signed-off-by: Alex Deucher Cc: Sathishkumar S --- drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm

[PATCH V15 00/33] Reset improvements

2025-07-11 Thread Alex Deucher
This set improves per queue reset support for a number of IPs. When we reset the queue, the queue is lost so we need to re-emit the unprocessed state from subsequent submissions. This is handled in gfx/compute queues via switch buffer and pipeline sync packets. However, you can still end up with p

[PATCH 01/33] drm/amdgpu: clean up sdma reset functions

2025-07-11 Thread Alex Deucher
Make them consistent and drop unneeded extra variables. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 14 +++--- drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 17 + drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 20 drivers/gpu/drm/amd

Re: [PATCH] drm/amdgpu: Add atomic CPU-GPU clock counter correlation

2025-07-11 Thread Alex Deucher
On Fri, Jul 11, 2025 at 5:13 AM Jesse Zhang wrote: > > This patch introduces a new IOCTL to provide tightly correlated > CPU and GPU timestamps for accurate performance measurements > and synchronization between host and device timelines. > > Key improvements: > 1. Adds AMDGPU_INFO_CLOCK_COUNTERS

[PATCH 3/3] drm/amdgpu/gfx12: set MQD as appriopriate for queue priv

2025-07-11 Thread Alex Deucher
Set the MQD as appropriate for the queue priv state. Acked-by: Christian König Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c b/drivers/gpu/drm/amd/amdgpu/

[PATCH 1/3] drm/amdgpu: track queue privilege in amdgpu_mqd_prop

2025-07-11 Thread Alex Deucher
Used to track the privilege level of the queue. Acked-by: Christian König Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/driver

[PATCH 2/3] drm/amdgpu/gfx11: set MQD as appriopriate for queue priv

2025-07-11 Thread Alex Deucher
Set the MQD as appropriate for the queue priv state. Acked-by: Christian König Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/

Re: [PATCH v2] drm/amdkfd: enable kfd on LoongArch systems

2025-07-11 Thread Felix Kuehling
On 2025-07-09 02:51, Han Gao wrote: KFD has been confirmed to run on LoongArch systems. It's necessary to support CONFIG_HSA_AMD on LoongArch. Signed-off-by: Han Gao Thank you. I'm applying this patch to amd-staging-drm-next. Reviewed-by: Felix Kuehling --- Changes in v2: Add

Re: [PATCH 5/6] drm/amdgpu: add support for cyan skillfish gpu_info

2025-07-11 Thread Alex Deucher
On Fri, Jul 11, 2025 at 6:38 AM Yu, Lang wrote: > > [Public] > > >-Original Message- > >From: amd-gfx On Behalf Of Alex > >Deucher > >Sent: Friday, June 27, 2025 10:34 PM > >To: amd-gfx@lists.freedesktop.org > >Cc: Deucher, Alexander > >Subject: [PATCH 5/6] drm/amdgpu: add support for c

Re: [PATCH] drm/amdgpu: Fix missing unlocking in an error path in amdgpu_userq_create()

2025-07-11 Thread Alex Deucher
On Wed, Jul 9, 2025 at 3:28 PM Christophe JAILLET wrote: > > If kasprintf() fails, some mutex still need to be released to avoid locking > issue, as already done in all other error handling path. > > Fixes: c03ea34cbf88 ("drm/amdgpu: add support of debugfs for mqd information") > Signed-off-by: Ch

[pull] amdgpu, amdkfd drm-next-6.17

2025-07-11 Thread Alex Deucher
Hi Dave, Simona, A few more bits for 6.17. The following changes since commit 17d081ef84a6f3c2a1867cc753d7c8459a34d829: Merge tag 'drm-misc-next-2025-07-03' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next (2025-07-04 11:54:31 +1000) are available in the Git repository at:

Re: [PATCH 06/18] drm/amd/display: limit clear_update_flags to dcn32 and above

2025-07-11 Thread Mario Limonciello
On 7/10/2025 4:25 PM, ivan.lip...@amd.com wrote: From: Charlene Liu [why] dc has some code out of sync: dc_commit_updates_for_stream handles v1/v2/v3, but dc_update_planes_and_stream makes v1 asic to use v2. as a regression fix: limit clear_update_flags to dcn32 or newer asic. regression ne

Re: [PATCH v2] drm/sched: Avoid double re-lock on the job free path

2025-07-11 Thread Danilo Krummrich
On 7/11/25 9:08 PM, Maíra Canal wrote: Hi Tvrtko, On 11/07/25 12:09, Tvrtko Ursulin wrote: Currently the job free work item will lock sched->job_list_lock first time to see if there are any jobs, free a single job, and then lock again to decide whether to re-queue itself if there are more finis

Re: [PATCH v2] drm/sched: Avoid double re-lock on the job free path

2025-07-11 Thread Maíra Canal
Hi Tvrtko, On 11/07/25 12:09, Tvrtko Ursulin wrote: Currently the job free work item will lock sched->job_list_lock first time to see if there are any jobs, free a single job, and then lock again to decide whether to re-queue itself if there are more finished jobs. Since drm_sched_get_finished_

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Alex Deucher
On Fri, Jul 11, 2025 at 12:07 PM Tvrtko Ursulin wrote: > > > On 11/07/2025 16:51, Alex Deucher wrote: > > On Fri, Jul 11, 2025 at 9:58 AM Tvrtko Ursulin > > wrote: > >> > >> > >> On 11/07/2025 14:39, Alex Deucher wrote: > >>> On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin > >>> wrote: > > >

Re: [PATCH V10 33/46] drm: Add Enhanced LUT precision structure

2025-07-11 Thread Alex Hung
On 7/8/25 11:10, Simon Ser wrote: On Tuesday, June 17th, 2025 at 06:26, Alex Hung wrote: diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h index 651bdf48b766..21bd96f437e0 100644 --- a/include/uapi/drm/drm_mode.h +++ b/include/uapi/drm/drm_mode.h @@ -872,6 +872,16 @@ st

Re: [PATCH V10 33/46] drm: Add Enhanced LUT precision structure

2025-07-11 Thread Alex Hung
On 7/9/25 14:49, Borah, Chaitanya Kumar wrote: Hi Alex, -Original Message- From: Alex Hung Sent: Tuesday, June 17, 2025 9:47 AM To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org Cc: wayland-de...@lists.freedesktop.org; harry.wentl...@amd.com; alex.h...@amd.com; leo.

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Tvrtko Ursulin
On 11/07/2025 16:51, Alex Deucher wrote: On Fri, Jul 11, 2025 at 9:58 AM Tvrtko Ursulin wrote: On 11/07/2025 14:39, Alex Deucher wrote: On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin wrote: On 11/07/2025 13:45, Christian König wrote: On 11.07.25 14:23, Tvrtko Ursulin wrote: Commit 94

Re: [PATCH v5 00/14] drm/amd/display: more drm_edid to AMD display driver

2025-07-11 Thread Alex Hung
Thanks Melissa. I will send this series to promotion test and post the result by the end of next week. On 6/18/25 11:19, Melissa Wen wrote: Hi, Siqueira and I have been working on a solution to reduce the usage of drm_edid_raw in the AMD display driver, since the current guideline in the DRM s

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Alex Deucher
On Fri, Jul 11, 2025 at 9:58 AM Tvrtko Ursulin wrote: > > > On 11/07/2025 14:39, Alex Deucher wrote: > > On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin > > wrote: > >> > >> > >> On 11/07/2025 13:45, Christian König wrote: > >>> On 11.07.25 14:23, Tvrtko Ursulin wrote: > Commit > 94b1e02

Re: WARNING: possible circular locking dependency detected: drm_client_dev_suspend() & radeon_suspend_kms()

2025-07-11 Thread Thomas Zimmermann
Hi Am 11.07.25 um 16:46 schrieb Ville Syrjälä: On Fri, Jul 11, 2025 at 11:08:03AM +0200, Simona Vetter wrote: On Thu, Jul 10, 2025 at 04:43:02PM -0700, Jeff Johnson wrote: I'm trying to debug a hibernation issue with the ath12k driver, but to establish a baseline I started with Linus' current

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Christian König
On 11.07.25 15:58, Tvrtko Ursulin wrote: > > On 11/07/2025 14:39, Alex Deucher wrote: >> On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin >> wrote: >>> >>> >>> On 11/07/2025 13:45, Christian König wrote: On 11.07.25 14:23, Tvrtko Ursulin wrote: > Commit > 94b1e028e15c ("drm/amdgpu/sdma

Re: [PATCH] drm/sched: Avoid double re-lock on the job free path

2025-07-11 Thread Tvrtko Ursulin
On 11/07/2025 14:04, Philipp Stanner wrote: Late to the party; had overlooked that the discussion with Matt is resolved. Some comments below On Tue, 2025-07-08 at 13:20 +0100, Tvrtko Ursulin wrote: Currently the job free work item will lock sched->job_list_lock first time to see if there are

[PATCH v2] drm/sched: Avoid double re-lock on the job free path

2025-07-11 Thread Tvrtko Ursulin
Currently the job free work item will lock sched->job_list_lock first time to see if there are any jobs, free a single job, and then lock again to decide whether to re-queue itself if there are more finished jobs. Since drm_sched_get_finished_job() already looks at the second job in the queue we c

Re: WARNING: possible circular locking dependency detected: drm_client_dev_suspend() & radeon_suspend_kms()

2025-07-11 Thread Ville Syrjälä
On Fri, Jul 11, 2025 at 11:08:03AM +0200, Simona Vetter wrote: > On Thu, Jul 10, 2025 at 04:43:02PM -0700, Jeff Johnson wrote: > > I'm trying to debug a hibernation issue with the ath12k driver, but to > > establish a baseline I started with Linus' current tree. I have the > > following > > enable

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Tvrtko Ursulin
On 11/07/2025 14:39, Alex Deucher wrote: On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin wrote: On 11/07/2025 13:45, Christian König wrote: On 11.07.25 14:23, Tvrtko Ursulin wrote: Commit 94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") added a workaround which disable

RE: [PATCH v6 09/11] drm/amdgpu: validate the shared bo for tracking usage size

2025-07-11 Thread Liang, Prike
[Public] > -Original Message- > From: Koenig, Christian > Sent: Friday, July 11, 2025 8:14 PM > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCH v6 09/11] drm/amdgpu: validate the shared bo for tracking > usage size > > On 11.07.25 11:39, Pr

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Alex Deucher
On Fri, Jul 11, 2025 at 9:22 AM Tvrtko Ursulin wrote: > > > On 11/07/2025 13:45, Christian König wrote: > > On 11.07.25 14:23, Tvrtko Ursulin wrote: > >> Commit > >> 94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") > >> added a workaround which disables GFXOFF for the duration

Re: [PATCH] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-11 Thread Christian König
On 11.07.25 15:13, Philipp Stanner wrote: > On Thu, 2025-07-10 at 08:33 +, cao, lin wrote: >> >> [AMD Official Use Only - AMD Internal Distribution Only] >> >> >> >> Hi Christian, >> >> >> Thanks for your suggestion, I modified the patch as: > > Looks promising. You'll send a v2 I guess :) We

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Tvrtko Ursulin
On 11/07/2025 13:45, Christian König wrote: On 11.07.25 14:23, Tvrtko Ursulin wrote: Commit 94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") added a workaround which disables GFXOFF for the duration of the job submit stage (with a 100ms trailing hysteresis). Empirically t

Re: [PATCH] drm/scheduler: Fix sched hang when killing app with dependent jobs

2025-07-11 Thread Philipp Stanner
On Thu, 2025-07-10 at 08:33 +, cao, lin wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > > > Hi Christian, > > > Thanks for your suggestion, I modified the patch as: Looks promising. You'll send a v2 I guess :) P. > > > diff --git a/drivers/gpu/drm/scheduler/sc

Re: [PATCH] drm/amdgpu: Fix lifetime of struct amdgpu_task_info after ring reset

2025-07-11 Thread André Almeida
Em 04/07/2025 00:06, André Almeida escreveu: When a ring reset happens, amdgpu calls drm_dev_wedged_event() using struct amdgpu_task_info *ti as one of the arguments. After using *ti, a call to amdgpu_vm_put_task_info(ti) is required to correctly track its lifetime. However, it's called from a p

Re: [PATCH] drm/sched: Avoid double re-lock on the job free path

2025-07-11 Thread Philipp Stanner
Late to the party; had overlooked that the discussion with Matt is resolved. Some comments below On Tue, 2025-07-08 at 13:20 +0100, Tvrtko Ursulin wrote: > Currently the job free work item will lock sched->job_list_lock first time > to see if there are any jobs, free a single job, and then lock ag

WARNING: possible circular locking dependency detected: drm_client_dev_suspend() & radeon_suspend_kms()

2025-07-11 Thread Jeff Johnson
I'm trying to debug a hibernation issue with the ath12k driver, but to establish a baseline I started with Linus' current tree. I have the following enabled in my .config: CONFIG_PROVE_LOCKING=y CONFIG_PROVE_RAW_LOCK_NESTING=y CONFIG_PROVE_RCU=y As part of the baseline I observed the following:

Re: [RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Christian König
On 11.07.25 14:23, Tvrtko Ursulin wrote: > Commit > 94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") > added a workaround which disables GFXOFF for the duration of the job > submit stage (with a 100ms trailing hysteresis). > > Empirically the GFXOFF disable/enable request can

Re: [PATCH] drm/sched: Avoid double re-lock on the job free path

2025-07-11 Thread Tvrtko Ursulin
On 09/07/2025 18:22, Matthew Brost wrote: On Wed, Jul 09, 2025 at 11:49:44AM +0100, Tvrtko Ursulin wrote: On 09/07/2025 05:45, Matthew Brost wrote: On Tue, Jul 08, 2025 at 01:20:32PM +0100, Tvrtko Ursulin wrote: Currently the job free work item will lock sched->job_list_lock first time to

[PATCH] drm/amdgpu: Cache some values in ring emission helpers

2025-07-11 Thread Tvrtko Ursulin
By caching some values in local variables we can allow the compiler to emit more compact code because it does not have to reload those values constantly. Before and after size comparisons: text data bss dec hex filename 10708384 547307 213512 11469203

[RFC] drm/amdgpu/sdma5.2: Avoid latencies caused by the powergating workaround

2025-07-11 Thread Tvrtko Ursulin
Commit 94b1e028e15c ("drm/amdgpu/sdma5.2: add begin/end_use ring callbacks") added a workaround which disables GFXOFF for the duration of the job submit stage (with a 100ms trailing hysteresis). Empirically the GFXOFF disable/enable request can suffer from significant latencies (2ms is easily seen

Re: [PATCH v6 10/11] drm/amdgpu: validate the queue va for resuming the queue

2025-07-11 Thread Christian König
On 11.07.25 11:39, Prike Liang wrote: > It requires validating the userq VA whether is mapped before > trying to resume the queue. > > Signed-off-by: Prike Liang Yeah that looks sane to me. Patch is Reviewed-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 7 +++ >

Re: [PATCH v6 09/11] drm/amdgpu: validate the shared bo for tracking usage size

2025-07-11 Thread Christian König
On 11.07.25 11:39, Prike Liang wrote: > It requires validating the shared BO before updating its usage > size; otherwise, there is a potential NULL pointer error when the > BO released improperly. Clear NAK to that. You are obviously working around a bug elsewhere. Regards, Christian. > > Signe

Re: [PATCH v6 07/11] drm/amdgpu: validate userq's last fence prior to destroying

2025-07-11 Thread Christian König
On 11.07.25 11:39, Prike Liang wrote: > The userq requires validating queue status before destroying > it, if user tries to destroy a busy userq by IOCTL then the > driver should report an error for this illegal usage. Clear NAK, destroying a busy userqueue is perfectly valid! Regards, Christian.

Re: [PATCH v6 06/11] drm/amdgpu: track the userq bo va for its obj management

2025-07-11 Thread Christian König
On 11.07.25 11:39, Prike Liang wrote: > The user queue object destroy requires ensuring its > VA keeps mapping prior to the queue being destroyed. > Otherwise, it seems a bug in the user space or VA > freed wrongly, and the kernel driver should report an > invalidated error to the user IOCTL req

Re: [PATCH v6 04/11] drm/amdgpu: validate userq buffer virtual address and size

2025-07-11 Thread Christian König
On 11.07.25 11:39, Prike Liang wrote: > It needs to validate the userq object virtual address to > determine whether it resides in a valid VM mapping. > > Signed-off-by: Prike Liang > Reviewed-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 38 +

[PATCH] drm/amdgpu: Fix missing unlocking in an error path in amdgpu_userq_create()

2025-07-11 Thread Christophe JAILLET
If kasprintf() fails, some mutexes still need to be released to avoid locking issues, as already done in all other error handling paths. Fixes: c03ea34cbf88 ("drm/amdgpu: add support of debugfs for mqd information") Signed-off-by: Christophe JAILLET --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 6

Re: [PATCH v6 03/11] drm/amdgpu: rework the userq doorbell object destroy

2025-07-11 Thread Christian König
On 11.07.25 11:39, Prike Liang wrote: > This patch aims to unify and destroy the userq doorbell objects at > mes_userq_mqd_destroy(), and this change will also help with unpinning > and destroying the userq doorbell objects for amdgpu_userq_mgr_fini() > during releasing the drm files. > > Signed-o

RE: [PATCH] drm/amdgpu: The interrupt source was not released

2025-07-11 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Tao Zhou > -Original Message- > From: Sun, Ce(Overlord) > Sent: Friday, July 11, 2025 6:16 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhang, Hawking ; Zhou1, Tao > ; Sun, Ce(Overlord) > Subject: [PATCH] drm/amdgpu:

RE: [PATCH 5/6] drm/amdgpu: add support for cyan skillfish gpu_info

2025-07-11 Thread Yu, Lang
[Public] >-Original Message- >From: amd-gfx On Behalf Of Alex Deucher >Sent: Friday, June 27, 2025 10:34 PM >To: amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander >Subject: [PATCH 5/6] drm/amdgpu: add support for cyan skillfish gpu_info > >Some SOCs which are part of the cyan skillfi

[PATCH] drm/amdgpu: The interrupt source was not released

2025-07-11 Thread Ce Sun
When the driver is unloaded, the interrupt source of the rma device is not released, resulting in the failure of hw_init when loading again using bad_page_threshold. Signed-off-by: Ce Sun --- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --gi

[PATCH v6 10/11] drm/amdgpu: validate the queue va for resuming the queue

2025-07-11 Thread Prike Liang
Validate whether the userq VA is mapped before trying to resume the queue. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c b/drivers/gpu/drm/amd/amdgpu/amd

[PATCH v6 11/11] drm/amdgpu: validate userq va for GEM unmap

2025-07-11 Thread Prike Liang
This change validates whether the userq can be unmapped prior to the userq VA GEM unmap. The solution is based on the following idea: 1) Find out which userq the GEM unmap VA belongs to, 2) Wait for the userq fence and eviction fence to signal, 3) If the attached fences signal, then suspend the userq

[PATCH v6 09/11] drm/amdgpu: validate the shared bo for tracking usage size

2025-07-11 Thread Prike Liang
The shared BO must be validated before updating its usage size; otherwise, there is a potential NULL pointer error when the BO is released improperly. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 25 + 1 file changed, 21 insertions(+), 4 delet

[PATCH v6 07/11] drm/amdgpu: validate userq's last fence prior to destroying

2025-07-11 Thread Prike Liang
The queue status must be validated before a userq is destroyed; if the user tries to destroy a busy userq via IOCTL, the driver should report an error for this illegal usage. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 15 --- 1 file changed, 12 inserti
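
As a rough illustration of the "refuse to destroy a busy queue" idea described above, here is a minimal user-space sketch. The `sketch_fence` and `sketch_userq` types, the function name, and the choice of `-EBUSY` as the error code are all assumptions made for this example; the snippet above does not show which error the actual patch returns or how it inspects the fence.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the amdgpu structures. */
struct sketch_fence {
    bool signaled;                   /* true once the submitted work has completed */
};

struct sketch_userq {
    struct sketch_fence *last_fence; /* NULL if nothing was ever submitted */
    bool destroyed;
};

/*
 * Returns 0 on success, or -EBUSY (an assumed error code) if the last
 * submitted work on the queue has not completed yet.
 */
int sketch_userq_destroy(struct sketch_userq *q)
{
    assert(q != NULL);

    if (q->last_fence && !q->last_fence->signaled)
        return -EBUSY;               /* refuse to tear down a busy queue */

    q->destroyed = true;
    return 0;
}
```

The point of the sketch is only the ordering: the status check happens before any teardown, so a rejected request leaves the queue untouched.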

[PATCH v6 08/11] drm/amdgpu: clean up the amdgpu_userq_active()

2025-07-11 Thread Prike Liang
There is no caller of amdgpu_userq_active(). Signed-off-by: Prike Liang Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 16 drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 2 -- 2 files changed, 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu

[PATCH v6 06/11] drm/amdgpu: track the userq bo va for its obj management

2025-07-11 Thread Prike Liang
Destroying a user queue object requires that its VA stays mapped until the queue is destroyed. Otherwise, it indicates a bug in user space or a wrongly freed VA, and the kernel driver should report an invalid-argument error to the user IOCTL request. Signed-off-by: Prike Liang --- drivers/gpu

[PATCH v6 04/11] drm/amdgpu: validate userq buffer virtual address and size

2025-07-11 Thread Prike Liang
The userq object virtual address needs to be validated to determine whether it resides in a valid VM mapping. Signed-off-by: Prike Liang Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 38 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 2 ++
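
The address-and-size validation described above can be pictured as a containment check: the requested range must fall entirely inside one known mapping. The following is a simplified sketch under stated assumptions; `sketch_mapping` and `sketch_va_range_mapped()` are hypothetical names, and a flat array stands in for whatever lookup structure the driver actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping record: covers [start, start + size). */
struct sketch_mapping {
    uint64_t start;
    uint64_t size;
};

/*
 * Returns true only if [va, va + size) lies completely inside one mapping.
 * Empty ranges and ranges that wrap around the address space are rejected.
 */
bool sketch_va_range_mapped(const struct sketch_mapping *maps, size_t n,
                            uint64_t va, uint64_t size)
{
    assert(maps != NULL || n == 0);

    if (size == 0 || va + size < va)
        return false;                /* empty or overflowing request */

    for (size_t i = 0; i < n; i++) {
        /* Assumes each stored mapping itself does not overflow. */
        uint64_t end = maps[i].start + maps[i].size;

        if (va >= maps[i].start && va + size <= end)
            return true;
    }
    return false;
}
```

Checking the size together with the address matters because a valid start address alone says nothing about whether the end of the buffer is still inside the mapping.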

[PATCH v6 02/11] drm/amdgpu: validate userq hw unmap status for destroying userq

2025-07-11 Thread Prike Liang
Before destroying the userq buffer object, the userq HW unmap status must be validated to ensure the userq is unmapped from hardware. If the HW unmap failed, the queue needs to be reset before it can be reused. Signed-off-by: Prike Liang Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/a

[PATCH v6 05/11] drm/amdgpu: add userq object va track helpers

2025-07-11 Thread Prike Liang
Add the userq object virtual address get(), mapped() and put() helpers for tracking the userq object VA usage. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 172 - drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 14 ++ drivers/gpu/drm/amd/amdgp

[PATCH v6 03/11] drm/amdgpu: rework the userq doorbell object destroy

2025-07-11 Thread Prike Liang
This patch aims to unify and destroy the userq doorbell objects at mes_userq_mqd_destroy(), and this change will also help with unpinning and destroying the userq doorbell objects for amdgpu_userq_mgr_fini() during releasing the drm files. Signed-off-by: Prike Liang Reviewed-by: Alex Deucher ---

[PATCH v6 01/11] drm/amdgpu: validate userq input args

2025-07-11 Thread Prike Liang
This helps validate the userq input args and reject invalid userq requests up front, at the IOCTL entry. Signed-off-by: Prike Liang Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 81 +++--- drivers/gpu/drm/amd/amdgpu/mes_userqueue.c |

[PATCH 2/3] drm/amd/pm: Use cached metrics data on aldebaran

2025-07-11 Thread Lijo Lazar
Cached metrics data validity is 1ms on aldebaran. It's not reasonable for any client to query gpu_metrics at a faster rate and constantly interrupt PMFW. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -

[PATCH 1/3] drm/amd/pm: Use cached metrics data on SMUv13.0.6

2025-07-11 Thread Lijo Lazar
Cached metrics data validity is 1ms on SMUv13.0.6 SOCs. It's not reasonable for any client to query gpu_metrics at a faster rate and constantly interrupt PMFW. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)

[PATCH 3/3] drm/amd/pm: Use cached metrics data on arcturus

2025-07-11 Thread Lijo Lazar
Cached metrics data validity is 1ms on arcturus. It's not reasonable for any client to query gpu_metrics at a faster rate and constantly interrupt PMFW. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --g

RE: [PATCH 1/2] drm/amdgpu: refine eeprom data check

2025-07-11 Thread Xie, Patrick
[AMD Official Use Only - AMD Internal Distribution Only] Sure, they have been verified by checkpatch.pl , thanks -Original Message- From: Zhou1, Tao Sent: Friday, July 11, 2025 4:45 PM To: Xie, Patrick ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 1/2] drm/amdgpu: refine eeprom dat

Re: WARNING: possible circular locking dependency detected: drm_client_dev_suspend() & radeon_suspend_kms()

2025-07-11 Thread Simona Vetter
On Thu, Jul 10, 2025 at 04:43:02PM -0700, Jeff Johnson wrote: > I'm trying to debug a hibernation issue with the ath12k driver, but to > establish a baseline I started with Linus' current tree. I have the following > enabled in my .config: > > CONFIG_PROVE_LOCKING=y > CONFIG_PROVE_RAW_LOCK_NESTING=

[PATCH 2/2] drm/amdgpu: adjust the update of RAS bad page number

2025-07-11 Thread Tao Zhou
One eeprom record may not map to a unit number of bad pages; the accurate bad page number is obtained after the bad page address check. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 43 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 ++ .../gpu/drm/a

[PATCH 1/2] drm/amdgpu: add range check for RAS bad page address

2025-07-11 Thread Tao Zhou
Exclude invalid bad pages. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 58 - 1 file changed, 28 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index a6f512293b5c..1d
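
The patch description above is terse, so the following is only one plausible reading of a "range check": drop any recorded bad page address that does not fall inside the device memory. The function names and the use of a plain memory-size bound are assumptions for this sketch; the truncated diff does not show the real criteria.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed validity rule: an address is valid if it lies below mem_size. */
bool sketch_bad_page_in_range(uint64_t addr, uint64_t mem_size)
{
    return addr < mem_size;
}

/*
 * Copies only in-range addresses from 'in' to 'out' and returns how many
 * were kept. 'out' must have room for at least n entries.
 */
size_t sketch_filter_bad_pages(const uint64_t *in, size_t n,
                               uint64_t mem_size, uint64_t *out)
{
    size_t kept = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (sketch_bad_page_in_range(in[i], mem_size))
            out[kept++] = in[i];
    }
    return kept;
}
```

Returning the kept count separately from the input count mirrors the point made in the companion patch: the number of usable bad page entries is only known after the address check has run.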

RE: [PATCH 1/2] drm/amdgpu: refine eeprom data check

2025-07-11 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] The series is: Reviewed-by: Tao Zhou Please make sure the patches are verified by checkpatch script. > -Original Message- > From: Xie, Patrick > Sent: Friday, July 11, 2025 4:21 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhou1,

[PATCH] drm/amdgpu: Add atomic CPU-GPU clock counter correlation

2025-07-11 Thread Jesse Zhang
This patch introduces a new IOCTL to provide tightly correlated CPU and GPU timestamps for accurate performance measurements and synchronization between host and device timelines. Key improvements: 1. Adds AMDGPU_INFO_CLOCK_COUNTERS query type (0x06) 2. Implements atomic sampling of clocks with:
