RE: [PATCH v8 0/4] enable xgmi node migration support for hibernate on SRIOV

2025-05-26 Thread Zhang, Owen(SRDC)
[AMD Official Use Only - AMD Internal Distribution Only] Ping @Lazar, Lijo, @Koenig, Christian… Kindly pls review the updated patch in advance and we can discuss your suggestions in tomorrow's meeting. Thanks for your great support.

[PATCH] drm/amdkfd: Move the process suspend and resume out of full access

2025-05-26 Thread Emily Deng
For the suspend and resume process, exclusive access is not required. Therefore, it can be moved out of the full access section to reduce the duration of exclusive access. Signed-off-by: Emily Deng --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 16 + drivers/gpu/drm/amd/amdgpu/amdgpu_a

Re: [PATCH RESEND] drm/amd/display: Adjust prefix of dcn31_apg construct function name

2025-05-26 Thread Alex Hung
Hi Leonardo, Thank you for this patch, but unfortunately some unit test suites depend on the names. On 5/21/25 07:58, Leonardo Gomes wrote: From: Leonardo da Silva Gomes Adjust the dcn31_apg construct function name from 'apg31_construct' to 'dcn31_apg_construct'. This helps the ftrace to de

Re: [PATCH] drm/amd/display: Constify struct timing_generator_funcs

2025-05-26 Thread Alex Hung
Reviewed-by: Alex Hung On 5/24/25 10:51, Christophe JAILLET wrote: 'struct timing_generator_funcs' are not modified in these drivers. Constifying these structures moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers.
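
As a rough illustration of the constification described above, here is a minimal userspace sketch. The struct layout and every name in it (tg_funcs, demo_enable, demo_tg_funcs, tg_enable) are hypothetical stand-ins, not the real 'struct timing_generator_funcs' or any amdgpu symbol. It only shows the general pattern: a table of function pointers that is never written after initialization can be declared const, which lets the toolchain place it in a read-only section.

```c
/* Hypothetical ops table; not the real struct timing_generator_funcs. */
struct tg_funcs {
	int (*enable)(int inst);
	int (*disable)(int inst);
};

static int demo_enable(int inst)  { return inst + 1; }
static int demo_disable(int inst) { return inst - 1; }

/*
 * const: the table is never modified after initialization, so it can
 * live in a read-only section and its function pointers cannot be
 * overwritten through this object.
 */
static const struct tg_funcs demo_tg_funcs = {
	.enable  = demo_enable,
	.disable = demo_disable,
};

struct tg {
	const struct tg_funcs *funcs; /* pointer to const table */
	int inst;
};

/* Callers only ever read through the table. */
static int tg_enable(const struct tg *tg)
{
	return tg->funcs->enable(tg->inst);
}
```

With the pointer in 'struct tg' declared as pointer-to-const, an accidental write such as tg->funcs->enable = NULL is rejected at compile time.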

Re: [PATCH] drm/amd/display: Add null pointer check for get_first_active_display()

2025-05-26 Thread Alex Hung
Reviewed-by: Alex Hung On 5/25/25 20:37, Wentao Liang wrote: The function mod_hdcp_hdcp1_enable_encryption() calls the function get_first_active_display(), but does not check its return value. The return value is a null pointer if the display list is empty. This will lead to a null pointer dere

RE: [PATCH 2/2] drm/amd/pm: Enable static metrics table support

2025-05-26 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Series is Reviewed-by: Hawking Zhang Regards, Hawking -----Original Message----- From: Kamal, Asad Sent: Monday, May 26, 2025 23:13 To: amd-gfx@lists.freedesktop.org; Lazar, Lijo Cc: Zhang, Hawking; Ma, Le; Zhang, Morris; Kamal, Asa

Re: [PATCH] drm/amdkfd: Identical code for different branches

2025-05-26 Thread Tudor, Alexandru
[Public] With the change suggested by Harish below this patch looks good to me too. Reviewed-by: Alexandru Tudor From: Kasiviswanathan, Harish Sent: Monday, May 26, 2025 1:48 PM To: Clement, Sunday; amd-gfx@lists.freedesktop.org Cc: Tudor, Alexandr

[PATCH 2/2] drm/amd/pm: Enable static metrics table support

2025-05-26 Thread Asad Kamal
Enable static metrics support to fetch board voltage and pldm version for smu_v13_0_14 Signed-off-by: Asad Kamal Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0

Re: 6.15-rc6/regression/bisected - after commit f1c6be3999d2 error appeared: *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error

2025-05-26 Thread Pillai, Aurabindo
[AMD Official Use Only - AMD Internal Distribution Only] Hi Mike, It is indeed a bit harder, but we were able to repro the issue on the 6000 series. I'll need to get the DMCUB trace log to confirm, but it looks like an SMU hang from within DMCUB. So we'd need more debugging to find out whats go

RE: [PATCH] drm/amdkfd: Identical code for different branches

2025-05-26 Thread Kasiviswanathan, Harish
[Public] You can remove BUG:SWDEV-534537 from the commit message as it doesn't provide any information to the public. With that this patch is Reviewed-by: Harish Kasiviswanathan Sent: Friday, May 23, 2025 7:54 PM To: amd-gfx@lists.freedesktop.org Cc: Tudor, Alexandru; Kasiviswanathan, Harish; Clemen

Re: [PATCH] amd/amdkfd: fix a kfd_process ref leak

2025-05-26 Thread Philip Yang
On 2025-05-21 06:12, Yifan Zhang wrote: This patch is to fix a kfd_process ref leak. Signed-off-by: Yifan Zhang Reviewed-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/

[PATCH 1/2] drm/amd/pm: Enable static metrics table support

2025-05-26 Thread Asad Kamal
Enable static metrics support to fetch board voltage and pldm version for other smu_v13_0_6 program Signed-off-by: Asad Kamal Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/dr

[PATCH v11 00/10] Improve gpu_scheduler trace events + UAPI

2025-05-26 Thread Pierre-Eric Pelloux-Prayer
Hi, The initial goal of this series was to improve the drm and amdgpu trace events to be able to expose more of the inner workings of the scheduler and drivers to developers via tools. Then, the series evolved to become focused only on gpu_scheduler. The changes around vblank events will be part

[PATCH v11 08/10] drm: Get rid of drm_sched_job.id

2025-05-26 Thread Pierre-Eric Pelloux-Prayer
Its only purpose was for trace events, but jobs can already be uniquely identified using their fence. The downside of using the fence is that it's only available after 'drm_sched_job_arm' was called which is true for all trace events that used job.id so they can safely switch to using it. Suggest

[PATCH v11 10/10] drm/amdgpu: update trace format to match gpu_scheduler_trace

2025-05-26 Thread Pierre-Eric Pelloux-Prayer
Log fences using the same format for coherency. Signed-off-by: Pierre-Eric Pelloux-Prayer Reviewed-by: Christian König Reviewed-by: Arvind Yadav --- drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/drivers/gp

[PATCH v11 02/10] drm/sched: Store the drm client_id in drm_sched_fence

2025-05-26 Thread Pierre-Eric Pelloux-Prayer
This will be used in a later commit to trace the drm client_id in some of the gpu_scheduler trace events. This requires changing all the users of drm_sched_job_init to add an extra parameter. The newly added drm_client_id field in the drm_sched_fence is a bit of a duplicate of the owner one. One

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-26 Thread Philipp Stanner
On Fri, 2025-05-23 at 14:56 +0200, Christian König wrote: > It turned out that we can actually massively optimize here. > > The previous code was horribly inefficient since it constantly > released > and re-acquired the lock of the xarray and started each iteration > from the > base of the array t

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-26 Thread Christian König
On 5/26/25 13:14, Danilo Krummrich wrote: > (Cc: Matthew) > > Let's get this clarified to not work with assumptions. :) > > On Mon, May 26, 2025 at 12:59:41PM +0200, Christian König wrote: >> On 5/24/25 13:17, Danilo Krummrich wrote: >>> On Fri, May 23, 2025 at 04:11:39PM +0200, Danilo Krummrich

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-26 Thread Christian König
On 5/26/25 11:34, Philipp Stanner wrote: > On Mon, 2025-05-26 at 11:25 +0200, Christian König wrote: >> On 5/23/25 16:16, Danilo Krummrich wrote: >>> On Fri, May 23, 2025 at 04:11:39PM +0200, Danilo Krummrich wrote: On Fri, May 23, 2025 at 02:56:40PM +0200, Christian König wrote: > It turn

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-26 Thread Danilo Krummrich
(Cc: Matthew) Let's get this clarified to not work with assumptions. :) On Mon, May 26, 2025 at 12:59:41PM +0200, Christian König wrote: > On 5/24/25 13:17, Danilo Krummrich wrote: > > On Fri, May 23, 2025 at 04:11:39PM +0200, Danilo Krummrich wrote: > > So, your code here should be correct. Howe

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-26 Thread Christian König
On 5/24/25 13:17, Danilo Krummrich wrote: > On Fri, May 23, 2025 at 04:11:39PM +0200, Danilo Krummrich wrote: >> On Fri, May 23, 2025 at 02:56:40PM +0200, Christian König wrote: >>> + if (xas_nomem(&xas, GFP_KERNEL)) { >>> + xa_lock(&job->dependencies); >>> + goto retry; >> >>

Re: [PATCH V2 00/10] Reset improvements for GC10+

2025-05-26 Thread Christian König
On 5/23/25 16:39, Alex Deucher wrote: > On Fri, May 23, 2025 at 10:12 AM Alex Deucher wrote: >> >> On Fri, May 23, 2025 at 10:03 AM Christian König >> wrote: >>> >>> On 5/23/25 15:58, Alex Deucher wrote: I think that's probably the best option. I was thinking we could mirror the ring f

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-26 Thread Christian König
On 5/23/25 16:16, Danilo Krummrich wrote: > On Fri, May 23, 2025 at 04:11:39PM +0200, Danilo Krummrich wrote: >> On Fri, May 23, 2025 at 02:56:40PM +0200, Christian König wrote: >>> It turned out that we can actually massively optimize here. >>> >>> The previous code was horribly inefficient since

[PATCH] drm/amd/display: Add null pointer check for get_first_active_display()

2025-05-26 Thread Wentao Liang
The function mod_hdcp_hdcp1_enable_encryption() calls the function get_first_active_display(), but does not check its return value. The return value is a null pointer if the display list is empty. This will lead to a null pointer dereference in mod_hdcp_hdcp2_enable_encryption(). Add a null pointe
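
The kind of check this patch describes can be sketched in plain C as follows. This is not the patch itself: the struct, the helper first_active_display(), the caller enable_encryption() and the -EINVAL return value are all assumptions made for illustration, and the real driver code uses its own types and status codes. The sketch only shows the pattern of testing a lookup result for NULL before dereferencing it.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical display record; not the mod_hdcp structure. */
struct display {
	int index;
	int active;
};

/* Stand-in for a lookup that returns NULL when nothing is active. */
static struct display *first_active_display(struct display *d, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (d[i].active)
			return &d[i];
	return NULL;
}

/* Returns the active display's index, or -EINVAL if there is none. */
static int enable_encryption(struct display *d, size_t n)
{
	struct display *disp = first_active_display(d, n);

	if (!disp)            /* check before any dereference */
		return -EINVAL;

	return disp->index;
}
```

An empty list (n == 0) makes the lookup return NULL, and the caller reports an error instead of dereferencing it.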