[PATCH] drm/amdgpu: Clear overflow for SRIOV

2025-04-09 Thread Emily Deng
A VF doesn't have permission to clear the overflow bit, so clear the bit by reset. Signed-off-by: Emily Deng --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 10 -- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 + drivers/gpu/drm/amd/amdgpu/ih_v6_0.c | 6 +- drivers/gpu/drm/amd/amdgpu/ve

[PATCH] drm/amdkfd: Add rec SDMA engines support with limited XGMI

2025-04-09 Thread Shane Xiao
This patch adds recommended SDMA engines with limited XGMI SDMA engines. This optimization should improve overall performance for device-to-device copies. Signed-off-by: Shane Xiao Suggested-by: Jonathan Kim --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 42 ++-

Re: [PATCH] drm/amdgpu: cleanup amdgpu_vm_flush v5

2025-04-09 Thread SRINIVASAN SHANMUGAM
On 4/9/2025 7:16 PM, SRINIVASAN SHANMUGAM wrote: On 4/9/2025 7:11 PM, SRINIVASAN SHANMUGAM wrote: On 4/9/2025 6:45 PM, SRINIVASAN SHANMUGAM wrote: On 4/9/2025 4:15 PM, Christian König wrote: This reverts commit c2cc3648ba517a6c270500b5447d5a1efdad5936. Turned out that this has some negati

Re: [PATCH] drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset

2025-04-09 Thread Christian König
On 09.04.25 at 15:18, Ce Sun wrote: > Checking hive is more readable. > > The following smatch warning: > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset() > warn: iterator used outside loop: 'tmp_adev' Please also remove the setting of hive and tmp_adev to NULL at the decla

RE: [PATCH v1 1/1] drm/amdgpu: fix a smatch static checker warning in amdgpu_pci_slot_reset

2025-04-09 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Tao Zhou > -----Original Message----- > From: amd-gfx On Behalf Of Ce Sun > Sent: Wednesday, April 9, 2025 8:10 PM > To: amd-gfx@lists.freedesktop.org > Cc: dan.carpen...@linaro.org; Sun, Ce(Overlord) > Subject: [PATCH v1 1/

Re: [PATCH] drm/amdgpu: fix warning of drm_mm_clean

2025-04-09 Thread Christian König
On 08.04.25 at 10:23, ZhenGuo Yin wrote: > Kernel doorbell BOs need to be freed before ttm_fini. > > Fixes: 54c30d2a8def ("drm/amdgpu: create kernel doorbell pages") > Signed-off-by: ZhenGuo Yin At least from my point of view that patch seems to make a lot of sense, so feel free to add Reviewed-by: Ch

Re: [v3 3/7] drm/amdgpu: Optimize SDMA v5.0 queue reset and stop logic

2025-04-09 Thread Alex Deucher
On Wed, Apr 2, 2025 at 5:23 AM jesse.zh...@amd.com wrote: > > From: "jesse.zh...@amd.com" > > This patch refactors the SDMA v5.0 queue reset and stop logic to improve > code readability, maintainability, and performance. The key changes include: > > 1. **Generalized `sdma_v5_0_gfx_stop` Function*

Re: [PATCH] drm/amd/display: Add htmldocs description for fused_io interface

2025-04-09 Thread Alex Deucher
Acked-by: Alex Deucher On Wed, Apr 9, 2025 at 1:06 PM wrote: > > From: Roman Li > > [Why] > htmldocs build warning: "Function parameter or struct member 'fused_io' > not described in 'amdgpu_display_manager'". > > [How] > Add missing description. > > Fixes: af632d3f59e6 ("drm/amd/display: HDCP

[PATCH] drm/amd/display: Add htmldocs description for fused_io interface

2025-04-09 Thread Roman.Li
From: Roman Li [Why] htmldocs build warning: "Function parameter or struct member 'fused_io' not described in 'amdgpu_display_manager'". [How] Add missing description. Fixes: af632d3f59e6 ("drm/amd/display: HDCP Locality check using DMUB Fused IO") Reported-by: Stephen Rothwell Signed-off-by:

[pull] amdgpu, amdkfd drm-fixes-6.15

2025-04-09 Thread Alex Deucher
Hi Dave, Sima, Fixes for 6.15. The following changes since commit dce8bd9137b88735dd0efc4e2693213d98c15913: drm/amdgpu/gfx12: fix num_mec (2025-03-26 17:47:18 -0400) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.15-2025-04-09

RE: [PATCH] drm/amd/display: Fix drm_err argument type error

2025-04-09 Thread Li, Roman
[Public] Reviewed-by: Roman Li Please fix your authorship format as "First_name Last_name " before merge. Thanks, Roman > -----Original Message----- > From: chengjya > Sent: Wednesday, April 9, 2025 5:17 AM > To: Li, Roman ; amd-gfx@lists.freedesktop.org; Kaszewski, > Dominik ; Wheeler, Dani

Re: [PATCH 3/3] drm/amdgpu: adjust enforce_isolation handling

2025-04-09 Thread SRINIVASAN SHANMUGAM
On 4/8/2025 9:30 PM, Alex Deucher wrote: Switch from a bool to an enum and allow more options for enforce isolation. There are now 3 modes of operation: - Disabled (0) - Enabled (serialization and cleaner shader) (1) - Enabled in legacy mode (no serialization or cleaner shader) (2) This provid
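The three modes listed in this patch description could be modeled roughly as below. This is only a sketch: the enum and helper names are hypothetical and are not taken from the patch. Only the numeric values 0, 1 and 2 and their stated meanings come from the description above.

```c
#include <stdbool.h>

/* Hypothetical names; the values and meanings follow the patch description. */
enum isolation_mode {
    ISOLATION_DISABLED = 0,        /* no enforce isolation */
    ISOLATION_ENABLED = 1,         /* serialization and cleaner shader */
    ISOLATION_ENABLED_LEGACY = 2,  /* no serialization or cleaner shader */
};

/* True for both "enabled" modes. */
static bool isolation_is_enabled(enum isolation_mode mode)
{
    return mode != ISOLATION_DISABLED;
}

/* Only the non-legacy enabled mode serializes and runs the cleaner shader. */
static bool isolation_uses_cleaner_shader(enum isolation_mode mode)
{
    return mode == ISOLATION_ENABLED;
}
```

Compared with a bool, an enum of this shape lets callers distinguish "enabled" from "enabled, but without serialization or the cleaner shader" instead of folding both into `true`.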

[PATCH 0/2] dma-fence: Rename dma_fence_is_signaled()

2025-04-09 Thread Philipp Stanner
Hi all, I'm currently debugging a Nouveau issue [1] and potentially might want to add a function that just checks whether a fence is signaled already – which then would obviously be called dma_fence_is_signaled(). In any case, I think it is reasonable to rename dma_fence_is_signaled() so that it

Re: [PATCH 3/3] drm/amdgpu: adjust enforce_isolation handling

2025-04-09 Thread Alex Deucher
On Wed, Apr 9, 2025 at 10:36 AM SRINIVASAN SHANMUGAM wrote: > > > On 4/8/2025 9:30 PM, Alex Deucher wrote: > > Switch from a bool to an enum and allow more options > > for enforce isolation. There are now 3 modes of operation: > > - Disabled (0) > > - Enabled (serialization and cleaner shader) (1

Re: [PATCH 1/2] dma-fence: Rename dma_fence_is_signaled()

2025-04-09 Thread Christian König
On 09.04.25 at 16:01, Philipp Stanner wrote: > On Wed, 2025-04-09 at 15:14 +0200, Christian König wrote: >> On 09.04.25 at 14:56, Philipp Stanner wrote: >>> On Wed, 2025-04-09 at 14:51 +0200, Philipp Stanner wrote: On Wed, 2025-04-09 at 14:39 +0200, Boris Brezillon wrote: > Hi Philipp, >

Re: [PATCH] drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset

2025-04-09 Thread Christian König
On 09.04.25 at 15:39, Ce Sun wrote: > Checking hive is more readable. > > The following smatch warning: > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset() > warn: iterator used outside loop: 'tmp_adev' > > Fixes: 8ba904f54148 ("drm/amdgpu: Multi-GPU DPC recovery support") I

Re: [PATCH 1/2] drm/sched: add drm_sched_prealloc_dependency_slots v2

2025-04-09 Thread Christian König
On 09.04.25 at 12:28, Philipp Stanner wrote: > On Fri, 2025-03-21 at 16:58 +0100, Christian König wrote: >> Sometimes drivers need to be able to submit multiple jobs which >> depend on >> each other to different schedulers at the same time, but using >> drm_sched_job_add_dependency() can't fail an

Re: [PATCH] drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset

2025-04-09 Thread Lazar, Lijo
On 4/9/2025 7:09 PM, Ce Sun wrote: > Checking hive is more readable. > > The following smatch warning: > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset() > warn: iterator used outside loop: 'tmp_adev' > > Fixes: 8ba904f54148 ("drm/amdgpu: Multi-GPU DPC recovery support")

Re: [PATCH] drm/amdgpu: cleanup amdgpu_vm_flush v5

2025-04-09 Thread SRINIVASAN SHANMUGAM
On 4/9/2025 7:11 PM, SRINIVASAN SHANMUGAM wrote: On 4/9/2025 6:45 PM, SRINIVASAN SHANMUGAM wrote: On 4/9/2025 4:15 PM, Christian König wrote: This reverts commit c2cc3648ba517a6c270500b5447d5a1efdad5936. Turned out that this has some negative consequences for some workloads. Instead check

Re: [PATCH] drm/amdgpu: cleanup amdgpu_vm_flush v5

2025-04-09 Thread SRINIVASAN SHANMUGAM
On 4/9/2025 6:45 PM, SRINIVASAN SHANMUGAM wrote: On 4/9/2025 4:15 PM, Christian König wrote: This reverts commit c2cc3648ba517a6c270500b5447d5a1efdad5936. Turned out that this has some negative consequences for some workloads. Instead check if the cleaner shader should run directly. While

[PATCH] drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset

2025-04-09 Thread Ce Sun
Checking hive is more readable. The following smatch warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset() warn: iterator used outside loop: 'tmp_adev' Fixes: 8ba904f54148 ("drm/amdgpu: Multi-GPU DPC recovery support") Reported-by: Dan Carpenter Signed-off-by: Ce Sun
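The general pattern behind this kind of smatch warning can be sketched in plain C. The sketch is a userspace illustration under stated assumptions, not the code from the patch: `struct dev`, `struct hive` and `count_reset_targets()` are hypothetical stand-ins. The point is that the decision is made on `hive`, which has a well-defined value before and after the loop, while the loop cursor never escapes the loop.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are not the amdgpu structures. */
struct dev {
    int id;
    struct dev *next;
};

struct hive {
    struct dev *first;
};

/*
 * Count the devices a reset would touch. The branch is taken on `hive`
 * rather than on whatever value the loop cursor holds after iteration,
 * so the cursor is scoped to the loop and cannot be misused afterwards.
 */
static int count_reset_targets(const struct hive *hive, const struct dev *self)
{
    int n = 0;

    if (hive) {
        for (const struct dev *cur = hive->first; cur; cur = cur->next)
            n++;
    } else if (self) {
        n = 1; /* single device, no hive */
    }
    return n;
}
```

Scoping the cursor to the `for` statement makes any use after the loop a compile error, which is a stricter form of what the static checker is asking for.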

Re: [PATCH] drm/amdgpu: cleanup amdgpu_vm_flush v5

2025-04-09 Thread SRINIVASAN SHANMUGAM
On 4/9/2025 4:15 PM, Christian König wrote: This reverts commit c2cc3648ba517a6c270500b5447d5a1efdad5936. Turned out that this has some negative consequences for some workloads. Instead check if the cleaner shader should run directly. While at it remove amdgpu_vm_need_pipeline_sync(), we also

[PATCH] drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset

2025-04-09 Thread Ce Sun
Checking hive is more readable. The following smatch warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset() warn: iterator used outside loop: 'tmp_adev' Fixes: 8ba904f54148 ("drm/amdgpu: Multi-GPU DPC recovery support") Reported-by: Dan Carpenter Signed-off-by: Ce Sun

Re: [PATCH 2/2] dma-fence: Improve docu for dma_fence_check_and_signal()

2025-04-09 Thread Li, Yunxiang (Teddy)
[AMD Official Use Only - AMD Internal Distribution Only] Hi Philipp, I feel like the problem has two parts. The documentation does not make explicit that DMA_FENCE_FLAG_SIGNALED_BIT is "caching" the hardware state when a fence is backed by hardware, so what dma_fence_is_signaled here is doing i

[PATCH 1/2] dma-fence: Rename dma_fence_is_signaled()

2025-04-09 Thread Philipp Stanner
dma_fence_is_signaled()'s name strongly reads as if this function were intended for checking whether a fence is already signaled. Also the boolean it returns hints at that. The function's behavior, however, is more complex: it can check with a driver callback whether the hardware's sequence number
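The behavior being discussed can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: every `toy_` name is hypothetical, a plain bool stands in for the signaled flag bit, and the sequence-number comparison ignores wraparound. It shows a check that returns true either because the flag is already set or because a driver-style callback reports completion, in which case the check itself signals the fence.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_fence;

/* Hypothetical driver callback table, loosely modeled on fence ops. */
struct toy_fence_ops {
    bool (*signaled)(struct toy_fence *f);
};

struct toy_fence {
    const struct toy_fence_ops *ops;
    bool signaled_flag;     /* stands in for the signaled flag bit */
    unsigned int hw_seqno;  /* last sequence number the "hardware" finished */
    unsigned int seqno;     /* sequence number this fence waits for */
};

/* Pure query: only reads the cached flag, no side effects. */
static bool toy_fence_flag_set(const struct toy_fence *f)
{
    return f->signaled_flag;
}

static void toy_fence_signal(struct toy_fence *f)
{
    f->signaled_flag = true;
}

/* Check with side effect: may signal the fence as part of the check. */
static bool toy_fence_check_and_signal(struct toy_fence *f)
{
    if (f->signaled_flag)
        return true;
    if (f->ops && f->ops->signaled && f->ops->signaled(f)) {
        toy_fence_signal(f);
        return true;
    }
    return false;
}

/* Toy callback: compares sequence numbers (no wraparound handling). */
static bool toy_hw_signaled(struct toy_fence *f)
{
    return f->hw_seqno >= f->seqno;
}
```

In this model, `toy_fence_flag_set()` can still return false after the hardware has finished, until some caller runs `toy_fence_check_and_signal()`, which is the kind of difference a name distinguishing "check and signal" from a plain query makes visible.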

[PATCH v1 1/1] drm/amdgpu: fix a smatch static checker warning in amdgpu_pci_slot_reset

2025-04-09 Thread Ce Sun
Fixes smatch warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:6820 amdgpu_pci_slot_reset() warn: iterator used outside loop: 'tmp_adev' Fixes: 8ba904f54148 ("drm/amdgpu: Multi-GPU DPC recovery support") Signed-off-by: Ce Sun --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file chan

[PATCH 2/2] dma-fence: Improve docu for dma_fence_check_and_signal()

2025-04-09 Thread Philipp Stanner
The documentation of the return value of dma_fence_check_and_signal() and dma_fence_check_and_signal_locked() reads as if the returned boolean only describes whether dma_fence_signal() (or similar) has been called before this function call already. That's not the case, since dma_fence_ops.signaled(

[PATCH] drm/amdgpu: cleanup amdgpu_vm_flush v5

2025-04-09 Thread Christian König
This reverts commit c2cc3648ba517a6c270500b5447d5a1efdad5936. Turned out that this has some negative consequences for some workloads. Instead check if the cleaner shader should run directly. While at it remove amdgpu_vm_need_pipeline_sync(), we also check again if the VMID has seen a GPU reset sin

Re: [bug report] drm/amdgpu: Multi-GPU DPC recovery support

2025-04-09 Thread Sun, Ce(Overlord)
[AMD Official Use Only - AMD Internal Distribution Only] Hi Dan Carpenter, Thank you for your review. There is indeed a problem with NULL pointers; I will correct it immediately. Regards, Sun, Ce From: Dan Carpenter Sent: Wednesday, April 9, 2025 4:

[PATCH] drm/amd/display: Fix drm_err argument type error

2025-04-09 Thread chengjya
The drm_err function expects a struct drm_device * pointer, so fix it. Fixes: af632d3f59e6 ("drm/amd/display: HDCP Locality check using DMUB Fused IO") Signed-off-by: chengjya --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

Re: [PATCH v8 00/10] Improve gpu_scheduler trace events + UAPI

2025-04-09 Thread Pierre-Eric Pelloux-Prayer
Hi, I've rebased the series on top of drm-next, applied the minor tweaks suggested by Tvrtko on v8 and the R-b tags. The result can be found on gitlab.fdo: https://gitlab.freedesktop.org/pepp/linux/-/commits/improve_gpu_scheduler_trace_v9 I believe it's ready to be merged, unless I've missed

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-09 Thread Linus Torvalds
On Tue, 8 Apr 2025 at 09:07, Fedor Pchelkin wrote: > > > Linus comment is about kvmalloc(), but the code here is using > > kvmalloc_array() which as far as I know is explicitly made to safely > > allocate arrays with parameters provided by userspace. No. ABSOLUTELY NOTHING CAN ALLOCATE ARRAYS WI