[PATCH 3/5] drm/amdgpu: Fix userq ttm_bo_pin and ttm_bo_unpin lockdep warnings

2025-04-08 Thread Arunpravin Paneer Selvam
The ttm_bo_pin and ttm_bo_unpin warnings are resolved by moving the doorbell bo reserve up before pin/unpin. WARNING: CPU: 11 PID: 1818 at drivers/gpu/drm/ttm/ttm_bo.c:592 ttm_bo_pin+0x1f6/0x270 [ttm] [ +0.000277] CPU: 11 UID: 1000 PID: 1818 Comm: Xwayland Tainted: GW 6.12.0+ #

Re: [PATCH v2] drm/amd/display: Add error check for avi and vendor infoframe setup function

2025-04-08 Thread kernel test robot
Hi Wentao, kernel test robot noticed the following build errors: [auto build test ERROR on drm-exynos/exynos-drm-next] [also build test ERROR on linus/master v6.15-rc1 next-20250408] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to

[PATCH 1/5] drm/amdgpu/userq: Fix lock contention in userq fence

2025-04-08 Thread Arunpravin Paneer Selvam
Fix lockdep warnings. [ +0.000637] [ +0.04] WARNING: inconsistent lock state [ +0.04] 6.12.0+ #18 Tainted: GW OE [ +0.04] [ +0.04] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. [ +0.04] Xway

Re: [RFC] PCI: add CONFIG_MMU dependency

2025-04-08 Thread Thomas Zimmermann
Am 07.04.25 um 12:38 schrieb Arnd Bergmann: From: Arnd Bergmann It turns out that there are no platforms that have PCI but don't have an MMU, so adding a Kconfig dependency on CONFIG_PCI simplifies build testing kernels for those platforms a lot, and avoids a lot of inadvertent build regress

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-08 Thread Fedor Pchelkin
On Tue, 08. Apr 11:26, Christian König wrote: > Am 08.04.25 um 11:17 schrieb Denis Arefev: > > The user can set any value to the variable ‘bo_number’, via the ioctl > > command DRM_IOCTL_AMDGPU_BO_LIST. This will affect the arithmetic > > expression ‘in->bo_number * in->bo_info_size’, which is pron

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-08 Thread Fedor Pchelkin
On Tue, 08. Apr 14:22, Christian König wrote: > Am 08.04.25 um 13:54 schrieb Fedor Pchelkin: > > If user can request an arbitrary size value then we should use __GFP_NOWARN > > and back on the allocator to return NULL in case it doesn't even try to > > satisfy an enormous memory allocation request

Re: [PATCH 1/2] drm/amdgpu: use a dummy owner for sysfs triggered cleaner shaders v3

2025-04-08 Thread Alex Deucher
On Tue, Apr 8, 2025 at 11:30 AM Christian König wrote: > > Otherwise triggering sysfs multiple times without other submissions in > between only runs the shader once. > > v2: add some comment > v3: re-add missing cast > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_g

Re: [PATCH] Fixed the warning at ./drivers/gpu/drm/amd/include/amd_shared.h:369

2025-04-08 Thread Mario Limonciello
On 4/4/2025 10:12 PM, Kevin Paul Reddy Janagari wrote: warning: Incorrect use of kernel-doc format: * @DC_HDCP_LC_ENABLE_SW_FALLBACK If set, upon HDCP Locality Check FW Signed-off-by: Kevin Paul Reddy Janagari --- drivers/gpu/drm/amd/include/amd_shared.h | 2 +- 1 file changed, 1 insertion(+

Re: [PATCH] drm/amd: Forbid suspending into non-default suspend states

2025-04-08 Thread Alex Deucher
On Tue, Apr 8, 2025 at 2:10 PM Mario Limonciello wrote: > > From: Mario Limonciello > > On systems that default to 'deep' some userspace software likes > to try to suspend in 'deep' first. If there is a failure for any > reason (such as -ENOMEM) the failure is ignored and then it will > try to u

Re: [PATCH] drm/amdgpu: Fix CPER error handling on VFs

2025-04-08 Thread Yi, Tony
[AMD Official Use Only - AMD Internal Distribution Only] Signed-off-by: Tony Yi From: Zhang, Hawking Sent: Wednesday, April 2, 2025 12:23 AM To: Skvortsov, Victor ; amd-gfx@lists.freedesktop.org Cc: Luo, Zhigang ; Zhou1, Tao ; Zhao, Victor ; Yi, Tony Subject:

Re: [PATCH V8 06/43] drm/colorop: Add 1D Curve subtype

2025-04-08 Thread Daniel Stone
Hi Harry, On Tue, 8 Apr 2025 at 18:30, Harry Wentland wrote: > On 2025-04-08 12:40, Daniel Stone wrote: > > OK, Harry's reply cleared that up perfectly - the flexibility that's > > there at the moment is about being able to reuse colorops for CRTCs in > > post-blend ops (great!), not shared betwe

[PATCH] drm/amd: Forbid suspending into non-default suspend states

2025-04-08 Thread Mario Limonciello
From: Mario Limonciello On systems that default to 'deep' some userspace software likes to try to suspend in 'deep' first. If there is a failure for any reason (such as -ENOMEM) the failure is ignored and then it will try to use 's2idle' as a fallback. This fails, but more importantly it leads t

RE: [PATCH] drm/amdgpu: Fix the comment to avoid warning

2025-04-08 Thread SHANMUGAM, SRINIVASAN
[AMD Official Use Only - AMD Internal Distribution Only] This was submitted much before https://patchwork.freedesktop.org/patch/639004/ -Original Message- From: amd-gfx On Behalf Of Christian König Sent: Thursday, April 3, 2025 2:33 PM To: Khatri, Sunil ; amd-gfx@lists.freedesktop.org C

[PATCH 5.10 062/227] drm/amd/display/dc/core/dc_resource: Staticify local functions

2025-04-08 Thread Greg Kroah-Hartman
5.10-stable review patch. If anyone has any objections, please let me know. -- From: Lee Jones [ Upstream commit c88855f3a50903721c4e1dda16cb42b5f5432b5c ] Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:1120:5: wa

Re: [PATCH V8 06/43] drm/colorop: Add 1D Curve subtype

2025-04-08 Thread Harry Wentland
On 2025-04-08 12:40, Daniel Stone wrote: > Hi there, > > On Tue, 1 Apr 2025 at 20:53, Simon Ser wrote: >> On Tuesday, April 1st, 2025 at 17:14, Daniel Stone >> wrote: >>> 'plane' seems really incongruous here. The colorop can be created for >>> any number of planes, but we're setting it to a

Re: [PATCH V8 06/43] drm/colorop: Add 1D Curve subtype

2025-04-08 Thread Daniel Stone
Hi there, On Tue, 1 Apr 2025 at 20:53, Simon Ser wrote: > On Tuesday, April 1st, 2025 at 17:14, Daniel Stone > wrote: > > 'plane' seems really incongruous here. The colorop can be created for > > any number of planes, but we're setting it to always be bound to a > > single plane at init, and th

Re: [PATCH] drm/amdgpu: remove the duplicated mes queue active state setting

2025-04-08 Thread Alex Deucher
On Fri, Mar 28, 2025 at 7:52 AM Prike Liang wrote: > > The MES queue deactivation and active status are already set in > mes_userq_unmap|map(), so the caller needn't set the queue_active > bit again. > > Signed-off-by: Prike Liang Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/mes_

Re: [PATCH] drm/amd/pm/smu11: Prevent division by zero

2025-04-08 Thread Denis Arefev
> --- > drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > b/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > index 189c6a32b6bd..54229b991858 100644 > --- a/drivers/gpu/drm/amd/p

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-08 Thread Fedor Pchelkin
On Tue, 08. Apr 13:37, Christian König wrote: > Am 08.04.25 um 11:39 schrieb Fedor Pchelkin: > > On Tue, 08. Apr 11:26, Christian König wrote: > >> Am 08.04.25 um 11:17 schrieb Denis Arefev: > >>> The user can set any value to the variable ‘bo_number’, via the ioctl > >>> command DRM_IOCTL_AMDGPU_B

Re: [RFC] PCI: add CONFIG_MMU dependency

2025-04-08 Thread Arnd Bergmann
On Tue, Apr 8, 2025, at 12:22, Geert Uytterhoeven wrote: > On Mon, 7 Apr 2025 at 12:40, Arnd Bergmann wrote: > >> --- a/drivers/pci/Kconfig >> +++ b/drivers/pci/Kconfig >> @@ -21,6 +21,7 @@ config GENERIC_PCI_IOMAP >> menuconfig PCI >> bool "PCI support" >> depends on HAVE_PCI >>

[PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-08 Thread Denis Arefev
The user can set any value to the variable ‘bo_number’, via the ioctl command DRM_IOCTL_AMDGPU_BO_LIST. This will affect the arithmetic expression ‘in->bo_number * in->bo_info_size’, which is prone to overflow. Add a valid value check. Found by Linux Verification Center (linuxtesting.org) with SVA
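A minimal sketch of the kind of check the snippet above describes, assuming a user-supplied element count and element size that are multiplied to size an allocation. This is not the code from the patch: the helper name `example_bo_list_bytes` and the cap `EXAMPLE_MAX_BO_ENTRIES` are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound for illustration; not taken from the patch. */
#define EXAMPLE_MAX_BO_ENTRIES 4096u

/*
 * Compute bo_number * bo_info_size into *out_bytes.
 * Returns false (and leaves *out_bytes untouched) when the element size
 * is zero, the count exceeds the illustrative cap, or the product would
 * not fit in a size_t.
 */
static bool example_bo_list_bytes(uint32_t bo_number, uint32_t bo_info_size,
                                  size_t *out_bytes)
{
    if (bo_info_size == 0 || bo_number > EXAMPLE_MAX_BO_ENTRIES)
        return false;

    /* Division-based overflow test: safe because bo_info_size != 0. */
    if ((size_t)bo_number > SIZE_MAX / bo_info_size)
        return false;

    *out_bytes = (size_t)bo_number * bo_info_size;
    return true;
}
```

Rejecting the request before multiplying means the allocation size is never computed from a wrapped value. Whether a fixed cap or only an overflow test is appropriate is a design choice discussed in the replies to this thread.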

Re: [RFC] PCI: add CONFIG_MMU dependency

2025-04-08 Thread Geert Uytterhoeven
Hi Arnd, CC Gerg On Mon, 7 Apr 2025 at 12:40, Arnd Bergmann wrote: > From: Arnd Bergmann > > It turns out that there are no platforms that have PCI but don't have an MMU, > so adding a Kconfig dependency on CONFIG_PCI simplifies build testing kernels > for those platforms a lot, and avoids a lo

[PATCH 1/3] drm/amdgpu/mes11: use the device value for enforce isolation

2025-04-08 Thread Alex Deucher
Use the local setting rather than the global parameter. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c index 06b51867c9aac.

[PATCH 3/3] drm/amdgpu: adjust enforce_isolation handling

2025-04-08 Thread Alex Deucher
Switch from a bool to an enum and allow more options for enforce isolation. There are now 3 modes of operation: - Disabled (0) - Enabled (serialization and cleaner shader) (1) - Enabled in legacy mode (no serialization or cleaner shader) (2) This provides better flexibility for more use cases. Si
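The three modes listed above could be modelled roughly as follows. This is only a sketch under assumptions: the enum and function names are hypothetical and are not the identifiers used in the amdgpu driver; only the numeric values 0, 1 and 2 and their meanings come from the snippet.

```c
#include <stdbool.h>

/* Hypothetical names; values follow the three modes in the snippet. */
enum example_isolation_mode {
    EXAMPLE_ISOLATION_DISABLED = 0,       /* no enforce isolation */
    EXAMPLE_ISOLATION_ENABLED = 1,        /* serialization and cleaner shader */
    EXAMPLE_ISOLATION_ENABLED_LEGACY = 2, /* no serialization or cleaner shader */
};

/* Map a raw integer setting to a mode; returns false for unknown values. */
static bool example_parse_isolation_mode(int raw, enum example_isolation_mode *out)
{
    if (raw < EXAMPLE_ISOLATION_DISABLED || raw > EXAMPLE_ISOLATION_ENABLED_LEGACY)
        return false;
    *out = (enum example_isolation_mode)raw;
    return true;
}

/* Only the non-legacy enabled mode uses serialization and the cleaner shader. */
static bool example_mode_uses_cleaner_shader(enum example_isolation_mode mode)
{
    return mode == EXAMPLE_ISOLATION_ENABLED;
}
```

Compared with a bool, the enum leaves room for the legacy mode without overloading the meaning of "true".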

[PATCH 2/3] drm/amdgpu/mes12: use the device value for enforce isolation

2025-04-08 Thread Alex Deucher
Use the local setting rather than the global parameter. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c index 8892858cfd9ae.

[PATCH 2/2] drm/amdgpu: cleanup amdgpu_vm_flush v4

2025-04-08 Thread Christian König
This reverts commit c2cc3648ba517a6c270500b5447d5a1efdad5936. Turned out that this has some negative consequences for some workloads. Instead check if the cleaner shader should run directly. While at it remove amdgpu_vm_need_pipeline_sync(), we also check again if the VMID has seen a GPU reset sin

Re: [PATCH v2] drm/amdgpu: Fix CPER error handling on VFs

2025-04-08 Thread Yi, Tony
[AMD Official Use Only - AMD Internal Distribution Only] Signed-off-by: Tony Yi From: Zhang, Hawking Sent: Wednesday, April 2, 2025 12:24 AM To: Skvortsov, Victor; amd-gfx@lists.freedesktop.org Cc: Luo, Zhigang; Zhou1, Tao; Zhao, Victor; Yi, Tony Subject: RE: [P

Re: [PATCH v1 1/2] drm/amdgpu: add pid and name of the process with userq manager

2025-04-08 Thread Christian König
Am 08.04.25 um 14:41 schrieb Sunil Khatri: > Add the pid and the process name of the process > with the userq manager which could be used in > debugging and understanding error messages better. That should be unnecessary. We already have that in the DRM file as well as the VM which is also update

Re: [PATCH 0/6] Introduce a generic function to get the CSB buffer

2025-04-08 Thread Rodrigo Siqueira
On 04/07, Alex Deucher wrote: > On Mon, Apr 7, 2025 at 4:15 PM Rodrigo Siqueira wrote: > > > > On 04/07, Alex Deucher wrote: > > > On Sun, Apr 6, 2025 at 7:07 PM Rodrigo Siqueira > > > wrote: > > > > > > > > This patchset was inspired and made on top of the below series: > > > > > > > > https://

Re: [PATCH] drm/amd/pm/smu11: Prevent division by zero

2025-04-08 Thread Alex Deucher
Oh, sorry, I've picked it up now. Thanks! Alex On Tue, Apr 8, 2025 at 4:16 AM Denis Arefev wrote: > > > --- > > drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c > > b/dr

[PATCH v1 1/2] drm/amdgpu: add pid and name of the process with userq manager

2025-04-08 Thread Sunil Khatri
Add the pid and the process name of the process with the userq manager which could be used in debugging and understanding error messages better. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 8 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 2 ++ 2 fil

[PATCH v1 2/2] drm/amdgpu: update the error logging for more information

2025-04-08 Thread Sunil Khatri
add process and pid information in the userqueue error logging to make it more useful in resolving the error by logs. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-08 Thread Christian König
Am 08.04.25 um 13:54 schrieb Fedor Pchelkin: > If user can request an arbitrary size value then we should use __GFP_NOWARN > and back on the allocator to return NULL in case it doesn't even try to > satisfy an enormous memory allocation request (in which case it yells in > the log without __GFP_NOW

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-08 Thread Christian König
Am 08.04.25 um 11:39 schrieb Fedor Pchelkin: > On Tue, 08. Apr 11:26, Christian König wrote: >> Am 08.04.25 um 11:17 schrieb Denis Arefev: >>> The user can set any value to the variable ‘bo_number’, via the ioctl >>> command DRM_IOCTL_AMDGPU_BO_LIST. This will affect the arithmetic >>> expression ‘

[PATCH 5.15.y] drm/amd/pm: Fix negative array index read

2025-04-08 Thread jianqi.ren.cn
From: Jesse Zhang [ Upstream commit c8c19ebf7c0b202a6a2d37a52ca112432723db5f ] Avoid using the negative values for clk_idex as an index into an array pptable->DpmDescriptor. V2: fix clk_index return check (Tim Huang) Signed-off-by: Jesse Zhang Reviewed-by: Tim Huang Signed-off-by: Alex Deuch

Re: [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-08 Thread Christian König
Am 08.04.25 um 11:17 schrieb Denis Arefev: > The user can set any value to the variable ‘bo_number’, via the ioctl > command DRM_IOCTL_AMDGPU_BO_LIST. This will affect the arithmetic > expression ‘in->bo_number * in->bo_info_size’, which is prone to > overflow. Add a valid value check. As far as I

[PATCH 5.10.y] drm/amd/display: Skip inactive planes within ModeSupportAndSystemConfiguration

2025-04-08 Thread jianqi.ren.cn
From: Hersen Wu [ Upstream commit a54f7e866cc73a4cb71b8b24bb568ba35c8969df ] [Why] Coverity reports Memory - illegal accesses. [How] Skip inactive planes. Reviewed-by: Alex Hung Acked-by: Tom Chung Signed-off-by: Hersen Wu Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher [get_pipe_id

Re: [RFC] PCI: add CONFIG_MMU dependency

2025-04-08 Thread Jeff Hugo
On 4/7/2025 4:38 AM, Arnd Bergmann wrote: From: Arnd Bergmann It turns out that there are no platforms that have PCI but don't have an MMU, so adding a Kconfig dependency on CONFIG_PCI simplifies build testing kernels for those platforms a lot, and avoids a lot of inadvertent build regressions.

[v4 6/7] drm/amd/amdgpu: Refactor SDMA v5.2 reset logic into stop_queue and restore_queue functions

2025-04-08 Thread jesse.zh...@amd.com
This patch refactors the SDMA v5.2 reset logic by splitting the `sdma_v5_2_reset_queue` function into two separate functions: `sdma_v5_2_stop_queue` and `sdma_v5_2_restore_queue`. This change aligns with the new SDMA reset mechanism, where the reset process is divided into stopping the queue, pe

[PATCH 5.15.y] drm/amd/display: Skip inactive planes within ModeSupportAndSystemConfiguration

2025-04-08 Thread jianqi.ren.cn
From: Hersen Wu [ Upstream commit a54f7e866cc73a4cb71b8b24bb568ba35c8969df ] [Why] Coverity reports Memory - illegal accesses. [How] Skip inactive planes. Reviewed-by: Alex Hung Acked-by: Tom Chung Signed-off-by: Hersen Wu Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher [get_pipe_id

[v4 7/7] drm/amd/amdgpu: Remove deprecated SDMA reset callback mechanism

2025-04-08 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This patch removes the deprecated SDMA reset callback mechanism, which was previously used to register pre-reset and post-reset callbacks for SDMA engine resets. The callback mechanism has been replaced with a more direct and efficient approach using `stop_queue` a

[v4 4/7] drm/amd/amdgpu: Refactor SDMA v5.0 reset logic into stop_queue and restore_queue function

2025-04-08 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This patch refactors the SDMA v5.0 reset logic by splitting the `sdma_v5_0_reset_queue` function into two separate functions: `sdma_v5_0_stop_queue` and `sdma_v5_0_restore_queue`. This change aligns with the new SDMA reset mechanism, where the reset process is divi

[v4 3/7] drm/amdgpu: Optimize SDMA v5.0 queue reset and stop logic

2025-04-08 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This patch refactors the SDMA v5.0 queue reset and stop logic to improve code readability, maintainability, and performance. The key changes include: 1. **Generalized `sdma_v5_0_gfx_stop` Function**: - Added an `inst_mask` parameter to allow stopping specific SDMA

[v4 2/7] drm/amd/amdgpu: Implement SDMA soft reset directly for sdma v5

2025-04-08 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA soft resets directly, rather than relying on the DPM interface. 1. **New `amdgpu_sdma_soft_reset` Function**: - Implements a soft reset for SDMA engines by directly writing to the hardwa

[v4 1/7] drm/amd/amdgpu: Simplify SDMA reset mechanism by removing dynamic callbacks

2025-04-08 Thread jesse.zh...@amd.com
Since KFD no longer registers its own callbacks for SDMA resets, and only KGD uses the reset mechanism, we can simplify the SDMA reset flow by directly calling the ring's `stop_queue` and `start_queue` functions. This patch removes the dynamic callback mechanism and prepares for its eventual dep

[PATCH] drm/amdgpu: fix warning of drm_mm_clean

2025-04-08 Thread ZhenGuo Yin
Kernel doorbell BOs needs to be freed before ttm_fini. Fixes: 54c30d2a8def ("drm/amdgpu: create kernel doorbell pages") Signed-off-by: ZhenGuo Yin --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_

RE: [PATCH] drm/amdgpu: Reset RAS table if header is invalid

2025-04-08 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Tao Zhou > -Original Message- > From: Lazar, Lijo > Sent: Tuesday, April 8, 2025 2:14 PM > To: amd-gfx@lists.freedesktop.org; Lazar, Lijo > Cc: Zhang, Hawking ; Deucher, Alexander > ; Li, Candice ; Zhou1, Tao > > Su

[PATCH 6.6.y] drm/amd/display: Check link_index before accessing dc->links[]

2025-04-08 Thread jianqi.ren.cn
From: Alex Hung [ Upstream commit 8aa2864044b9d13e95fe224f32e808afbf79ecdf ] [WHY & HOW] dc->links[] has max size of MAX_LINKS and NULL is return when trying to access with out-of-bound index. This fixes 3 OVERRUN and 1 RESOURCE_LEAK issues reported by Coverity. Reviewed-by: Harry Wentland Ac

[PATCH 5.10.y] drm/amd/pm: Fix negative array index read

2025-04-08 Thread jianqi.ren.cn
From: Jesse Zhang [ Upstream commit c8c19ebf7c0b202a6a2d37a52ca112432723db5f ] Avoid using the negative values for clk_idex as an index into an array pptable->DpmDescriptor. V2: fix clk_index return check (Tim Huang) Signed-off-by: Jesse Zhang Reviewed-by: Tim Huang Signed-off-by: Alex Deuch

[PATCH v2] drm/amd/display: Add error check for avi and vendor infoframe setup function

2025-04-08 Thread Wentao Liang
The function fill_stream_properties_from_drm_display_mode() calls the function drm_hdmi_avi_infoframe_from_display_mode() and the function drm_hdmi_vendor_infoframe_from_display_mode(), but does not check its return value. Log the error messages to prevent silent failure if either function fails.

Re: [v3 5/7] drm/amdgpu: Optimize SDMA v5.2 queue reset and stop logic

2025-04-08 Thread Alex Deucher
On Wed, Apr 2, 2025 at 5:15 AM jesse.zh...@amd.com wrote: > > From: "jesse.zh...@amd.com" > > This patch refactors the SDMA v5.2 queue reset and stop logic to improve > code readability, maintainability, and performance. The key changes include: > > 1. **Generalized `sdma_v5_2_gfx_stop` Function*