Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Sharma, Shashank
On 14/03/2024 06:58, Khatri, Sunil wrote: On 3/14/2024 2:06 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: Add firmware version information of each IP and each instance where applicable. Is there a way we can share some common code with devcoredump, debugfs, a

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Khatri, Sunil
On 3/14/2024 2:06 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: Add firmware version information of each IP and each instance where applicable. Is there a way we can share some common code with devcoredump, debugfs, and the info IOCTL? All three places need to

Re: [PATCH] Documentation: add a page on amdgpu debugging

2024-03-13 Thread Friedrich Vock
On 13.03.24 22:01, Alex Deucher wrote: Covers GPU page fault debugging and adds a reference to umr. Signed-off-by: Alex Deucher --- Documentation/gpu/amdgpu/debugging.rst | 79 ++ Documentation/gpu/amdgpu/index.rst | 1 + 2 files changed, 80 insertions(+) creat

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
On 3/14/2024 1:58 AM, Alex Deucher wrote: On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: Add all the IP's information on a SOC to the devcoredump. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 19 +++ 1 file changed, 19 insertions(+) di

Re: [PATCH 1/9] drm/amd/pm: Add support for DPM policies

2024-03-13 Thread Lazar, Lijo
This one is missing some NULL checks. Will send a v2. Thanks, Lijo On 3/13/2024 4:32 PM, Lijo Lazar wrote: > Add support to set/get information about different DPM policies. The > support is only available on SOCs which use swsmu architecture. > > A DPM policy type may be defined with different

Re: [PATCH] drm/amdgpu: Do a basic health check before reset

2024-03-13 Thread Lazar, Lijo
On 3/14/2024 1:19 AM, Felix Kuehling wrote: > > On 2024-03-13 5:41, Lijo Lazar wrote: >> Check if the device is present in the bus before trying to recover. It >> could be that device itself is lost from the bus in some hang >> situations. >> >> Signed-off-by: Lijo Lazar >> --- >>   drivers/gp

RE: [PATCH AUTOSEL 5.15 3/5] drm/amdgpu: Enable gpu reset for S3 abort cases on Raven series

2024-03-13 Thread Liang, Prike
[AMD Official Use Only - General] > From: Alex Deucher > Sent: Thursday, March 14, 2024 4:46 AM > To: Kuehling, Felix > Cc: Sasha Levin ; linux-ker...@vger.kernel.org; > sta...@vger.kernel.org; Liang, Prike ; Deucher, > Alexander ; Koenig, Christian > ; Pan, Xinhui ; > airl...@gmail.com; dan...@

Re: [PATCH] drm/amdgpu/vpe: power on vpe when hw_init

2024-03-13 Thread Lee, Peyton
[AMD Official Use Only - General] Hi Alex, There are two places where VPE will lose power: when there is a system call to vpe_hw_fini(), and when the vpe-thread finds that VPE has no jobs in the queue. This patch is to make sure that VPE is powered up before loading firmware. Thanks, Peyton -Original Message-

[PATCH] drm/sched: fix null-ptr-deref in init entity

2024-03-13 Thread vitaly.prosyak
From: Vitaly Prosyak The bug can be triggered by sending an amdgpu_cs_wait_ioctl to the AMDGPU DRM driver on any ASICs with valid context. The bug was reported by Joonkyo Jung . For example the following code: static void Syzkaller2(int fd) { union drm_amdgpu_ctx arg1; un

[PATCH] drm/scheduler: fix null-ptr-deref in init entity

2024-03-13 Thread vitaly.prosyak
From: Vitaly Prosyak The bug can be triggered by sending an amdgpu_cs_wait_ioctl to the AMDGPU DRM driver on any ASICs with valid context. The bug was reported by Joonkyo Jung . For example the following code: static void Syzkaller2(int fd) { union drm_amdgpu_ctx arg1; un

[PATCH] Documentation: add a page on amdgpu debugging

2024-03-13 Thread Alex Deucher
Covers GPU page fault debugging and adds a reference to umr. Signed-off-by: Alex Deucher --- Documentation/gpu/amdgpu/debugging.rst | 79 ++ Documentation/gpu/amdgpu/index.rst | 1 + 2 files changed, 80 insertions(+) create mode 100644 Documentation/gpu/amdgpu/debug

Re: [PATCH AUTOSEL 5.15 3/5] drm/amdgpu: Enable gpu reset for S3 abort cases on Raven series

2024-03-13 Thread Alex Deucher
On Wed, Mar 13, 2024 at 4:12 PM Felix Kuehling wrote: > > On 2024-03-11 11:14, Sasha Levin wrote: > > From: Prike Liang > > > > [ Upstream commit c671ec01311b4744b377f98b0b4c6d033fe569b3 ] > > > > Currently, GPU resets can now be performed successfully on the Raven > > series. While GPU reset is

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Alex Deucher
On Tue, Mar 12, 2024 at 8:42 AM Sunil Khatri wrote: > > Add firmware version information of each > IP and each instance where applicable. > Is there a way we can share some common code with devcoredump, debugfs, and the info IOCTL? All three places need to query this information and the same log

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Alex Deucher
On Tue, Mar 12, 2024 at 8:41 AM Sunil Khatri wrote: > > Add all the IP's information on a SOC to the > devcoredump. > > Signed-off-by: Sunil Khatri > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 19 +++ > 1 file changed, 19 insertions(+) > > diff --git a/drivers/gpu/drm/amd/

Re: [PATCH AUTOSEL 5.15 3/5] drm/amdgpu: Enable gpu reset for S3 abort cases on Raven series

2024-03-13 Thread Felix Kuehling
On 2024-03-11 11:14, Sasha Levin wrote: From: Prike Liang [ Upstream commit c671ec01311b4744b377f98b0b4c6d033fe569b3 ] Currently, GPU resets can now be performed successfully on the Raven series. While GPU reset is required for the S3 suspend abort case. So now can enable gpu reset for S3 abor

Re: [PATCH] drm/amdgpu: Do a basic health check before reset

2024-03-13 Thread Felix Kuehling
On 2024-03-13 5:41, Lijo Lazar wrote: Check if the device is present in the bus before trying to recover. It could be that device itself is lost from the bus in some hang situations. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++ 1 fil

Re: [PATCH 2/2] drm:amdgpu: add firmware information of all IP's

2024-03-13 Thread Khatri, Sunil
[AMD Official Use Only - General] Gentle reminder Regards Sunil Get Outlook for Android From: Sunil Khatri Sent: Tuesday, March 12, 2024 6:11:48 PM To: Deucher, Alexander ; Koenig, Christian ; Sharma, Shashank Cc: amd-gfx@lists.freedesk

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
[AMD Official Use Only - General] Gentle reminder for review. Regards Sunil Get Outlook for Android From: Sunil Khatri Sent: Tuesday, March 12, 2024 6:11:47 PM To: Deucher, Alexander ; Koenig, Christian ; Sharma, Shashank Cc: amd-gfx@li

Re: [PATCH 1/2] drm/amdgpu: add the IP information of the soc

2024-03-13 Thread Khatri, Sunil
[AMD Official Use Only - General] Gentle Reminder for review. Regards, Sunil Get Outlook for Android From: Sunil Khatri Sent: Tuesday, March 12, 2024 6:11:47 PM To: Deucher, Alexander ; Koenig, Christian ; Sharma, Shashank Cc: amd-gfx@l

Re: [PATCH] drm/amd/amdgpu: Enable IH Retry CAM by register read

2024-03-13 Thread Felix Kuehling
On 2024-03-13 13:43, Dewan Alam wrote: IH Retry CAM should be enabled by register reads instead of always being set to true. This explanation sounds odd. Your code is still writing the register first. What's the reason for reading back the register? I assume it's not needed for enabling the CA

[PATCH] drm/amd/amdgpu: Enable IH Retry CAM by register read

2024-03-13 Thread Dewan Alam
IH Retry CAM should be enabled by register reads instead of always being set to true. Signed-off-by: Dewan Alam --- drivers/gpu/drm/amd/amdgpu/vega20_ih.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/vega20_ih.c b/drivers/gpu/d

Re: [PATCH 33/43] drm/amd/display: Prevent crash on bring-up

2024-03-13 Thread Pillai, Aurabindo
[AMD Official Use Only - General] Might want to avoid bringup in the commit description -- Regards, Jay From: Wayne Lin Sent: Tuesday, March 12, 2024 5:20 AM To: amd-gfx@lists.freedesktop.org Cc: Wentland, Harry ; Li, Sun peng (Leo) ; Siqueira, Rodrigo ; Pilla

[PATCH] drm/amdkfd: range check cp bad op exception interrupts

2024-03-13 Thread Jonathan Kim
Due to a CP interrupt bug, bad packet garbage exception codes are raised. Do a range check so that the debugger and runtime do not receive garbage codes. Update the user api to guard exception code type checking as well. Signed-off-by: Jonathan Kim Tested-by: Jesse Zhang --- .../gpu/drm/amd/amd
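A range check of the kind this patch describes could be sketched as follows. The enum names and values below are hypothetical and are not taken from the patch or from the KFD user API; the sketch only shows how out-of-range codes from a misbehaving interrupt source might be filtered before they reach a debugger or runtime.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical exception codes; the real driver defines its own set. */
enum sketch_exception_code {
    SKETCH_EC_NONE = 0,
    SKETCH_EC_BAD_OP_FIRST = 30,   /* assumed first valid bad-op code */
    SKETCH_EC_BAD_OP_LAST = 35     /* assumed last valid bad-op code */
};

/* Return true only when the raw code lies inside the valid bad-op range. */
static bool sketch_bad_op_code_valid(uint32_t raw)
{
    return raw >= SKETCH_EC_BAD_OP_FIRST && raw <= SKETCH_EC_BAD_OP_LAST;
}

/* Map garbage codes to SKETCH_EC_NONE so consumers never see them. */
static uint32_t sketch_sanitize_bad_op_code(uint32_t raw)
{
    return sketch_bad_op_code_valid(raw) ? raw : SKETCH_EC_NONE;
}
```

Checking both bounds in one helper keeps the interrupt path and any user-facing type check consistent, which is the property the patch description asks for.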

Re: [PATCH] drm/amd/display: Get min/max vfreq from display_info

2024-03-13 Thread Hamza Mahfooz
On 3/12/24 09:47, Harry Wentland wrote: We need the min/max vfreq on the amdgpu_dm_connector in order to program VRR. Fixes: db3e4f1cbb84 ("drm/amd/display: Use freesync when `DRM_EDID_FEATURE_CONTINUOUS_FREQ` found") Signed-off-by: Harry Wentland Acked-by: Hamza Mahfooz --- drivers/gpu

RE: [PATCH] drm/amd/display: Get min/max vfreq from display_info

2024-03-13 Thread Wheeler, Daniel
[Public] Hi all, I can confirm that this re-enables VRR for a RX6800, and a RX7900XTX. Tested-by: Daniel Wheeler Thank you, Dan Wheeler Sr. Technologist | AMD SW Display -- 1 Comm

Re: [PATCH 1/2] drm/amdgpu/pm: Fix NULL pointer dereference when get power limit

2024-03-13 Thread Alex Deucher
On Wed, Mar 13, 2024 at 7:07 AM Ma Jun wrote: > > Because powerplay_table initialization is skipped under > sriov case, We check and set default lower and upper OD > value if powerplay_table is NULL. > > Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") > Signed-off-by: Ma Jun >

Re: [PATCH] drm/amdgpu/vpe: power on vpe when hw_init

2024-03-13 Thread Alex Deucher
On Wed, Mar 13, 2024 at 7:41 AM Peyton Lee wrote: > > To fix mode2 reset failure. > Should power on VPE when hw_init. When does it get powered down again? Won't this leave it powered up? Alex > > Signed-off-by: Peyton Lee > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c | 6 ++ > 1 file c

[PATCH] drm/amdgpu/vpe: power on vpe when hw_init

2024-03-13 Thread Peyton Lee
To fix mode2 reset failure. Should power on VPE when hw_init. Signed-off-by: Peyton Lee --- drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c index 70c5cc80ecdc..ecf

[PATCH 9/9] drm/amd/pm: Remove unused interface to set plpd

2024-03-13 Thread Lijo Lazar
Remove unused callback to set PLPD policy and its implementation from arcturus, aldebaran and SMUv13.0.6 SOCs. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 6 --- .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 22 --- .../drm

[PATCH 8/9] drm/amd/pm: Remove legacy interface for xgmi plpd

2024-03-13 Thread Lijo Lazar
Replace the legacy interface with amdgpu_dpm_set_pm_policy to set XGMI PLPD mode. Also, xgmi_plpd sysfs node is not used by any client. Remove that as well. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 4 +- drivers/gpu/drm/amd/pm/amd

[PATCH 6/9] drm/amd/pm: Add xgmi plpd to aldebaran pm_policy

2024-03-13 Thread Lijo Lazar
On aldebaran, allow changing xgmi plpd policy through pm_policy sysfs interface. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 35 +++ 1 file changed, 35 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/ald

[PATCH 7/9] drm/amd/pm: Add xgmi plpd to arcturus pm_policy

2024-03-13 Thread Lijo Lazar
On arcturus, allow changing xgmi plpd policy through pm_policy sysfs interface. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 7 ++-- .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 42 +++ 2 files changed, 46 insertion

[PATCH 4/9] drm/amd/pm: Add xgmi plpd policy to pm_policy

2024-03-13 Thread Lijo Lazar
Add support to set XGMI PLPD policy levels through pm_policy sysfs node. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/include/kgd_pp_interface.h | 1 + drivers/gpu/drm/amd/pm/amdgpu_pm.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/drivers/

[PATCH 5/9] drm/amd/pm: Add xgmi plpd to SMU v13.0.6 pm_policy

2024-03-13 Thread Lijo Lazar
On SOCs with SMU v13.0.6, allow changing xgmi plpd policy through pm_policy sysfs interface. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 15 +- .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 51 +-- drivers/gpu/dr

[PATCH 2/9] drm/amd/pm: Update PMFW messages for SMUv13.0.6

2024-03-13 Thread Lijo Lazar
Add PMFW message to select a Pstate policy in SOCs with SMU v13.0.6. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 3 ++- drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 ++- 2 files changed, 4 insertions(

[PATCH 3/9] drm/amd/pm: Add support to select pstate policy

2024-03-13 Thread Lijo Lazar
Add support to select pstate policy in SOCs with SMUv13.0.6 Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 + .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 71 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 30 +

[PATCH 1/9] drm/amd/pm: Add support for DPM policies

2024-03-13 Thread Lijo Lazar
Add support to set/get information about different DPM policies. The support is only available on SOCs which use swsmu architecture. A DPM policy type may be defined with different levels. For example, a policy may be defined to select Pstate preference and then later a pstate preference may be ch

[PATCH 0/9] Add PM policy interfaces

2024-03-13 Thread Lijo Lazar
This series adds APIs to get the supported PM policies and also set them. A PM policy type is a predefined policy type supported by an SOC and each policy may define two or more levels to choose from. A user can select the appropriate level through amdgpu_dpm_set_pm_policy() or through sysfs node p

[PATCH 2/2] drm/amdgpu/pm: Check the validity of overdiver power limit

2024-03-13 Thread Ma Jun
Check the validity of overdrive power limit before using it. Signed-off-by: Ma Jun Suggested-by: Lazar Lijo Suggested-by: Alex Deucher --- .../gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 11 + .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 9 .../amd/pm/swsmu/smu11/sienna_c

[PATCH 1/2] drm/amdgpu/pm: Fix NULL pointer dereference when get power limit

2024-03-13 Thread Ma Jun
Because powerplay_table initialization is skipped under sriov case, We check and set default lower and upper OD value if powerplay_table is NULL. Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun Reported-by: Yin Zhenguo Suggested-by: Lazar Lijo Suggested

RE: [PATCH] drm/amdgpu: Do a basic health check before reset

2024-03-13 Thread Kamal, Asad
[AMD Official Use Only - General] Reviewed-by: Asad Kamal Thanks & Regards Asad -Original Message- From: Lazar, Lijo Sent: Wednesday, March 13, 2024 3:12 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Kamal, Asad Subject: [PATCH] drm/amdgpu: Do a basic

[PATCH] drm/amdgpu: Do a basic health check before reset

2024-03-13 Thread Lijo Lazar
Check if the device is present in the bus before trying to recover. It could be that device itself is lost from the bus in some hang situations. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 24 ++ 1 file changed, 24 insertions(+) diff --git a/dr

[PATCH] drm/amdgpu: correct the KGQ fallback message

2024-03-13 Thread Prike Liang
Fix the KGQ fallback function name, as this will help differentiate the failure in the KCQ enablement. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu

Re: [PATCH] drm/amdgpu: cleanup unused variable

2024-03-13 Thread Christian König
On 12.03.24 at 16:31, Shashank Sharma wrote: This patch removes an unused input variable in the MES doorbell function. Cc: Christian König Cc: Alex Deucher Signed-off-by: Shashank Sharma Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 10 +++--- 1 file

[PATCH 3/3] drm/amdgpu: make reset method configurable for RAS poison

2024-03-13 Thread Tao Zhou
Each RAS block has different requirement for gpu reset in poison consumption handling. Add support for mmhub RAS poison consumption handling. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 2 +- drivers/gpu/drm/

[PATCH 2/3] drm/amdgpu: support utcl2 RAS poison query for mmhub

2024-03-13 Thread Tao Zhou
Support the query for both gfxhub and mmhub, also replace xcc_id with hub_inst. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 17 - drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +-- .

[PATCH 1/3] drm/amdgpu: add utcl2 RAS poison query for mmhub

2024-03-13 Thread Tao Zhou
Add it for mmhub v1.8. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.h | 2 ++ drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 15 +++ 2 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mmhub.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mmh

Re: [PATCH] drm/amdgpu/: Remove bo_create_kernel_at path from virt page

2024-03-13 Thread Christian König
On 12.03.24 at 18:50, Victor Skvortsov wrote: Use amdgpu_vram_mgr to reserve bad page ranges. Reserved ranges will be freed by amdgpu_vram_mgr_fini(). Delete bo_create path as it is redundant. Suggested-by: Christian König Signed-off-by: Victor Skvortsov Acked-by: Christian König --- dr

Re: [RFC PATCH v4 22/42] drm/vkms: Use s32 for internal color pipeline precision

2024-03-13 Thread Pekka Paalanen
On Mon, 26 Feb 2024 16:10:36 -0500 Harry Wentland wrote: > Certain operations require us to preserve values below 0.0 and > above 1.0 (0x0 and 0xffff respectively in 16 bpc unorm). One > such operation is a BT709 encoding operation followed by its > decoding operation, or the reverse. > > We'll
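To illustrate why a wider signed intermediate helps, here is a minimal sketch. It does not use the BT709 curves or any VKMS code; a plain offset stands in for an operation that pushes values outside the 16 bpc unorm range, and all function names are hypothetical.

```c
#include <stdint.h>

/* Clamp a signed intermediate into the 16 bpc unorm range [0x0, 0xffff]. */
static uint16_t sketch_clamp_u16(int32_t v)
{
    if (v < 0)
        return 0;
    if (v > 0xffff)
        return 0xffff;
    return (uint16_t)v;
}

/* Round trip with a u16 intermediate: the overshoot is clamped and lost. */
static uint16_t sketch_round_trip_u16(uint16_t in, int32_t offset)
{
    uint16_t mid = sketch_clamp_u16((int32_t)in + offset);
    return sketch_clamp_u16((int32_t)mid - offset);
}

/* Round trip with an s32 intermediate: out-of-range values survive. */
static uint16_t sketch_round_trip_s32(uint16_t in, int32_t offset)
{
    int32_t mid = (int32_t)in + offset;
    return sketch_clamp_u16(mid - offset);
}
```

With an input of 0xf000 and an offset of 0x2000, the u16 variant clamps the intermediate and cannot recover the input, while the s32 variant returns the original value because clamping happens only once, at the end.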

Re: [RFC PATCH v4 17/42] drm/vkms: Add enumerated 1D curve colorop

2024-03-13 Thread Pekka Paalanen
On Mon, 26 Feb 2024 16:10:31 -0500 Harry Wentland wrote: > This patch introduces a VKMS color pipeline that includes two > drm_colorops for named transfer functions. For now the only ones > supported are sRGB EOTF, sRGB Inverse EOTF, and a Linear TF. > We will expand this in the future but I don'

Re: [PATCH v2] drm/amdgpu: Clear the hotplug interrupt ack bit before hpd initialization

2024-03-13 Thread Qiang Ma
On Wed, 31 Jan 2024 15:57:03 +0800 Qiang Ma wrote: Hello everyone, please help review this patch. Qiang Ma > Problem: > The computer in the bios initialization process, unplug the HDMI > display, wait until the system up, plug in the HDMI display, did not > enter the hotplug interrupt functio

Re: [RFC PATCH v4 10/42] drm/colorop: Add TYPE property

2024-03-13 Thread Pekka Paalanen
On Mon, 26 Feb 2024 16:10:24 -0500 Harry Wentland wrote: > Add a read-only TYPE property. The TYPE specifies the colorop > type, such as enumerated curve, 1D LUT, CTM, 3D LUT, PWL LUT, > etc. > > v4: > - Use enum property for TYPE (Pekka) > > v3: > - Make TYPE a range property > - Move enum

Re: [RFC PATCH v4 10/42] drm/colorop: Add TYPE property

2024-03-13 Thread Pekka Paalanen
On Tue, 12 Mar 2024 15:15:13 +0000 Simon Ser wrote: > On Tuesday, March 12th, 2024 at 16:02, Pekka Paalanen > wrote: > > > This list here is the list of all values the enum could take, right? > > Should it not be just the one value it's going to have? It'll never > > change, and it can never b