[PATCH 1/3] Revert "drm/amdgpu: add debugfs amdgpu_reset_level"

2022-10-13 Thread Victor Zhao
This reverts commit 3ae992d5e1194a16e3d977076eb5722fa6e410d8. This commit breaks the reset logic for aldebaran, revert it for now. Will move the mask inside the reset handler. --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 2 -- drivers/gpu

[PATCH 3/3] drm/amdgpu: Refactor mode2 reset logic for v11.0.7

2022-10-13 Thread Victor Zhao
- refactor mode2 on v11.0.7 to align with aldebaran - comment out using mode2 reset as default for now, will introduce another controller to replace previous reset_level_mask Signed-off-by: Victor Zhao --- drivers/gpu/drm/amd/amdgpu/sienna_cichlid.c | 23 ++--- 1 file changed, 16

[PATCH 2/3] Revert "drm/amdgpu: let mode2 reset fallback to default when failure"

2022-10-13 Thread Victor Zhao
This reverts commit 3efc702897c54c95c332632157ab042e942512c7. This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with the original design of reset handler. Will redesign it. --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +--

Re: [PATCH V3 3/3] drm/amd/pm: disable cstate feature for gpu reset scenario

2022-10-13 Thread Lazar, Lijo
On 10/13/2022 10:56 AM, Quan, Evan wrote: [AMD Official Use Only - General] -Original Message- From: Lazar, Lijo Sent: Thursday, October 13, 2022 12:14 PM To: Quan, Evan ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Hawking Subject: Re: [PATCH V3 3/3] drm/amd/pm

Re: [PATCH 1/3] Revert "drm/amdgpu: add debugfs amdgpu_reset_level"

2022-10-13 Thread Lazar, Lijo
On 10/13/2022 1:57 PM, Victor Zhao wrote: This reverts commit 3ae992d5e1194a16e3d977076eb5722fa6e410d8. This commit breaks the reset logic for aldebaran, revert it for now. Will move the mask inside the reset handler. Thanks for reverting. Please cc me also for refactored patches for SMU 1

[PATCH] drm/amdgpu: dequeue mes scheduler during fini

2022-10-13 Thread YuBiao Wang
[Why] If mes is not dequeued during fini, it is left in an unclean state during reload and cannot receive some commands, which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jobs are done by mes and before kiq fini. Signed-off-by: YuBiao Wang --- drive

[PATCH v2] drm/amdgpu: dequeue mes scheduler during fini

2022-10-13 Thread YuBiao Wang
Resend to fix coding style issue. [Why] If mes is not dequeued during fini, it is left in an unclean state during reload and cannot receive some commands, which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jobs are done by mes and before kiq fini. Sig

Re: [PATCH v2] drm/amdgpu: dequeue mes scheduler during fini

2022-10-13 Thread Christian König
On 13.10.22 at 12:39, YuBiao Wang wrote: Resend to fix coding style issue. [Why] If mes is not dequeued during fini, it is left in an unclean state during reload and cannot receive some commands, which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jo

Re: [PATCH 1/3] Revert "drm/amdgpu: add debugfs amdgpu_reset_level"

2022-10-13 Thread Deucher, Alexander
[Public] This patch is missing your signed-off-by. Alex From: amd-gfx on behalf of Victor Zhao Sent: Thursday, October 13, 2022 4:27 AM To: amd-gfx@lists.freedesktop.org Cc: Grodzovsky, Andrey ; Lazar, Lijo ; Zhao, Victor Subject: [PATCH 1/3] Revert "drm/am

[regression][6.0] After commit b261509952bc19d1012cf732f853659be6ebc61e I see WARNING message at drivers/gpu/drm/drm_modeset_lock.c:276 drm_modeset_drop_locks+0x63/0x70

2022-10-13 Thread Mikhail Gavrilov
Hi! I bisected an issue in the 6.0 kernel which started happening after 6.0-rc7 on all my machines. The backtrace looks like this: [ 2807.339439] ------------[ cut here ]------------ [ 2807.339445] WARNING: CPU: 11 PID: 2061 at drivers/gpu/drm/drm_modeset_lock.c:276 drm_modeset_drop_locks

Re: [PATCH] drm/amd/pm: Init pm_attr_list when dpm is disabled

2022-10-13 Thread Alex Deucher
On Wed, Oct 12, 2022 at 5:36 AM ZhenGuo Yin wrote: > > [Why] > In SRIOV multi-vf, dpm is always disabled, and pm_attr_list won't > be initialized. There will be a NULL pointer call trace after > removing the dpm check condition in amdgpu_pm_sysfs_fini. > BUG: kernel NULL pointer dereference, addre

Re: [PATCH v3 0/6] Add support for atomic async page-flips

2022-10-13 Thread Simon Ser
> > > So no tests that actually verify that the kernel properly rejects > > > stuff like modesets, gamma LUT updates, plane movement, > > > etc.? > > > > Pondering this a bit more, it just occurred to me the current driver > > level checks might easily lead to confusing behaviour. Eg. is > >

[PATCH] drm/amdkfd: Cleanup kfd_dev struct

2022-10-13 Thread Alex Deucher
From: Mukul Joshi Cleanup kfd_dev struct by removing ddev and pdev as both drm_device and pci_dev can be fetched from amdgpu_device. Signed-off-by: Mukul Joshi Tested-by: Amber Lin Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +-

[PATCH] drm/amd/display: Increase frame size limit for display_mode_vba_util_32.o

2022-10-13 Thread Guenter Roeck
Building 32-bit images may fail with the following error. drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_util_32.c: In function ‘dml32_UseMinimumDCFCLK’: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_util_32.c:3142:1: error: the frame size

[PATCH] drm/radeon: Replace kmap() with kmap_local_page()

2022-10-13 Thread Fabio M. De Francesco
The use of kmap() is being deprecated in favor of kmap_local_page(). There are two main problems with kmap(): (1) It comes with an overhead as the mapping space is restricted and protected by a global lock for synchronization and (2) it also requires global TLB invalidation when the kmap’s pool wr

Re: [1/2] drm/amd/pm: update SMU IP v13.0.4 driver interface version

2022-10-13 Thread Limonciello, Mario
On 10/13/2022 00:46, Tim Huang wrote: Update the SMU driver interface version to V7. Signed-off-by: Tim Huang --- .../swsmu/inc/pmfw_if/smu13_driver_if_v13_0_4.h | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/sm

Re: [2/2] drm/amd/pm: add SMU IP v13.0.4 IF version define to V7

2022-10-13 Thread Limonciello, Mario
On 10/13/2022 00:46, Tim Huang wrote: The pmfw has changed the driver interface version, so keep same with the fw. Signed-off-by: Tim Huang Cc: sta...@vger.kernel.org #6.0 Reviewed-by: Mario Limonciello --- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h | 2 +- 1 file changed, 1 insertion(

[PATCH v3] drm/amdgpu: dequeue mes scheduler during fini

2022-10-13 Thread YuBiao Wang
Update: Remove redundant comments as Christian suggests. [Why] If mes is not dequeued during fini, it is left in an unclean state during reload and cannot receive some commands, which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jobs are done by mes an

[PATCH] drm/amdgpu: Revert "drm/amdgpu: getting fan speed pwm for vega10 properly"

2022-10-13 Thread Asher Song
This reverts commit fe01cb24b81c0091d7e5668874d51ce913e44a97. Unfortunately, that commit prevents the fan monitors from being read and written properly. Signed-off-by: Asher Song --- .../amd/pm/powerplay/hwmgr/vega10_thermal.c | 25 ++- 1 file changed, 13 insertions(+), 12 deletions(-

RE: [PATCH] drm/amdgpu: Revert "drm/amdgpu: getting fan speed pwm for vega10 properly"

2022-10-13 Thread Chen, Guchun
Reviewed-by: Guchun Chen Regards, Guchun -Original Message- From: Song, Asher Sent: Friday, October 14, 2022 12:15 PM To: Deucher, Alexander ; stalk...@gmail.com; Chen, Guchun ; Quan, Evan ; amd-gfx@lists.freedesktop.org Cc: Song, Asher Subject: [PATCH] drm/amdgpu: Revert "drm/amdgp

RE: [1/2] drm/amd/pm: update SMU IP v13.0.4 driver interface version

2022-10-13 Thread Huang, Tim
[Public] Hi Mario, Comments inline. Thanks. Best Regards, Tim Huang -Original Message- From: Limonciello, Mario Sent: Friday, October 14, 2022 5:35 AM To: Huang, Tim ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Yifan ; Du, Xiaojian ; Gong, Richard Subject: Re: [1/

[PATCH 1/8] drm/amd/pm: temporarily disable thermal alert on smu_v13_0_10

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] Temporarily disable thermal alert on smu_v13_0_10 due to a kfd test failure. Will enable it again after confirming the thermal hardware setting. Signed-off-by: Kenneth Feng Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 10 ++ 1

[PATCH 2/8] drm/amd/pm: remove the pptable id override on smu_v13_0_10

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] remove the pptable id override on smu_v13_0_10, and the id is fetched from vbios now. Signed-off-by: Kenneth Feng Reviewed-by: Likun Gao --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/a

[PATCH 3/8] drm/amd/amdgpu: enable gfx clock gating features on smu_v13_0_10

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] enable gfx clock gating features on smu_v13_0_10 Signed-off-by: Kenneth Feng Reviewed-by: Jack Gui --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 1 + drivers/gpu/drm/amd/amdgpu/soc21.c | 6 +- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git

[PATCH 4/8] drm/amd/pm: skip loading pptable from driver on secure board for smu_v13_0_10

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] skip loading pptable from driver on secure board since it's loaded from psp. Signed-off-by: Kenneth Feng Reviewed-by: Guan Yu --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/d

[PATCH 5/8] drm/amdgpu: skip mes self test for gc 11.0.3

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] Temporarily disable mes self test for gc 11.0.3. Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drive

[PATCH 6/8] drm/amdgpu: Enable gmc soft reset on gmc_v11_0_3

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] Enable gmc soft reset on gmc_v11_0_3. Signed-off-by: YiPeng Chai Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/soc21.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c inde

[PATCH 7/8] drm/amdgpu: Enable ras support for mp0 v13_0_0 and v13_0_10

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] V1: Enable ras support for CHIP_IP_DISCOVERY asic type. V2: 1. Change commit comment. 2. Enable ras support for mp0 v13_0_0 and v13_0_10. Signed-off-by: YiPeng Chai Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 10 ++ 1 file

[PATCH 8/8] drm/amdgpu: Add sriov vf ras support in amdgpu_ras_asic_supported

2022-10-13 Thread Gao, Likun
[AMD Official Use Only - General] V2: Add sriov vf ras support in amdgpu_ras_asic_supported. Signed-off-by: YiPeng Chai Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/

[PATCH] drm/amdgpu: move convert_error_address out of umc_ras

2022-10-13 Thread Hawking Zhang
RAS error address translation algorithm is common across dGPU and A + A platform as long as the SOC integrates the same generation of UMC IP. UMC RAS is managed by x86 MCA on A + A platform, umc_ras in GPU driver is not initialized at all on A + A platform. In such case, any umc_ras callback impl

RE: [PATCH] drm/amdgpu: move convert_error_address out of umc_ras

2022-10-13 Thread Yang, Stanley
[AMD Official Use Only - General] Reviewed-by: Stanley.Yang Regards, Stanley > -Original Message- > From: amd-gfx On Behalf Of > Hawking Zhang > Sent: Friday, October 14, 2022 2:19 PM > To: amd-gfx@lists.freedesktop.org; Zhou1, Tao ; > Yang, Stanley > Cc: Russell, Kent ; Zhang, Hawking

RE: [PATCH] drm/amdgpu: move convert_error_address out of umc_ras

2022-10-13 Thread Chen, Guchun
default: + dev_warn(adev->dev, +"UMC address to Physical address translation is not supported\n"); + return NOTIFY_DONE; Before returning, maybe it's necessary to free err_data.err_addr? Regards, Guchun -Original Message- From: amd-gfx