RE: [PATCH] drm/amdkfd: Fix the deadlock in svm_range_restore_work

2025-02-09 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] From: Chen, Xiaogang Sent: Monday, February 10, 2025 10:18 AM To: Deng, Emily ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amdkfd: Fix the deadlock in svm_range_restore_work On 2/7/2025 9:02 PM, Deng, Emily wrote: [AMD Off

[PATCH 4/4] drm/amdgpu: Improve SDMA reset logic with guilty queue tracking

2025-02-09 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This commit introduces several improvements to the SDMA reset logic: 1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer before a reset, ensuring proper state restoration after reset. 2. Introduced `gfx_guilty` and `page_guilty` flags i

[PATCH 3/4] drm/amdgpu: Add common lock and reset caller parameter for SDMA reset synchronization

2025-02-09 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This commit introduces a caller parameter to the amdgpu_sdma_reset_instance function to differentiate between reset requests originating from the KGD and KFD. This change ensures proper synchronization between KGD and KFD during SDMA resets. If the caller is KFD, th

[PATCH 2/4] drm/amdgpu/sdma: Refactor SDMA reset functionality and add callback support

2025-02-09 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver to improve modularity and support shared usage between AMDGPU and KFD. The changes include: 1. **Refactored SDMA Reset Logic**: - Split the `sdma_v4_4_2_reset_queue` function into two sep

[PATCH 1/4] drm/amdgpu/kfd: Add shared SDMA reset functionality with callback support

2025-02-09 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This patch introduces shared SDMA reset functionality between AMDGPU and KFD. The implementation includes the following key changes: 1. Added `amdgpu_sdma_reset_queue`: - Resets a specific SDMA queue by instance ID. - Invokes registered pre-reset and post-reset

[PATCH v2 3/4] drm/amdgpu: Remove unsupported xgmi versions

2025-02-09 Thread Lijo Lazar
XGMI v4.8.0 is not used in any SOCs. Remove the associated functions. Also, ensure get_xgmi_info callback pointer is not NULL before calling the function. Signed-off-by: Lijo Lazar --- v2: Remove XGMI v4.8.0 as it is unused (Hawking) drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 2 +-

[PATCH v2 4/4] drm/amdgpu: Use xgmi APIs for init and bandwidth

2025-02-09 Thread Lijo Lazar
Initialize xgmi related static information during early_init. Use xgmi API to get max bandwidth details. Signed-off-by: Lijo Lazar --- v2: Move XGMI info init to early init phase (Jon) drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3

[PATCH v2 1/4] drm/amdgpu: Move xgmi definitions to xgmi header

2025-02-09 Thread Lijo Lazar
Move definitions related to xgmi to amdgpu_xgmi header Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 23 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 35 +--- 3 files changed, 34 i

[PATCH v2 2/4] drm/amdgpu: Add xgmi speed/width related info

2025-02-09 Thread Lijo Lazar
Add APIs to initialize XGMI speed and width details and to get the max bandwidth supported. It is assumed that a device only supports the same generation of XGMI links with uniform width. Signed-off-by: Lijo Lazar --- v2: Use GC versions as XGMI version is not populated for all SOCs (Hawking)

Re: [PATCH 3/4] drm/amdgpu: Initialize xgmi info during discovery

2025-02-09 Thread Lazar, Lijo
On 2/8/2025 3:15 AM, Kim, Jonathan wrote: > [Public] > > > I think part of the problem is that gmc.xgmi.supported has weird usage and definition. > > It partly says that it has potential to be supported by IP version, but doesn’t actually say anything about real support but assumed say

Re: [PATCH] drm/amdkfd: Fix the deadlock in svm_range_restore_work

2025-02-09 Thread Chen, Xiaogang
On 2/7/2025 9:02 PM, Deng, Emily wrote: [AMD Official Use Only - AMD Internal Distribution Only] Ping... Emily Deng Best Wishes -Original Message- From: Emily Deng Sent: Friday, February 7, 2025 6:28 PM To: amd-gfx@lists.

[PATCH] drm/amdgpu - Put the fence returned by amdgpu_gem_va_update_vm

2025-02-09 Thread YuanShang
The fence in amdgpu_gem_va_update_vm is not used after amdgpu_gem_update_bo_mapping. Signed-off-by: YuanShang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index

[PATCH] drm/amd/display: restore edid reading from a given i2c adapter

2025-02-09 Thread Melissa Wen
When switching to drm_edid, we slightly changed how to get edid by removing the possibility of getting them from dc_link when in aux transaction mode. As MST doesn't initialize the connector with `drm_connector_init_with_ddc()`, restore the original behavior to avoid functional changes. Fixes: 48e

Please Apply: Revert "drm/amd/display: Fix green screen issue after suspend"

2025-02-09 Thread Mingcong Bai
The display on Lenovo Xiaoxin Pro 13 2019 (Lenovo XiaoXinPro-13API 2019) briefly shows a garbled screen upon wakeup from S3 with kernel v6.13.2, but not with v6.14-rc1. I have bisected to 04d6273faed083e619fc39a738ab0372b6a4db20 ("Revert "drm/amd/display: Fix green screen issue after suspend"")

Re: Rework and fix queue reset for gfx7-gfx10

2025-02-09 Thread Timur Kristóf
Hi André, Sorry for the late reply - we've been discussing all of this on a different, long email thread. Alex & Christian - do you think it's OK to include André on that thread? André - in a nutshell, I was using a Mesa patch that intentionally breaks NGG culling: https://gitlab.freedesktop.org