[PATCH v3 2/4] drm/amdgpu: Add xgmi speed/width related info

2025-02-24 Thread Lijo Lazar
Add APIs to initialize XGMI speed, width details and get the max bandwidth supported. It is assumed that a device only supports the same generation of XGMI links with uniform width. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- v2: Use GC versions as XGMI version is not populated f

[PATCH v3 3/4] drm/amdgpu: Remove unsupported xgmi versions

2025-02-24 Thread Lijo Lazar
XGMI v4.8.0 is not used in any SOCs. Remove the associated functions. Also, ensure get_xgmi_info callback pointer is not NULL before calling the function. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- v2: Remove XGMI v4.8.0 as it is unused (Hawking) drivers/gpu/drm/amd/amdgp

[PATCH v3 4/4] drm/amdgpu: Calculate IP specific xgmi bandwidth

2025-02-24 Thread Lijo Lazar
Use IP version specific xgmi speed/width for bandwidth calculation. Signed-off-by: Lijo Lazar --- v2: Move XGMI info init to early init phase (Jon) v3: Rebase on top of drm/amdgpu: simplify xgmi peer info calls drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ drivers/gpu/drm/amd

[PATCH v3 1/4] drm/amdgpu: Move xgmi definitions to xgmi header

2025-02-24 Thread Lijo Lazar
Move definitions related to xgmi to amdgpu_xgmi header Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 23 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 35 +--

[PATCH 27/27] drm/amd/display: Promote DAL to 3.2.323

2025-02-24 Thread Wayne Lin
From: Taimur Hassan This version brings along following fixes: - Various cleanups to amdgpu dm - Add DP tunneling IRQ handler - Fix display corruption for dcn35 - Fix dmcub reset problem - Adjust BW determination for PCON - DIO encoder refactor - Fix performance with SubVP under gaming Acked-by:

[PATCH 26/27] drm/amd/display: Use drm_err() for handle_hpd_irq_helper()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello drm_err() will show which device has the error. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/a

[PATCH 21/27] drm/amd/display: Change amdgpu_dm_irq_resume_*() to void

2025-02-24 Thread Wayne Lin
From: Mario Limonciello amdgpu_dm_irq_resume_early() and amdgpu_dm_irq_resume_late() don't have any error flows. Change the return type from integer to void. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_i

[PATCH 25/27] drm/amd/display: Use scoped guards for handle_hpd_irq_helper()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello Scoped guards will release the mutex when they go out of scope. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) d
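
The scoped-guard idea in this patch can be illustrated outside the kernel. What follows is a userspace analogy, not the amdgpu code: it assumes GCC or Clang for `__attribute__((cleanup))`, the compiler feature the kernel's guard helpers are built on, and it uses a hypothetical `toy_lock` and `TOY_GUARD` macro in place of a real mutex and the kernel macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex; it only records whether it is held. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);   /* catch a double acquire in this toy model */
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
}

/* Cleanup handler: the compiler passes the address of the guard variable. */
static void toy_guard_exit(struct toy_lock **guard)
{
    if (*guard)
        toy_lock_release(*guard);
}

/* Declare a guard variable that releases the lock when it leaves scope. */
#define TOY_GUARD(name, lockp) \
    struct toy_lock *name __attribute__((cleanup(toy_guard_exit))) = (lockp); \
    toy_lock_acquire(name)

/* Both return paths drop the lock without an explicit unlock call. */
static int do_work(struct toy_lock *l, int fail)
{
    TOY_GUARD(g, l);

    if (fail)
        return -1;  /* cleanup handler runs here */
    return 0;       /* and here */
}
```

The point of the pattern is that early returns no longer need a matching unlock, which is why such conversions usually delete more lines than they add.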

[PATCH 24/27] drm/amd/display: Use _free() macro for amdgpu_dm_update_connector_after_detect()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello By using a _free() macro multiple duplicated snippets of code to free the sink can be dropped. The sink will be released when leaving scope. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdg

[PATCH 23/27] drm/amd/display: Use scoped guard for amdgpu_dm_update_connector_after_detect()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello A scoped guard will release the mutex when it goes out of scope. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff

[PATCH 22/27] drm/amd/display: Use _free(kfree) for dm_gpureset_commit_state()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello Using a _free(kfree) macro drops the need for a goto statement as it will be freed when it goes out of scope. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 7 ++- 1 file cha
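
A rough userspace analogy of the `_free(kfree)` pattern, assuming GCC or Clang and using `malloc`/`free` in place of the kernel allocator. The names `auto_free`, `AUTO_FREE`, `copy_len` and the `free_calls` counter are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int free_calls;  /* counts releases, for demonstration only */

/* Cleanup handler: receives the address of the annotated pointer variable. */
static void auto_free(void *p)
{
    void *mem = *(void **)p;

    if (mem) {
        free(mem);
        free_calls++;
    }
}

#define AUTO_FREE __attribute__((cleanup(auto_free)))

/* Every return path frees buf; no goto-based cleanup label is needed. */
static int copy_len(const char *src, int fail)
{
    assert(src != NULL);

    AUTO_FREE char *buf = malloc(strlen(src) + 1);

    if (!buf)
        return -1;
    if (fail)
        return -2;              /* buf is freed on the way out */
    strcpy(buf, src);
    return (int)strlen(buf);    /* value is computed before buf is freed */
}
```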

[PATCH 18/27] drm/amd/display: Use drm_err() instead of DRM_ERROR in dm_resume()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello drm_err() is helpful to show which device had the error. Adjust to using this instead for error messages. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++-- 1 file changed, 2

[PATCH 20/27] drm/amd/display: Change amdgpu_dm_irq_resume_*() to use drm_dbg()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello drm_dbg() is helpful to show which device had the debug statement. Adjust to using this instead for debug messages. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c | 4 ++-- 1 f

[PATCH 19/27] drm/amd/display: Use scoped guard for dm_resume()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello Scoped guards will release the mutex when they go out of scope. Adjust the code to use these instead. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 6 ++ 1 file changed, 2 i

[PATCH 17/27] drm/amd/display: Use _free() macro for amdgpu_dm_commit_zero_streams()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello All cases except a failure to create a copy of the current context will call dc_state_release() on the copied context. Use a _free() macro to free the context and then adjust the error handling flow to drop the unnecessary use of goto statements. Reviewed-by: Alex Hung

[PATCH 16/27] drm/amd/display: Catch failures for amdgpu_dm_commit_zero_streams()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello amdgpu_dm_commit_zero_streams() returns a DC error code that isn't checked. Add an explicit check to this and fail dm_suspend() if it is not DC_OK. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_

[PATCH 15/27] drm/amd/display: Drop `ret` variable from dm_suspend()

2025-02-24 Thread Wayne Lin
From: Mario Limonciello The `ret` variable in dm_suspend() doesn't get set and is just used to return 0. Drop the needless declaration. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +-- 1 file chan

[PATCH 14/27] drm/amd/display: Change amdgpu_dm_irq_suspend() to void

2025-02-24 Thread Wayne Lin
From: Mario Limonciello amdgpu_dm_irq_suspend() doesn't have any error flows and always returns zero. Change the function to void. Reviewed-by: Alex Hung Signed-off-by: Mario Limonciello Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c | 3 +-- drivers/gpu/

[PATCH 10/27] drm/amd/display: Ensure DMCUB idle before reset on DCN31/DCN35

2025-02-24 Thread Wayne Lin
From: Nicholas Kazlauskas [Why] If we soft reset before halt finishes and there are outstanding memory transactions then the memory interface may produce unexpected results, such as out of order transactions when the firmware next runs. These can manifest as random or unexpected load/store viola

[PATCH 13/27] drm/amd/display: Add tunneling IRQ handler

2025-02-24 Thread Wayne Lin
From: Cruise Hung USB4 DP BW Allocation uses DP_TUNNELING_IRQ to indicate the status update. The DP_TUNNELING_IRQ is defined in LINK_SERVICE_IRQ_VECTOR_ESI0. When receiving DP HPD IRQ in USB4, read the LINK_SERVICE_IRQ_VECTOR_ESI0. Reviewed-by: Wenjing Liu Signed-off-by: Cruise Hung Signed-off

[PATCH 12/27] drm/amd/display: wait for outstanding hw updates

2025-02-24 Thread Wayne Lin
From: Ausef Yousof [why&how] seeing display corruption as a result of not waiting for certain values to latch and attempting otg locking/programming before waiting for them, there is code in place for this but dcn35 does not initialize these functions. Cc: Mario Limonciello Cc: Alex Deucher Cc

[PATCH 11/27] drm/amd/display: Added visual confirm for DCC

2025-02-24 Thread Wayne Lin
From: Leo Zeng [WHY] We want to add a visual confirm mode for DCC and MCache for debugging purpose. [HOW] color pipes based on whether DCC is enabled and what MCache id is used. black - DCC disabled red - DCC enabled grey - 2 different MCaches used other colors - 1 MCache used Reviewed-by: Dill

[PATCH 09/27] drm/amd/display: Revert "Increase halt timeout for DMCUB to 1s"

2025-02-24 Thread Wayne Lin
From: Nicholas Kazlauskas This reverts commit 7f4d49ac3944 ("drm/amd/display: Increase halt timeout for DMCUB to 1s") There's two issues here: 1. Each poll is closer to 10us than 1us so it stalls for 15s on PNP. 2. We're reading the wrong scratch register to check for the HALT code. Reviewed-by

[PATCH 08/27] drm/amd/display: Check NULL connector before it is used

2025-02-24 Thread Wayne Lin
From: Alex Hung [Why & How] amdgpu_dm_find_first_crtc_matching_connector can return NULL. It is necessary to check the returned connector before passing it to drm_atomic_get_new_connector_state, which always assumes the connector is not NULL. Reviewed-by: Roman Li Signed-off-by: Alex Hung Signed-off-by: Way

[PATCH 07/27] drm/amd/display: Remove unused struct definition

2025-02-24 Thread Wayne Lin
From: George Shen [Why/How] The struct is not and will not be used, as it is no longer relevant nor supported. Reviewed-by: Wenjing Liu Signed-off-by: George Shen Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/dc_dp_types.h | 8 1 file changed, 8 deletions(-) diff --gi

[PATCH 06/27] drm/amd/display: Skip checking FRL_MODE bit for PCON BW determination

2025-02-24 Thread Wayne Lin
From: George Shen [Why/How] Certain PCON will clear the FRL_MODE bit despite supporting the link BW indicated in the other bits. Thus, skip checking the FRL_MODE bit when interpreting the hdmi_encoded_link_bw struct. Reviewed-by: Wenjing Liu Signed-off-by: George Shen Signed-off-by: Wayne Lin

[PATCH 05/27] drm/amd/display: misc for dio encoder refactor

2025-02-24 Thread Wayne Lin
From: Peichen Huang [WHY] These are the remaining changes required for the dio encoder refactor. [HOW] 1. original logic is separated by config option 2. new link encoder dp enable/disable code for dcn35 3. process fec only for DP 8b10b encoding Reviewed-by: Cruise Hung Signed-off-by: Peichen Huang Signed

[PATCH 01/27] drm/amd/display: Request HW cursor on DCN3.2 with SubVP

2025-02-24 Thread Wayne Lin
From: Aric Cyr [why] When SubVP is active the HW cursor size is limited to 64x64, and anything larger will force composition which is bad for gaming on DCN3.2 if the game uses a larger cursor. [how] If HW cursor is requested, typically by a fullscreen game, do not enable SubVP so that up to 256x

[PATCH 04/27] drm/amd/display: read mso dpcd caps

2025-02-24 Thread Wayne Lin
From: Hansen Dsouza [Why & How] Read whether the panel supports multi-sst links Reviewed-by: Charlene Liu Signed-off-by: Hansen Dsouza Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/dc/dc_dp_types.h | 2 ++ .../display/dc/link/protocols/link_dp_capability.c| 11 +++ 2 f

[PATCH 03/27] drm/amd/display: Fix DMUB reset sequence for DCN401

2025-02-24 Thread Wayne Lin
From: Dillon Varone [WHY] It should no longer use DMCUB_SOFT_RESET as it can result in the memory request path becoming desynchronized. [HOW] To ensure robustness in the reset sequence: 1) Extend timeout on the "halt" command sent via gpint, and check for controller to enter "wait" as a stronger

[PATCH 02/27] drm/amd/display: Fix p-state type when p-state is unsupported

2025-02-24 Thread Wayne Lin
From: Dillon Varone [WHY&HOW] P-state type would remain on previously used when unsupported which causes confusion in logging and visual confirm, so set back to zero when unsupported. Reviewed-by: Aric Cyr Signed-off-by: Dillon Varone Signed-off-by: Wayne Lin --- drivers/gpu/drm/amd/display/

[PATCH 00/27] DC Patches Feb. 25, 2025

2025-02-24 Thread Wayne Lin
This DC patchset brings improvements in multiple areas. In summary, we highlight: - Various cleanups to amdgpu dm - Add DP tunneling IRQ handler - Fix display corruption for dcn35 - Fix dmcub reset problem - Adjust BW determination for PCON - DIO encoder refactor - Fix performance with SubVP under

RE: [PATCH 3/3] drm/amdkfd: Skip update vmid in while update queue

2025-02-24 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Yes, I hit the page fault while doorbell_mode=1. The error log is as follows. [Tue Feb 25 00:12:10 2025 <0.02>] kfd_ioctl_create_event:844: amdgpu: Created event (id:0x0002) (kfd_ioctl_create_event) [Tue Feb 25 00:12:10 2025

RE: [PATCH 3/3] drm/amdkfd: Skip update vmid in while update queue

2025-02-24 Thread Joshi, Mukul
[AMD Official Use Only - AMD Internal Distribution Only] > -Original Message- > From: amd-gfx On Behalf Of Deng, > Emily > Sent: Monday, February 24, 2025 8:05 PM > To: Deng, Emily ; Kuehling, Felix > > Cc: amd-gfx@lists.freedesktop.org > Subject: RE: [PATCH 3/3] drm/amdkfd: Skip update

RE: [PATCH] drm/amdgpu: Set CPER enabled flag after ring initiailized

2025-02-24 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Tao Zhou > -Original Message- > From: Liu, Xiang(Dean) > Sent: Monday, February 24, 2025 11:02 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhang, Hawking ; Zhou1, Tao > ; Dong, Andy ; Liu, Xiang(Dean) > > Subject: [

Re: [PATCH v3] drm/amd/display/dc: Refactor remove duplications on command_table files

2025-02-24 Thread Luan Icaro Pinto Arcanjo
Hi, Sorry for the late reply, I have submitted a v4 with the changes to put these functions in the command_table_helper.c rather than adding a new set of files. The mail title of v4 is "[PATCH v4] drm/amd/display/dc: Refactor remove duplications" @alexander.deuc...@amd.com Can you take a look?

[PATCH v4] drm/amd/display/dc: Refactor remove duplications

2025-02-24 Thread Luan Icaro Pinto Arcanjo
From: Luan Arcanjo All dce command_table_helper files share a collection of copy-pasted functions, which are: phy_id_to_atom, clock_source_id_to_atom_phy_clk_src_id, and engine_bp_to_atom. This patch removes the multiple copies by moving them to the command_table_helper.c and make

RE: [PATCH] drm/amdkfd: Correct the postion of reserve and unreserve memory

2025-02-24 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Yes. Emily Deng Best Wishes >-Original Message- >From: amd-gfx On Behalf Of Chen, >Xiaogang >Sent: Monday, February 24, 2025 11:40 PM >To: amd-gfx@lists.freedesktop.org >Subject: Re: [PATCH] drm/amdkfd: Correct the postion of re

RE: [PATCH 3/3] drm/amdkfd: Skip update vmid in while update queue

2025-02-24 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Ping.. >-Original Message- >From: amd-gfx On Behalf Of Deng, Emily >Sent: Monday, February 24, 2025 9:53 AM >To: Kuehling, Felix >Cc: amd-gfx@lists.freedesktop.org >Subject: RE: [PATCH 3/3] drm/amdkfd: Skip update vmid in whil

[PATCH] drm/amdkfd: remove kfd_pasid.c from amdgpu driver build

2025-02-24 Thread Xiaogang . Chen
From: Xiaogang Chen Since kfd now uses pasid values from the graphics driver, the kfd pasid functions are no longer needed. Signed-off-by: Xiaogang Chen --- drivers/gpu/drm/amd/amdkfd/Makefile| 1 - drivers/gpu/drm/amd/amdkfd/kfd_pasid.c | 46 -- 2 files changed, 47 deletions(-)

RE: [PATCH 2/5] drm/amd: Pass luminance data to amdgpu_dm_backlight_caps

2025-02-24 Thread Deucher, Alexander
[AMD Official Use Only - AMD Internal Distribution Only] > -Original Message- > From: amd-gfx On Behalf Of Mario > Limonciello > Sent: Friday, February 21, 2025 12:10 PM > To: amd-gfx @ lists . freedesktop . org ; Hung, > Alex > Cc: Wentland, Harry ; Limonciello, Mario > > Subject: [PAT

RE: [PATCH] drm/amdkfd: enable cooperative launch on gfx12

2025-02-24 Thread Kasiviswanathan, Harish
[Public] Reviewed-by: Harish Kasiviswanathan -Original Message- From: amd-gfx On Behalf Of Joshi, Mukul Sent: Monday, February 24, 2025 2:37 PM To: Kim, Jonathan ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH] drm/amdkfd: enable cooperative launch on gfx12 [Public] [Public] Ack

RE: [PATCH] drm/amdkfd: enable cooperative launch on gfx12

2025-02-24 Thread Joshi, Mukul
[Public] Acked-by: Mukul Joshi > -Original Message- > From: Kim, Jonathan > Sent: Friday, February 21, 2025 11:49 AM > To: amd-gfx@lists.freedesktop.org > Cc: Joshi, Mukul ; Kim, Jonathan > > Subject: [PATCH] drm/amdkfd: enable cooperative launch on gfx12 > > Even though GWS no longer

[PATCH] drm/amdgpu/vcn2.5: fix VCN stop logic

2025-02-24 Thread Alex Deucher
Need to make sure we call amdgpu_dpm_enable_vcn() in vcn_v2_5_stop() at the end if there are errors or DPG is enabled. Fixes: ebc25499de12 ("drm/amdgpu/vcn2.5: split code along instances") Suggested-by: Boyuan Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 15

Re: [V7 01/45] drm: Add helper for conversion from signed-magnitude

2025-02-24 Thread Alex Hung
On 2/24/25 09:07, Louis Chauvet wrote: Le 20/12/2024 à 05:33, Alex Hung a écrit : From: Harry Wentland CTM values are defined as signed-magnitude values. Add a helper that converts from CTM signed-magnitude fixed point value to the twos-complement value used by drm_fixed. Signed-off-by:

Re: [V7 07/45] drm/colorop: Add 1D Curve subtype

2025-02-24 Thread Louis Chauvet

Re: [V7 09/45] drm/colorop: Add BYPASS property

2025-02-24 Thread Louis Chauvet

Re: [V7 01/45] drm: Add helper for conversion from signed-magnitude

2025-02-24 Thread Louis Chauvet
Le 20/12/2024 à 05:33, Alex Hung a écrit : From: Harry Wentland CTM values are defined as signed-magnitude values. Add a helper that converts from CTM signed-magnitude fixed point value to the twos-complement value used by drm_fixed. Signed-off-by: Harry Wentland Reviewed-by: Louis Chauv
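
To make the conversion concrete, here is a minimal sketch of turning a signed-magnitude value into two's complement. It assumes a 64-bit value whose top bit is the sign and whose low 63 bits are the magnitude; the name `sm64_to_s64` is hypothetical and this is not the helper added by the patch, whose body is not shown in this snippet.

```c
#include <assert.h>
#include <stdint.h>

#define SM_SIGN_BIT (1ULL << 63)

/* Convert sign bit + 63-bit magnitude into a two's-complement int64_t. */
static int64_t sm64_to_s64(uint64_t sm)
{
    uint64_t magnitude = sm & ~SM_SIGN_BIT;

    /* The magnitude fits in 63 bits, so cast and negation cannot overflow. */
    assert(magnitude <= (uint64_t)INT64_MAX);

    return (sm & SM_SIGN_BIT) ? -(int64_t)magnitude : (int64_t)magnitude;
}
```

Note that signed-magnitude has two encodings of zero (with and without the sign bit); both map to the single two's-complement zero.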

Re: [V7 10/45] drm/colorop: Add NEXT property

2025-02-24 Thread Louis Chauvet

Re: [V7 03/45] drm/vkms: Add kunit tests for VKMS LUT handling

2025-02-24 Thread Louis Chauvet

Re: [V7 05/45] drm/colorop: Introduce new drm_colorop mode object

2025-02-24 Thread Louis Chauvet

Re: [V7 02/45] drm/vkms: Round fixp2int conversion in lerp_u16

2025-02-24 Thread Louis Chauvet

Re: [V7 06/45] drm/colorop: Add TYPE property

2025-02-24 Thread Louis Chauvet

RE: [PATCH 00/24] DC Patches FEBRUARY 18, 2025 V2

2025-02-24 Thread Wheeler, Daniel
[Public] Hi all, This week this patchset was tested on 4 systems, two dGPU and two APU based, and tested across multiple display and connection types. APU * Single Display eDP -> 1080p 60hz, 2560x1600 120hz, 1920x1200 165hz * Single Display DP (SST DSC) -> 4k144hz, 4k240hz

Re: [PATCH] drm/amdkfd: Correct the postion of reserve and unreserve memory

2025-02-24 Thread Chen, Xiaogang
Is it for fixing the issue you mentioned previously " Fix the deadlock in svm_range_restore_work"? Regards Xiaogang On 2/20/2025 5:59 AM, Emily Deng wrote: Call amdgpu_amdkfd_reserve_mem_limit in svm_range_vram_node_new when creating a new SVM BO. Call amdgpu_amdkfd_unreserve_mem_limit in

RE: [PATCH 2/2] drm/amdgpu/vcn: send session ctx along with msg buffer

2025-02-24 Thread Dong, Ruijing
[AMD Official Use Only - AMD Internal Distribution Only] This series is Reviewed-by: Ruijing Dong -Original Message- From: amd-gfx On Behalf Of boyuan.zh...@amd.com Sent: Friday, February 21, 2025 9:06 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Boyuan ; Yao, Yinjie Subject: [PAT

Re: [PATCH] drm/amdkfd: Correct the postion of reserve and unreserve memory

2025-02-24 Thread Philip Yang
On 2025-02-20 06:59, Emily Deng wrote: Call amdgpu_amdkfd_reserve_mem_limit in svm_range_vram_node_new when creating a new SVM BO. Call amdgpu_amdkfd_unreserve_mem_limit in svm_range_bo_release when the SVM BO is deleted. Signed-off-by: Emily Deng --- drivers/gpu/drm/amd/amdkfd/kfd_migrate.

[PATCH] drm/amdgpu: Set CPER enabled flag after ring initiailized

2025-02-24 Thread Xiang Liu
Setting cper.enabled to be true only after cper ring is successfully created. Signed-off-by: Xiang Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c b/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amd/display: Disable unneeded hpd interrupts during dm_init

2025-02-24 Thread sunpeng.li
From: Leo Li [Why] It seems HPD interrupts are enabled by default for all connectors, even if the hpd source isn't valid. An eDP for example, does not have a valid hpd source (but does have a valid hpdrx source; see construct_phy()). Thus, eDPs should have their hpd interrupt disabled. In the p

[PATCH] drm/amdgpu: update SDMA sysfs reset mask in late_init

2025-02-24 Thread jesse.zhang
From: "jesse.zh...@amd.com" - Added `sdma_v4_4_2_update_reset_mask` function to update the reset mask. - update the sysfs reset mask to the `late_init` stage to ensure that the SMU initialization and capability setup are completed before checking the SDMA reset capability. - For IP versio

Re: [PATCH] drm/amdgpu: Fix correct parameter desc for VCN idle check functions

2025-02-24 Thread Alex Deucher
On Mon, Feb 24, 2025 at 7:24 AM Srinivasan Shanmugam wrote: > > Fixes the kdoc for the following VCN idle check functions by updating > the parameter description from 'handle' to 'ip_block': > > - vcn_v4_0_is_idle > - vcn_v4_0_3_is_idle > - vcn_v4_0_5_is_idle > - vcn_v5_0_1_is_idle > > Fixes the b

[PATCH v5] drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'

2025-02-24 Thread Srinivasan Shanmugam
This commit addresses a circular locking dependency in the svm_range_cpu_invalidate_pagetables function. The function previously held a lock while determining whether to perform an unmap or eviction operation, which could lead to deadlocks. v2: To resolve this issue, the allocation of the process

Re: [drm:amdgpu_ring_test_helper] *ERROR* ring kiq_0.2.1.0 test failed (-110)

2025-02-24 Thread Alex Deucher
On Mon, Feb 24, 2025 at 8:51 AM Baruch Siach wrote: > > Hi amd-gfx list, > > I see this failure on probe when trying to bring up amdgpu on a new arm64 > platform. Kernel is v6.14-rc4, and aldebaran firmware is latest > (linux-firmware commit 4f47e84d06f9). > > Tested with these kernel command lin

RE: [PATCH 3/3] drm/amdkfd: Skip update vmid in while update queue

2025-02-24 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Hi Felix, Could you help review this? Thanks. Emily Deng Best Wishes >-Original Message- >From: Deng, Emily >Sent: Friday, February 21, 2025 9:44 AM >To: Deng, Emily ; amd-gfx@lists.freedesktop.org >Subject: RE: [PATCH 3/3]

Re: [PATCH] drm/amdgpu: update SDMA sysfs reset mask in late_init

2025-02-24 Thread Lazar, Lijo
On 2/24/2025 8:16 AM, jesse.zh...@amd.com wrote: > From: "jesse.zh...@amd.com" > > - Added `sdma_v4_4_2_update_reset_mask` function to update the reset mask. > - update the sysfs reset mask to the `late_init` stage to ensure that the SMU > initialization > and capability setup are compl

[PATCH v4] drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'

2025-02-24 Thread Srinivasan Shanmugam
This commit addresses a circular locking dependency in the svm_range_cpu_invalidate_pagetables function. The function previously held a lock while determining whether to perform an unmap or eviction operation, which could lead to deadlocks. v2: To resolve this issue, the allocation of the process

Re: [amdgpu] Kernel OOPS

2025-02-24 Thread Alex Deucher
On Mon, Feb 24, 2025 at 8:44 AM Jaap Aart wrote: > > Hello all! > > Before I spam this list with an unrelated bug report, would this be the > best place to report linux amdgpu kernel bugs/page faults? > I found this list on a very old reddit thread, don't want to end up > mailing the wrong place.

RE: [PATCH] drm/amdgpu: Check if CPER enabled when generating CPER

2025-02-24 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] The patch is Reviewed-by: Hawking Zhang Please make another change to set cper.enabled to be true *only* after cper ring is successfully created. Regards, Hawking -Original Message- From: Liu, Xiang(Dean) Sent: Monday, Februar

[drm:amdgpu_ring_test_helper] *ERROR* ring kiq_0.2.1.0 test failed (-110)

2025-02-24 Thread Baruch Siach
Hi amd-gfx list, I see this failure on probe when trying to bring up amdgpu on a new arm64 platform. Kernel is v6.14-rc4, and aldebaran firmware is latest (linux-firmware commit 4f47e84d06f9). Tested with these kernel command line parameters: amdgpu.vm_size=1 amdgpu.msi=1 amdgpu.gartsize=32 amd

Re: REGRESSION amdgpu 20241108 firmware breaks screen updating

2025-02-24 Thread mail
I can confirm amdgpu.dcdebugmask=0x200 fixes it, no more stutters. It would be very beneficial to include the patch making this behavior default (https://lore.kernel.org/amd-gfx/20250221160145.1730752-3-zaeem.moha...@amd.com/T/#u) in the relevant linux-stable trees, as this is a major issue fo

Re: [PATCH v6 0/6] drm/sched: Job queue peek/pop helpers and struct job re-order

2025-02-24 Thread Philipp Stanner
On Fri, 2025-02-21 at 10:50 +, Tvrtko Ursulin wrote: > Lets add some helpers for peeking and popping from the job queue > which allows us > to re-order the fields in struct drm_sched_job and remove one hole. > > As in the process we have added a header file for scheduler internal > prototypes,

[PATCH] drm/amd/display: Remove unused optc3_fpu_set_vrr_m_const

2025-02-24 Thread linux
From: "Dr. David Alan Gilbert" The last use of optc3_fpu_set_vrr_m_const() was removed in 2022's commit 64f991590ff4 ("drm/amd/display: Fix a compilation failure on PowerPC caused by FPU code") which removed the only caller (with a similar name). Remove it. Signed-off-by: Dr. David Alan Gilbert

[PATCH v3] drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'

2025-02-24 Thread Srinivasan Shanmugam
This commit addresses a circular locking dependency in the svm_range_cpu_invalidate_pagetables function. The function previously held a lock while determining whether to perform an unmap or eviction operation, which could lead to deadlocks. v2: To resolve this issue, the allocation of the process

RE: [PATCH] drm/amdgpu/job: fix is_guilty logic change (v2)

2025-02-24 Thread Zhang, Jesse(Jie)
[AMD Official Use Only - AMD Internal Distribution Only] This is Reviewed-by: Jesse Zhang -Original Message- From: Deucher, Alexander Sent: Friday, February 21, 2025 11:39 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Jesse(Jie) Subject: [PATCH] drm/amdgpu/job: f

Re: [PATCH 1/3] drm/amdgpu: Log the creation of a coredump file

2025-02-24 Thread Michel Dänzer
On 2025-02-19 22:35, André Almeida wrote: > After a GPU reset happens, the driver creates a coredump file. However, > the user might not be aware of it. Log the file creation so the user can > find more information about the device and add the file to bug reports. > This is similar to what the xe driv

Re: REGRESSION amdgpu 20241108 firmware breaks screen updating

2025-02-24 Thread Mario Limonciello
On 2/23/2025 20:56, m...@tteles.dev wrote: I can confirm amdgpu.dcdebugmask=0x200 fixes it, no more stutters. Thanks for confirming. It would be very beneficial to include the patch making this behavior default (https://lore.kernel.org/amd-gfx/20250221160145.1730752-3-zaeem.moha...@amd.com

Re: [PATCH v4 0/3] drm/amd/display: Stop control flow if the divisior is zero

2025-02-24 Thread Huacai Chen
On Tue, Jan 14, 2025 at 9:29 PM Tiezhu Yang wrote: > > As far as I can tell, with the current existing macro definitions, there > is no better way to do the minimal and proper changes to stop the control > flow if the divisior is zero. > > In order to keep the current ability for the aim of debugg

RE: [PATCH] drm/amdgpu/job: fix is_guilty logic change (v2)

2025-02-24 Thread Zhang, Jesse(Jie)
[AMD Official Use Only - AMD Internal Distribution Only] This is Reviewed-by: Jesse Zhang -Original Message- From: Deucher, Alexander Sent: Friday, February 21, 2025 11:39 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Jesse(Jie) Subject: [PATCH] drm/amdgpu/job: f

Re: [PATCH] drm/amdgpu: simplify xgmi peer info calls

2025-02-24 Thread Lazar, Lijo
On 2/21/2025 11:53 PM, Jonathan Kim wrote: > Deprecate KFD XGMI peer info calls in favour of calling directly from > simplified XGMI peer info functions. > > v2: generalize bandwidth interface to return range in one call > > Signed-off-by: Jonathan Kim > --- > drivers/gpu/drm/amd/amdgpu/amdg

Re: [PATCH 2/2] drm/amdgpu/userq: fix hardcoded uq functions

2025-02-24 Thread Saleemkhan jamadar
Hi Alex, Change looks good to me. Reviewed-by: Saleemkhan Jamadar Regards, Saleem On 2/21/2025 8:20 PM, Alex Deucher wrote: Use the IP type to look up the userq functions rather than hardcoding it. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 12 ++-

RE: [PATCH] drm/amdgpu/job: fix is_guilty logic change (v2)

2025-02-24 Thread Zhang, Jesse(Jie)
[AMD Official Use Only - AMD Internal Distribution Only] This is Reviewed-by: Jesse Zhang -Original Message- From: Deucher, Alexander Sent: Friday, February 21, 2025 11:39 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Zhang, Jesse(Jie) Subject: [PATCH] drm/amdgpu/job: fi

Re: [PATCH] drm/amd/pm: Get metrics table version for smu_v13_0_12

2025-02-24 Thread Lazar, Lijo
On 2/22/2025 10:43 PM, Asad Kamal wrote: > Get metrics table version for smu_v13_0_12 and populate pm_metrics > > Signed-off-by: Asad Kamal Reviewed-by: Lijo Lazar Thanks, Lijo > --- > drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c | 8 > 1 file changed, 8 insertions(+) >

[PATCH] drm/amd/display: fix type mismatch in CalculateDynamicMetadataParameters()

2025-02-24 Thread Vitaliy Shevtsov
There is a type mismatch between what CalculateDynamicMetadataParameters() takes and what is passed to it. Currently this function accepts several args as signed long but it's called with unsigned integers. On some systems where long is 32 bits and one of these input params is greater than INT_MAX

Wrong LTR-related check in nbif_v6_3_1_program_ltr()

2025-02-24 Thread Heiner Kallweit
In nbif_v6_3_1_program_ltr() (and maybe other functions as well) you have the following: pcie_capability_read_word(adev->pdev, PCI_EXP_DEVCTL2, &devctl2); if (adev->pdev->ltr_path == (devctl2 & PCI_EXP_DEVCTL2_LTR_EN)) return; if (adev->pdev->ltr_path) pcie_capability_set_word(ad
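
As general background for the comparison quoted above (the message is truncated here, so this is not a restatement of its argument): comparing a one-bit flag directly against a masked register value only works when the mask is bit 0. The sketch below uses an illustrative mask `DEMO_LTR_EN` and hypothetical helper names; it does not touch any PCI API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative mask: a single bit that is not bit 0. */
#define DEMO_LTR_EN 0x0400u

/* Compares a 0/1 flag against 0 or 0x0400; can never match when both are set. */
static bool naive_same(unsigned int flag, uint16_t reg)
{
    assert(flag <= 1);
    return flag == (reg & DEMO_LTR_EN);
}

/* Normalizes both sides to 0/1 before comparing. */
static bool normalized_same(unsigned int flag, uint16_t reg)
{
    assert(flag <= 1);
    return !!flag == !!(reg & DEMO_LTR_EN);
}
```

With `flag = 1` and the bit set in `reg`, the naive form reports a mismatch while the normalized form reports a match.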

[amdgpu] Kernel OOPS

2025-02-24 Thread Jaap Aart
Hello all! Before I spam this list with an unrelated bug report, would this be the best place to report linux amdgpu kernel bugs/page faults? I found this list on a very old reddit thread, don't want to end up mailing the wrong place. - Jaap

[PATCH] drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params

2025-02-24 Thread Ma Ke
Null pointer dereference issue could occur when pipe_ctx->plane_state is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not null before accessing. This prevents a null pointer dereference. Found by code review. Cc: sta...@vger.kernel.org Fixes: 3be5262e353b ("drm/amd/display: Ren

[PATCH 1/2] drm/amdgpu: Remove hole from struct amdgpu_ib

2025-02-24 Thread Tvrtko Ursulin
Group the 32- vs 64- members together to remove hole from the struct. Before: /* size: 40, cachelines: 1, members: 5 */ /* sum members: 32, holes: 1, sum holes: 4 */ /* padding: 4 */ /* last cacheline: 40 bytes */ After: /* size: 32, cachelines: 1, members
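
The effect of grouping members by width can be reproduced with a toy structure. This is a generic sketch, not the layout of struct amdgpu_ib: the struct and member names are hypothetical, and the exact sizes depend on the ABI (on common 64-bit ABIs where `uint64_t` is 8-byte aligned, the first struct is 24 bytes and the second 16).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit, 64-bit, 32-bit: alignment of b can force padding after a and c. */
struct with_hole {
    uint32_t a;
    uint64_t b;
    uint32_t c;
};

/* Same members, 64-bit first, then the two 32-bit members side by side. */
struct reordered {
    uint64_t b;
    uint32_t a;
    uint32_t c;
};

/* Number of bytes recovered by the reordering on the current ABI. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct reordered) <= sizeof(struct with_hole));
    return sizeof(struct with_hole) - sizeof(struct reordered);
}
```

The before/after comments quoted in the patch are in the style of `pahole` output, which reports holes and padding for real kernel structures.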

[PATCH 2/2] drm/amdgpu: Reduce holes in struct amdgpu_job

2025-02-24 Thread Tvrtko Ursulin
Let's group same width types closer together to reduce the number and size of the holes in the struct. Before: /* size: 480, cachelines: 8, members: 30 */ /* sum members: 469, holes: 3, sum holes: 11 */ /* forced alignments: 1 */ /* last cacheline: 32 bytes */ Afte

[PATCH 0/2] Fit one IB struct amdgpu_job into a 512 byte slab

2025-02-24 Thread Tvrtko Ursulin
A lot of the workloads create jobs with just one IB and if we re-order some struct members we can stop that allocation spilling into the 1k SLAB bucket. Before: sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib) = 480 + 40 = 520 After: sizeof(struct amdgpu_job) + sizeof(struct amdgpu_ib)

Re: [PATCH] drm/amd/amdgpu: Add missing parameter for amdgpu_sdma_register_on_reset_callbacks

2025-02-24 Thread Christian König
Am 24.02.25 um 12:45 schrieb Srinivasan Shanmugam: > This commit updates the documentation for the function > amdgpu_sdma_register_on_reset_callbacks to include a description > for the 'adev' parameter. > > The 'adev' parameter is a pointer to the amdgpu_device structure, > which is necessary for r

[PATCH] drm/amdgpu: Check if CPER enabled when generating CPER

2025-02-24 Thread Xiang Liu
In the case of CPER disabled, generating CPER will cause kernel NULL pointer dereference without checking. Signed-off-by: Xiang Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c | 3 +++ drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 5 +++-- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git

[PATCH] drm/amdgpu: Fix correct parameter desc for VCN idle check functions

2025-02-24 Thread Srinivasan Shanmugam
Fixes the kdoc for the following VCN idle check functions by updating the parameter description from 'handle' to 'ip_block': - vcn_v4_0_is_idle - vcn_v4_0_3_is_idle - vcn_v4_0_5_is_idle - vcn_v5_0_1_is_idle Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c:935: warning: Functi

[PATCH] drm/amd/amdgpu: Add missing parameter for amdgpu_sdma_register_on_reset_callbacks

2025-02-24 Thread Srinivasan Shanmugam
This commit updates the documentation for the function amdgpu_sdma_register_on_reset_callbacks to include a description for the 'adev' parameter. The 'adev' parameter is a pointer to the amdgpu_device structure, which is necessary for registering SDMA reset callbacks. Fixes the below with gcc W=1

[PATCH v2] drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'

2025-02-24 Thread Srinivasan Shanmugam
This commit addresses a circular locking dependency in the svm_range_cpu_invalidate_pagetables function. The function previously held a lock while determining whether to perform an unmap or eviction operation, which could lead to deadlocks. v2: To resolve this issue, the allocation of the process

[PATCH] drm/amdgpu: drm/amdgpu/job: fix is_guilty logic change (v2)

2025-02-24 Thread jesse.zh...@amd.com
From: "jesse.zh...@amd.com" This is Reviewed-by: Jesse Zhang Incrementing the gpu_reset counter needs to be in the is_guilty block. Also move the fence error before the reset to keep the original ordering. Fixes: f447ba2bbd48 ("drm/amdgpu: Update amdgpu_job_timedout to check if the ring is g