RE: [PATCH 1/3] drm/amdgpu: Add gpu_coredump parameter

2024-08-15 Thread Huang, Trigger
[AMD Official Use Only - AMD Internal Distribution Only] > -Original Message- > From: amd-gfx On Behalf Of > Huang, Trigger > Sent: Friday, August 16, 2024 2:36 PM > To: Alex Deucher > Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil ; > Deucher, Alexander > Subject: RE: [PATCH 1/3] drm

RE: [PATCH 3/3] drm/amdgpu: Change the timing of doing coredump

2024-08-15 Thread Huang, Trigger
> -Original Message- > From: Alex Deucher > Sent: Friday, August 16, 2024 12:09 AM > To: Huang, Trigger > Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil ; > Deucher, Alexander > Subject: Re: [PATCH 3/3] drm/amdgpu: Change the ti

RE: [PATCH 1/3] drm/amdgpu: Add gpu_coredump parameter

2024-08-15 Thread Huang, Trigger
> -Original Message- > From: Alex Deucher > Sent: Friday, August 16, 2024 12:02 AM > To: Huang, Trigger > Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil ; > Deucher, Alexander > Subject: Re: [PATCH 1/3] drm/amdgpu: Add gpu_cored

[PATCH 13/13] drm/amd/display: Promote DC to 3.2.297

2024-08-15 Thread Roman.Li
From: Martin Leung - Various DML 2.1 fixes - Fix MST Regression - Fix module unload - Fix construct_phy with MXM connector - Support UHBR10 link rate on eDP - Revert updated DCCG wrappers Signed-off-by: Martin Leung Signed-off-by: Roman Li --- drivers/gpu/drm/amd/display/dc/dc.h | 2 +- 1 fil

[PATCH 12/13] drm/amd/display: Fix a typo in revert commit

2024-08-15 Thread Roman.Li
From: Fangzhi Zuo A typo is fixed for "drm/amd/display: Fix MST BW calculation Regression" Fixes: 4b6564cb120c ("drm/amd/display: Fix MST BW calculation Regression") Reviewed-by: Roman Li Signed-off-by: Fangzhi Zuo Signed-off-by: Roman Li --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_

[PATCH 11/13] drm/amd/display: DML2.1 Reintegration for Various Fixes

2024-08-15 Thread Roman.Li
From: Austin Zheng [Why and How] DML2.1 reintegration for several fixes and updates to the DML code. Reviewed-by: Dillon Varone Signed-off-by: Austin Zheng Signed-off-by: Roman Li fams2_required = display_cfg->stage3.fams2_required; dml2_core_calcs_get_global_fams2_programm

[PATCH 10/13] drm/amd/display: fix double free issue during amdgpu module unload

2024-08-15 Thread Roman.Li
From: Tim Huang Flexible endpoints use DIGs from available inflexible endpoints, so only the encoders of inflexible links need to be freed. Otherwise, a double free issue may occur when unloading the amdgpu module. [ 279.190523] RIP: 0010:__slab_free+0x152/0x2f0 [ 279.190577] Call Trace: [ 27

[PATCH 08/13] drm/amd/display: Fix construct_phy with MXM connector

2024-08-15 Thread Roman.Li
From: Ilya Bakoulin [Why/How] The call to construct_phy will fail in cases where connector type is MXM, and the dc_link won't be properly created/initialized. Reviewed-by: Wenjing Liu Signed-off-by: Ilya Bakoulin Signed-off-by: Roman Li --- drivers/gpu/drm/amd/display/dc/link/link_factory.c

[PATCH 02/13] drm/amd/display: Update HPO I/O When Handling Link Retrain Automation Request

2024-08-15 Thread Roman.Li
From: Michael Strauss [WHY] Previous multi-display HPO fix moved where HPO I/O enable/disable is performed. The codepath now taken to enable/disable HPO I/O is not used for compliance test automation, meaning that if a compliance box being driven at a DP1 rate requests retrain at UHBR, HPO I/O wi

[PATCH 04/13] drm/amd/display: Remove redundant check in DCN35 hwseq

2024-08-15 Thread Roman.Li
From: Nicholas Susanto Removing redundant condition. Reviewed-by: Hansen Dsouza Signed-off-by: Nicholas Susanto Signed-off-by: Roman Li --- drivers/gpu/drm/amd/display/dc/hwss/dcn35/dcn35_hwseq.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn35/

[PATCH 05/13] drm/amd/display: Allow UHBR Interop With eDP Supported Link Rates Table

2024-08-15 Thread Roman.Li
From: Michael Strauss [WHY] eDP 2.0 is introducing support for UHBR link rates, however current eDP ILR link optimization does not account for UHBR capabilities. Either UHBR capabilities will be provided via the same 128b/132b rate DPCD caps that are currently used on DP2.1, or Table 4-13 may be

[PATCH 09/13] drm/amd/display: DCN35 set min dispclk to 50Mhz

2024-08-15 Thread Roman.Li
From: Nicholas Susanto [Why] Causes hard hangs when resuming after display off on extended/duplicate modes [How] Set the min dispclk to 50Mhz for DCN35 Reviewed-by: Nicholas Kazlauskas Signed-off-by: Nicholas Susanto Signed-off-by: Roman Li --- drivers/gpu/drm/amd/display/dc/clk_mgr/dcn35

[PATCH 06/13] drm/amd/display: Hardware cursor changes color when switched to software cursor

2024-08-15 Thread Roman.Li
From: Nevenko Stupar [Why & How] DCN4 Cursor has separate degamma block and should always do Cursor degamma for Cursor color modes. Reviewed-by: Chris Park Signed-off-by: Nevenko Stupar Signed-off-by: Roman Li --- drivers/gpu/drm/amd/display/dc/dpp/dcn401/dcn401_dpp_cm.c | 5 ++--- 1 file ch

[PATCH 03/13] drm/amd/display: remove an extraneous call for checking dchub clock

2024-08-15 Thread Roman.Li
From: Aurabindo Pillai When removing the amdgpu module and reinserting it, a call trace is triggered: [ 334.230602] RIP: 0010:hubbub2_get_dchub_ref_freq+0xbb/0xe0 [amdgpu] [ 334.230807] Code: 25 28 00 00 00 75 3c 48 8d 65 f0 5b 41 5c 5d 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 45 31 d2

[PATCH 07/13] drm/amd/display: Support UHBR10 link rate on eDP

2024-08-15 Thread Roman.Li
From: Sung Joon Kim [why] Supporting UHBR10 link rate on eDP leverages the existing DP2.0 code but needs some small adjustments in code. [how] Acknowledge the given DPCD caps for UHBR10 link rate support and allow DP2.0 programming sequence and link training for eDP. Reviewed-by: Wenjing

[PATCH 01/13] Revert "drm/amd/display: Update to using new dccg callbacks"

2024-08-15 Thread Roman.Li
From: Hansen Dsouza [Why] Revert updated DCCG wrappers due to regression [How] This reverts commit 28b190df7a8f43b39e13886d744742a74a2c162d. Reviewed-by: Chris Park Signed-off-by: Hansen Dsouza Signed-off-by: Roman Li --- drivers/gpu/drm/amd/display/dc/dccg/dcn35/dcn35_dccg.c | 4 ++-- 1 fi

[PATCH 00/13] DC Patches August 15, 2024

2024-08-15 Thread Roman.Li
From: Roman Li Cc: Daniel Wheeler Aurabindo Pillai (1): drm/amd/display: remove an extraneous call for checking dchub clock Austin Zheng (1): drm/amd/display: DML2.1 Reintegration for Various Fixes Fangzhi Zuo (1): drm/amd/display: Fix a typo in revert commit Hansen Dsouza (1): Rever

Re: [PATCHv2 2/3] drm/amdkfd: Update queue unmap after VM fault with MES

2024-08-15 Thread Felix Kuehling
On 2024-08-15 17:08, Joshi, Mukul wrote: Hi Felix, -Original Message- From: Kuehling, Felix Sent: Thursday, August 15, 2024 2:25 PM To: Joshi, Mukul ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: Re: [PATCHv2

RE: [PATCHv2 2/3] drm/amdkfd: Update queue unmap after VM fault with MES

2024-08-15 Thread Joshi, Mukul
Hi Felix, > -Original Message- > From: Kuehling, Felix > Sent: Thursday, August 15, 2024 2:25 PM > To: Joshi, Mukul ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCHv2 2/3] drm/amdkfd: Update queue unm

RE: [PATCHv2 1/3] drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11

2024-08-15 Thread Joshi, Mukul
> -Original Message- > From: Kasiviswanathan, Harish > Sent: Thursday, August 15, 2024 4:32 PM > To: Joshi, Mukul ; amd-gfx@lists.freedesktop.org > Cc: Kuehling, Felix ; Deucher, Alexander > ; Joshi, Mukul > Subject: RE: [PATCHv2 1

RE: [PATCH] drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1

2024-08-15 Thread Dong, Ruijing
Acked-by: Ruijing Dong -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Wednesday, August 14, 2024 10:43 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH] drm/amdgpu/sdma5.2: limit wptr

RE: [PATCHv2 1/3] drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11

2024-08-15 Thread Kasiviswanathan, Harish
-Original Message- From: amd-gfx On Behalf Of Mukul Joshi Sent: Wednesday, August 14, 2024 7:28 PM To: amd-gfx@lists.freedesktop.org Cc: Kuehling, Felix ; Deucher, Alexander ; Joshi, Mukul Subject: [PATCHv2 1/3] drm/amdgpu: Implem

RE: [PATCHv2 2/3] drm/amdkfd: Update queue unmap after VM fault with MES

2024-08-15 Thread Kasiviswanathan, Harish
-Original Message- From: amd-gfx On Behalf Of Felix Kuehling Sent: Thursday, August 15, 2024 2:25 PM To: Joshi, Mukul ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: Re: [PATCHv2 2/3] drm/amdkfd: Update queue unmap

[PATCH v3] drm/amd/display: use drm_crtc_vblank_on_config()

2024-08-15 Thread Hamza Mahfooz
Hook up drm_crtc_vblank_on_config() in amdgpu_dm so that we can enable PSR and other static screen optimizations more quickly, while avoiding stuttering issues that are accompanied by the following dmesg error: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3

Re: [PATCHv2 2/3] drm/amdkfd: Update queue unmap after VM fault with MES

2024-08-15 Thread Felix Kuehling
On 2024-08-14 19:27, Mukul Joshi wrote: MEC FW expects MES to unmap all queues when a VM fault is observed on a queue and then resumed once the affected process is terminated. Use the MES Suspend and Resume APIs to achieve this. Signed-off-by: Mukul Joshi --- v1->v2: - Add MES FW version check.

Re: [PATCHv2 3/3] drm/amdkfd: Update BadOpcode Interrupt handling with MES

2024-08-15 Thread Alex Deucher
On Wed, Aug 14, 2024 at 7:28 PM Mukul Joshi wrote: > > Based on the recommendation of MEC FW, update BadOpcode interrupt > handling by unmapping all queues, removing the queue that got the > interrupt from scheduling and remapping rest of the queues back when > using MES scheduler. This is done to

Re: [PATCHv2 2/3] drm/amdkfd: Update queue unmap after VM fault with MES

2024-08-15 Thread Alex Deucher
On Wed, Aug 14, 2024 at 7:28 PM Mukul Joshi wrote: > > MEC FW expects MES to unmap all queues when a VM fault is observed > on a queue and then resumed once the affected process is terminated. > Use the MES Suspend and Resume APIs to achieve this. > > Signed-off-by: Mukul Joshi Acked-by: Alex De

Re: [PATCHv2 1/3] drm/amdgpu: Implement MES Suspend and Resume APIs for GFX11

2024-08-15 Thread Alex Deucher
On Wed, Aug 14, 2024 at 7:28 PM Mukul Joshi wrote: > > Add implementation for MES Suspend and Resume APIs to unmap/map > all queues for GFX11. Support for GFX12 will be added when the > corresponding firmware support is in place. > > Signed-off-by: Mukul Joshi Reviewed-by: Alex Deucher > --- >

[PATCH 1/2] drm/amdgpu/gfx11: return early in preempt_ib()

2024-08-15 Thread Alex Deucher
When MES is enabled KIQ is not available. Return an error when someone uses the debugfs preempt test interface in that case. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/dr

[PATCH 2/2] drm/amdgpu/gfx12: return early in preempt_ib()

2024-08-15 Thread Alex Deucher
When MES is enabled KIQ is not available. Return an error when someone uses the debugfs preempt test interface in that case. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c b/dr

Re: [PATCH 3/3] drm/amdgpu: Change the timing of doing coredump

2024-08-15 Thread Alex Deucher
On Thu, Aug 15, 2024 at 7:50 AM wrote: > > From: Trigger Huang > > Do the coredump immediately after a job timeout to get a closer > representation of GPU's error status. For other code paths that > need to do the coredump, keep the original logic unchanged, except: > 1, All the coredump operation

Re: [PATCH 1/3] drm/amdgpu: Add gpu_coredump parameter

2024-08-15 Thread Alex Deucher
On Thu, Aug 15, 2024 at 7:39 AM wrote: > > From: Trigger Huang > > Add new separate parameter to control GPU coredump procedure. This can > be used to decouple the coredump procedure from gpu recovery procedure > > Signed-off-by: Trigger Huang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1

Re: [PATCH 1/2] Documentation/gpu: Document the situation with unqualified drm-memory-

2024-08-15 Thread Tvrtko Ursulin
On 13/08/2024 19:47, Rob Clark wrote: On Tue, Aug 13, 2024 at 6:57 AM Tvrtko Ursulin wrote: From: Tvrtko Ursulin Currently it is not well defined what is drm-memory- compared to other categories. In practice the only driver which emits these keys is amdgpu and in them exposes the current

Re: [PATCH 9/9] drm/tilcdc: Use backlight power constants

2024-08-15 Thread Tomi Valkeinen
Hi, On 15/08/2024 10:59, Thomas Zimmermann wrote: Ping. This patch still needs an ack. On 31.07.24 at 14:17, Thomas Zimmermann wrote: Replace FB_BLANK_ constants with their counterparts from the backlight subsystem. The values are identical, so there's no change in functionality or semantics.

[PATCH 0/3] Improve the dev coredump

2024-08-15 Thread Trigger.Huang
From: Trigger Huang The current dev coredump implementation sometimes cannot fully satisfy customers' requirements because: 1, The enablement of dev coredump is under the control of gpu_recovery, so customers cannot do a dev coredump with gpu_recovery disabled. 2, When a job timeout happens, the dump G

[PATCH 3/3] drm/amdgpu: Change the timing of doing coredump

2024-08-15 Thread Trigger.Huang
From: Trigger Huang Do the coredump immediately after a job timeout to get a closer representation of GPU's error status. For other code paths that need to do the coredump, keep the original logic unchanged, except: 1, All the coredump operations will be under the control of parameter amdgpu_gpu_c

[PATCH 1/3] drm/amdgpu: Add gpu_coredump parameter

2024-08-15 Thread Trigger.Huang
From: Trigger Huang Add new separate parameter to control GPU coredump procedure. This can be used to decouple the coredump procedure from gpu recovery procedure Signed-off-by: Trigger Huang --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 8

[PATCH 2/3] drm/amdgpu: introduce new API for GPU core dump

2024-08-15 Thread Trigger.Huang
From: Trigger Huang Put ip dump and register to dev_coredumpm together Signed-off-by: Trigger Huang --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 73 ++ 2 files changed, 75 insertions(+) diff --git a/drivers/gpu/drm/amd

Re: AMD drm patch workflow is broken for stable trees

2024-08-15 Thread Greg KH
On Wed, Aug 14, 2024 at 05:30:08PM -0400, Alex Deucher wrote: > On Wed, Aug 14, 2024 at 4:55 PM Felix Kuehling wrote: > > > > On 2024-08-12 11:00, Greg KH wrote: > > > Hi all, > > > > > > As some of you have noticed, there's a TON of failure messages being > > > sent out for AMD gpu driver commits

Re: [PATCH v2] drm/radeon/evergreen_cs: fix int overflow errors in cs track offsets

2024-08-15 Thread Nikita Zhandarovich
Hi, On 8/6/24 10:19, Nikita Zhandarovich wrote: > Several cs track offsets (such as 'track->db_s_read_offset') > either are initialized with or plainly take big enough values that, > once shifted 8 bits left, may be hit with integer overflow if the > resulting values end up going over u32 limit. >
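
The overflow described in this patch can be illustrated with a minimal C sketch. This is not the radeon code: the helper names `shifted_narrow` and `shifted_wide` and the sample value are hypothetical and chosen only to show why a left shift by 8 on a 32-bit offset can lose its top bits unless the value is widened first.

```c
#include <stdint.h>

/* Hypothetical helper: the shift is performed in 32-bit arithmetic, so any
 * bits pushed past bit 31 are discarded before the result is widened. */
uint64_t shifted_narrow(uint32_t offset)
{
    uint32_t r = offset << 8;   /* wraps modulo 2^32 */
    return r;
}

/* Hypothetical helper: widen to 64 bits first, then shift, so the full
 * result is preserved. */
uint64_t shifted_wide(uint32_t offset)
{
    return (uint64_t)offset << 8;
}
```

With an offset of 0x01000000, the narrow version yields 0 while the wide version yields 0x100000000; for small offsets the two agree. Whether the actual patch uses a cast, a wider field type, or a range check is not visible in the truncated preview above.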

Re: AMD drm patch workflow is broken for stable trees

2024-08-15 Thread Greg KH
On Wed, Aug 14, 2024 at 04:39:29PM -0400, Felix Kuehling wrote: > On 2024-08-12 11:00, Greg KH wrote: > > Hi all, > > > > As some of you have noticed, there's a TON of failure messages being > > sent out for AMD gpu driver commits that are tagged for stable > > backports. In short, you all are do

[PATCH] drm/amd/display: avoid using null object of framebuffer

2024-08-15 Thread Ma Ke
Instead of using state->fb->obj[0] directly, get object from framebuffer by calling drm_gem_fb_get_obj() and return error code when object is null to avoid using null object of framebuffer. Cc: sta...@vger.kernel.org Fixes: 5d945cbcd4b1 ("drm/amd/display: Create a file dedicated to planes") Signed

RE: [PATCH] drm/amdkfd: keep create queue success if cwsr save area doesn't match

2024-08-15 Thread Zhang, Yifan
Got it. Will address this issue in ROCr. Best Regards, Yifan -Original Message- From: Kuehling, Felix Sent: Wednesday, August 14, 2024 10:47 PM To: Zhang, Yifan ; Christopher Snowhill Cc: Kasiviswanathan, Harish ; amd-gfx@lists.

Re: [PATCH 9/9] drm/tilcdc: Use backlight power constants

2024-08-15 Thread Thomas Zimmermann
On 15.08.24 at 10:07, Tomi Valkeinen wrote: Hi, On 15/08/2024 10:59, Thomas Zimmermann wrote: Ping. This patch still needs an ack. On 31.07.24 at 14:17, Thomas Zimmermann wrote: Replace FB_BLANK_ constants with their counterparts from the backlight subsystem. The values are identical, so

Re: [PATCH 9/9] drm/tilcdc: Use backlight power constants

2024-08-15 Thread Thomas Zimmermann
Ping. This patch still needs an ack. On 31.07.24 at 14:17, Thomas Zimmermann wrote: Replace FB_BLANK_ constants with their counterparts from the backlight subsystem. The values are identical, so there's no change in functionality or semantics. Signed-off-by: Thomas Zimmermann Cc: Jyri Sarha

[PATCH] drm/amdgpu: use cpu to query utcl2 poison status

2024-08-15 Thread Shikang Fan
Use CPU to read registers in interrupt situation. Signed-off-by: Shikang Fan --- drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 2 +- drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drive

RE: [PATCH] drm/amdgpu: Validate TA binary size

2024-08-15 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Candice Li Sent: Thursday, August 15, 2024 15:06 To: amd-gfx@lists.freedesktop.org Cc: Li, Candice Subject: [PATCH] drm/amdgpu: Validate TA

[PATCH] drm/amdgpu: Validate TA binary size

2024-08-15 Thread Candice Li
Add TA binary size validation to avoid OOB write. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp_ta.c index 0c856005df6b95..38face9