[PATCH] Revert "drm/display/dp_mst: Move all payload info into the atomic state"

2023-01-12 Thread Wayne Lin
This reverts commit 4d07b0bc403403438d9cf88450506240c5faf92f. [Why] Changes cause regression on amdgpu mst. E.g. In fill_dc_mst_payload_table_from_drm(), amdgpu expects to add/remove payload one by one and call fill_dc_mst_payload_table_from_drm() to update the HW maintained payload table. But pre

Re: [PATCH] drm/amdgpu: fix pipeline sync v2

2023-01-12 Thread Luben Tuikov
Acked-by: Luben Tuikov Regards, Luben On 2023-01-09 08:01, Christian König wrote: > This fixes a potential memory leak of dma_fence objects in the CS code > as well as glitches in firefox because of missing pipeline sync. > > v2: use the scheduler instead of the fence context for the test > >

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-12 Thread Christian König
On 11.01.23 at 21:50, Alex Deucher wrote: On Wed, Jan 11, 2023 at 3:48 PM Alex Deucher wrote: On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák wrote: Yes, it's meant to be like a spec sheet. We are not interested in the current bandwidth utilization. After chatting with Marek on IRC and thinking

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-12 Thread Christian König
On 11.01.23 at 21:48, Alex Deucher wrote: On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák wrote: Yes, it's meant to be like a spec sheet. We are not interested in the current bandwidth utilization. After chatting with Marek on IRC and thinking about this more, I think this patch is fine. It's no

[PATCH v2] drm: Optimize drm buddy top-down allocation method

2023-01-12 Thread Arunpravin Paneer Selvam
We are observing a performance drop in many use cases, including games, 3D benchmark applications, etc. To solve this problem, we strictly disallow allocations with the top-down flag enabled from stealing memory space from the CPU-visible region. The idea is, we are sorting each order list entries in as

Re: [PATCH] drm/amdgpu: grab extra fence reference for drm_sched_job_add_dependency

2023-01-12 Thread Christian König
On 10.01.23 at 19:21, Mikhail Gavrilov wrote: On Mon, Jan 9, 2023 at 6:40 PM Christian König wrote: That looks like an out of memory situation is not gracefully handled. In other words we have a missing NULL check in drm_sched_job_cleanup(). Going to take a look. Very strange because it

Re: [PATCH 5.10 1/1] drm/amdkfd: Check for null pointer after calling kmemdup

2023-01-12 Thread Greg KH
On Wed, Jan 04, 2023 at 07:56:33PM +0200, Dragos-Marian Panait wrote: > From: Jiasheng Jiang > > [ Upstream commit abfaf0eee97925905e742aa3b0b72e04a918fa9e ] > > As the possible failure of the allocation, kmemdup() may return NULL > pointer. > Therefore, it should be better to check the 'props2'
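The pattern the quoted commit message describes (check the result of a duplicating allocation before using it) can be sketched in userspace C. This is only an illustration under stated assumptions: `dup_mem()` is a hypothetical stand-in for the kernel's kmemdup(), and `struct props` and `clone_props()` are invented names, not the amdkfd code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for kmemdup(): copy len bytes of src
 * into a fresh allocation, or return NULL if the allocation fails. */
static void *dup_mem(const void *src, size_t len)
{
    void *p = malloc(len);

    if (p)
        memcpy(p, src, len);
    return p;
}

/* Hypothetical property block, for illustration only. */
struct props {
    int node_id;
    int simd_count;
};

/* Returns 0 on success, -1 if the copy could not be allocated
 * (kernel code would typically return -ENOMEM here). */
static int clone_props(const struct props *in, struct props **out)
{
    struct props *copy;

    assert(in != NULL && out != NULL);

    copy = dup_mem(in, sizeof(*in));
    if (!copy)
        return -1;  /* the check the patch adds: never use a NULL copy */

    *out = copy;
    return 0;
}
```

A caller would test the return value of `clone_props()` before touching `*out`, and free the copy when done.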

[PATCH] drm/amd/display: drop unnecessary NULL check in dce60_should_enable_fbc()

2023-01-12 Thread Alexey Kodanev
pipe_ctx pointer cannot be NULL when getting the address of an element of the pipe_ctx array. Detected using the static analysis tool - Svace. Signed-off-by: Alexey Kodanev --- drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers
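Why such a check is redundant can be shown with a small standalone C sketch. The types and names below (`MAX_PIPES`, `struct res_ctx`, `count_active()`) are hypothetical and only loosely modeled on the description above; they are not the dce60 code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PIPES 6  /* hypothetical array size */

/* Hypothetical, simplified types for illustration only. */
struct pipe_ctx {
    int stream_id;  /* 0 means "no stream attached" */
};

struct res_ctx {
    struct pipe_ctx pipe_ctx[MAX_PIPES];
};

/* Count the pipes that have a stream attached. */
static int count_active(const struct res_ctx *res)
{
    int n = 0;

    assert(res != NULL);  /* the container pointer may be worth checking */

    for (size_t i = 0; i < MAX_PIPES; i++) {
        /* Address of an element of an embedded array: this is the base
         * address plus an offset, so it cannot be NULL for a valid res.
         * An "if (!pipe) continue;" check here would be dead code. */
        const struct pipe_ctx *pipe = &res->pipe_ctx[i];

        if (pipe->stream_id != 0)
            n++;
    }
    return n;
}
```

The only pointer that could meaningfully be NULL is the container itself; the element addresses derived from it cannot be.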

[PATCH] drm/amdgpu: fix amdgpu_job_free_resources

2023-01-12 Thread Christian König
It can be that neither fence was initialized when we run out of UVD streams, for example. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/g

Re: [PATCH] drm/amdgpu: fix amdgpu_job_free_resources

2023-01-12 Thread Alex Deucher
On Thu, Jan 12, 2023 at 8:48 AM Christian König wrote: > > It can be that neither fence was initialized when we run out of UVD > streams, for example. > > Signed-off-by: Christian König Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2324 Reviewed-by: Alex Deucher > --- > drivers/gpu/d

[PATCH] drm/amd/display: Conversion to bool not necessary

2023-01-12 Thread Deepak R Varma
A logical evaluation already results in bool. There is no need for using a ternary operator based evaluation and bool conversion of the outcome. Issue identified using boolconv.cocci Coccinelle semantic patch. This was also reported by the Kernel Test Robot. Hence Fixes: 473683a03495 ("drm/amd/dis
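The point of the cleanup can be illustrated with a short standalone C sketch. The function names are hypothetical and the comparison is an arbitrary example, not the expression changed by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Redundant form: the comparison already yields 0 or 1, so the
 * ternary adds nothing. */
static bool exceeds_ternary(int value, int limit)
{
    return (value > limit) ? true : false;
}

/* Equivalent direct form, as boolconv.cocci would suggest. */
static bool exceeds_direct(int value, int limit)
{
    return value > limit;
}

/* Sanity check that both forms agree for a few inputs. */
static void check_equivalence(void)
{
    for (int v = -2; v <= 2; v++)
        assert(exceeds_ternary(v, 0) == exceeds_direct(v, 0));
}
```

Both functions return the same result for every input, so the shorter form can replace the ternary without a behavior change.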

Re: [PATCH] drm/amdgpu: fix amdgpu_job_free_resources

2023-01-12 Thread Thong Thai
The patch solves the UVD issue. By the way, I had to change one of the "->"'s to a "." to compile: drivers/gpu/drm/amd/amdgpu/amdgpu_job.c: In function ‘amdgpu_job_free_resources’: drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:159:61: error: invalid type argument of ‘->’ (have ‘struct dma_fence’)

Re: [PATCH 5.10 1/1] drm/amdkfd: Check for null pointer after calling kmemdup

2023-01-12 Thread Daniel Vetter
On Thu, 12 Jan 2023 at 13:47, Greg KH wrote: > On Wed, Jan 04, 2023 at 07:56:33PM +0200, Dragos-Marian Panait wrote: > > From: Jiasheng Jiang > > > > [ Upstream commit abfaf0eee97925905e742aa3b0b72e04a918fa9e ] > > > > As the possible failure of the allocation, kmemdup() may return NULL > > point

Re: [PATCH] drm/amdgpu: fix amdgpu_job_free_resources

2023-01-12 Thread Christian König
Sorry about that, I only hacked it together quickly without testing. Good to know that it solves the issue. Thanks, Christian. On 12.01.23 at 15:37, Thong Thai wrote: The patch solves the UVD issue. By the way, I had to change one of the "->"'s to a "." to compile: drivers/gpu/drm/amd/am

Re: [PATCH] drm: amd: display: Fix memory leakage

2023-01-12 Thread Rodrigo Siqueira Jordao
On 11/29/22 21:50, Konstantin Meskhidze wrote: This commit fixes memory leakage in dc_construct_ctx() function. Signed-off-by: Konstantin Meskhidze --- drivers/gpu/drm/amd/display/dc/core/dc.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/

Re: [PATCH] drm/amd/display: Fix set scaling doesn's work

2023-01-12 Thread Rodrigo Siqueira Jordao
On 1/11/23 10:19, Harry Wentland wrote: On 1/10/23 10:58, Rodrigo Siqueira Jordao wrote: On 11/22/22 06:20, hongao wrote: [Why] Setting scaling does not correctly update CRTC state. As a result dc stream state's src (composition area) && dest (addressable area) was not calculated as expect

Re: [PATCH] drm/amd: Avoid ASSERT for some message failures

2023-01-12 Thread Harry Wentland
On 1/11/23 16:52, Mario Limonciello wrote: > On DCN314 when resuming from s0i3 an ASSERT is shown indicating that > `VBIOSSMC_MSG_SetHardMinDcfclkByFreq` returned `VBIOSSMC_Result_Failed`. > > This isn't a driver bug; it's a BIOS/configuration bug. To make this > easier to triage, add an explicit

Re: [PATCH AUTOSEL 6.1 5/7] drm/amdgpu: Fix size validation for non-exclusive domains (v4)

2023-01-12 Thread Luben Tuikov
Hi Sasha, The patch in the link is a Fixes patch of the quoted patch, and should also go in: https://lore.kernel.org/all/20230104221935.113400-1-luben.tui...@amd.com/ Regards, Luben On 2022-12-31 15:04, Sasha Levin wrote: > From: Luben Tuikov > > [ Upstream commit 7554886daa31eacc8e7fac9e15b

[PATCH 2/2] drm/amdgpu/pm: update hwmon power documentation

2023-01-12 Thread Alex Deucher
Power reporting is socket power. On APUs this includes the CPU. Update the documentation to clarify this. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm

[PATCH 1/2] drm/amdgpu/smu12: fix power reporting on CZN/BCL

2023-01-12 Thread Alex Deucher
The metrics interface exposes the socket power in W, but apparently RN systems exposed the power as mW. See commit 137aac26a2ed ("drm/amdgpu/smu12: fix power reporting on renoir"). So leave RN as mW and use W for CZN/BCL. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2321 Fixes: 137aac26a2

Re: [PATCH 6.0 108/148] drm/amdgpu: Fix size validation for non-exclusive domains (v4)

2023-01-12 Thread Luben Tuikov
Hi Greg, The patch in the link is a Fixes patch of the quoted patch, and should also go in: https://lore.kernel.org/all/20230104221935.113400-1-luben.tui...@amd.com/ Regards, Luben On 2023-01-10 13:03, Greg Kroah-Hartman wrote: > From: Luben Tuikov > > [ Upstream commit 7554886daa31eacc8e7fa

Re: [PATCH 1/2] drm/amdgpu: return the PCIe gen and lanes from the INFO

2023-01-12 Thread Alex Deucher
On Thu, Jan 12, 2023 at 6:50 AM Christian König wrote: > > Am 11.01.23 um 21:48 schrieb Alex Deucher: > > On Wed, Jan 4, 2023 at 3:17 PM Marek Olšák wrote: > >> Yes, it's meant to be like a spec sheet. We are not interested in the > >> current bandwidth utilization. > > After chatting with Marek

Re: [PATCH 5.10 1/1] drm/amdkfd: Check for null pointer after calling kmemdup

2023-01-12 Thread Greg KH
On Thu, Jan 12, 2023 at 04:26:45PM +0100, Daniel Vetter wrote: > On Thu, 12 Jan 2023 at 13:47, Greg KH wrote: > > On Wed, Jan 04, 2023 at 07:56:33PM +0200, Dragos-Marian Panait wrote: > > > From: Jiasheng Jiang > > > > > > [ Upstream commit abfaf0eee97925905e742aa3b0b72e04a918fa9e ] > > > > > > A

Re: [PATCH 6.0 108/148] drm/amdgpu: Fix size validation for non-exclusive domains (v4)

2023-01-12 Thread Greg Kroah-Hartman
On Thu, Jan 12, 2023 at 11:25:08AM -0500, Luben Tuikov wrote: > Hi Greg, > > The patch in the link is a Fixes patch of the quoted patch, and should also > go in: > > https://lore.kernel.org/all/20230104221935.113400-1-luben.tui...@amd.com/ Is that in Linus's tree already? if so, what is the gi

Re: [PATCH 1/2] drm/amdgpu/smu12: fix power reporting on CZN/BCL

2023-01-12 Thread Alex Deucher
On Thu, Jan 12, 2023 at 11:25 AM Alex Deucher wrote: > > The metrics interface exposes the socket power in W, but > apparently RN systems exposed the power as mW. See > commit 137aac26a2ed ("drm/amdgpu/smu12: fix power reporting on renoir"). > So leave RN as mW and use W for CZN/BCL. Just saw th

Re: [PATCH 6.0 108/148] drm/amdgpu: Fix size validation for non-exclusive domains (v4)

2023-01-12 Thread Luben Tuikov
On 2023-01-12 11:49, Greg Kroah-Hartman wrote: > On Thu, Jan 12, 2023 at 11:25:08AM -0500, Luben Tuikov wrote: >> Hi Greg, >> >> The patch in the link is a Fixes patch of the quoted patch, and should also >> go in: >> >> https://lore.kernel.org/all/20230104221935.113400-1-luben.tui...@amd.com/ >

[PATCH] drm/amdgpu: print bo inode number instead of ptr

2023-01-12 Thread Pierre-Eric Pelloux-Prayer
This makes it possible to correlate the information printed by /sys/kernel/debug/dri/n/amdgpu_gem_info with that found in /proc/.../fdinfo and /sys/kernel/debug/dma_buf/bufinfo. Signed-off-by: Pierre-Eric Pelloux-Prayer --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++-- 1 file changed, 2 insertions(+), 2

Re: [PATCH 6.0 108/148] drm/amdgpu: Fix size validation for non-exclusive domains (v4)

2023-01-12 Thread Greg Kroah-Hartman
On Thu, Jan 12, 2023 at 11:59:06AM -0500, Luben Tuikov wrote: > On 2023-01-12 11:49, Greg Kroah-Hartman wrote: > > On Thu, Jan 12, 2023 at 11:25:08AM -0500, Luben Tuikov wrote: > >> Hi Greg, > >> > >> The patch in the link is a Fixes patch of the quoted patch, and should > >> also go in: > >> > >>

Re: [PATCH] Revert "drm/display/dp_mst: Move all payload info into the atomic state"

2023-01-12 Thread Limonciello, Mario
On 1/12/2023 02:50, Wayne Lin wrote: This reverts commit 4d07b0bc403403438d9cf88450506240c5faf92f. [Why] Changes cause regression on amdgpu mst. E.g. In fill_dc_mst_payload_table_from_drm(), amdgpu expects to add/remove payload one by one and call fill_dc_mst_payload_table_from_drm() to update t

Re: [PATCH] drm/amd/display: Remove useless else if

2023-01-12 Thread Alex Deucher
Applied. Thanks! On Wed, Jan 11, 2023 at 10:21 PM Jiapeng Chong wrote: > > The assignment of the else and if branches is the same, so the if else > here is redundant, so we remove it. > > ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:1951:2-4: WARNING: > possible condition with no effect

Re: [PATCH] drm/amd/display: Conversion to bool not necessary

2023-01-12 Thread Alex Deucher
Applied. Thanks! Alex On Thu, Jan 12, 2023 at 8:51 AM Deepak R Varma wrote: > > A logical evaluation already results in bool. There is no need for using > a ternary operator based evaluation and bool conversion of the outcome. > Issue identified using boolconv.cocci Coccinelle semantic patch. >

Re: [PATCH] drm/amdgpu: Skip specific mmhub and sdma registers accessing under sriov

2023-01-12 Thread Alex Deucher
On Wed, Jan 11, 2023 at 2:53 AM Yifan Zha wrote: > > [Why] > SDMA0_CNTL and MMHUB system aperture related registers are blocked by L1 > Policy. > Therefore, they cannot be accessed by VF and are logged as violations. > > [How] > For MMHUB registers, they will be programmed by PF. So VF will skip to >

[PATCH v2 0/3] drm: Generic fbdev and vga-switcheroo

2023-01-12 Thread Thomas Zimmermann
(was: drm/fb-helper: Set framebuffer for vga-switcheroo clients) This patch has now turned into a little series. The first two patches are bug fixes for the existing code. The third patch cleans up the drivers. Patch 1 fixes i915 to do the correct thing if the device has not been initialized yet.

[PATCH v2 3/3] drm: Call vga_switcheroo_process_delayed_switch() in drm_lastclose

2023-01-12 Thread Thomas Zimmermann
Several lastclose helpers call vga_switcheroo_process_delayed_switch(). It's better to call the helper from drm_lastclose() after the kernel client's screen has been restored. This way, all drivers can benefit without having to implement their own lastclose helper. For drivers without vga-switchero

[PATCH v2 1/3] drm/i915: Allow switching away via vga-switcheroo if uninitialized

2023-01-12 Thread Thomas Zimmermann
Always allow switching away via vga-switcheroo if the display is uninitialized. Instead prevent switching to i915 if the device has not been initialized. This issue was introduced by commit 5df7bd130818 ("drm/i915: skip display initialization when there is no display") protected, which protects cod

[PATCH v2 2/3] drm/fb-helper: Set framebuffer for vga-switcheroo clients

2023-01-12 Thread Thomas Zimmermann
Set the framebuffer info for drivers that support VGA switcheroo. Only affects the amdgpu and nouveau drivers, which use VGA switcheroo and generic fbdev emulation. For other drivers, this does nothing. This fixes a potential regression in the console code. Both, amdgpu and nouveau, invoked vga_sw

Re: [PATCH] Revert "drm/display/dp_mst: Move all payload info into the atomic state"

2023-01-12 Thread Lyude Paul
Acked-by: Lyude Paul On Thu, 2023-01-12 at 16:50 +0800, Wayne Lin wrote: > This reverts commit 4d07b0bc403403438d9cf88450506240c5faf92f. > > [Why] > Changes cause regression on amdgpu mst. > E.g. > In fill_dc_mst_payload_table_from_drm(), amdgpu expects to add/remove payload > one by one and cal

[PATCH] drm/amdkfd: Support process XNACK mode dynamic change

2023-01-12 Thread Philip Yang
The queue qpd update is done at the first queue creation of the process. If the device supports per-process XNACK mode, update the qpd setup of sh_mem_config based on the process XNACK mode, so that a process can destroy all queues, change XNACK mode, and then create queues. Add helper macro KFD_SUPPORT_XNA

Coverity: gfx_v9_0_init_cp_compute_microcode(): Control flow issues

2023-01-12 Thread coverity-bot
Hello! This is an experimental semi-automated report about issues detected by Coverity from a scan of next-20230111 as part of the linux-next scan project: https://scan.coverity.com/projects/linux-next-weekly-scan You're getting this email because you were associated with the identified lines of

Coverity: dm_dmub_sw_init(): Incorrect expression

2023-01-12 Thread coverity-bot
Hello! This is an experimental semi-automated report about issues detected by Coverity from a scan of next-20230111 as part of the linux-next scan project: https://scan.coverity.com/projects/linux-next-weekly-scan You're getting this email because you were associated with the identified lines of

[PATCH] drm/amd: fix some dead code in `gfx_v9_0_init_cp_compute_microcode`

2023-01-12 Thread Mario Limonciello
Some dead code was introduced as part of utilizing the `amdgpu_ucode_*` helpers. Adjust the control flow to make sure that firmware is released in the appropriate error flows. Reported-by: coverity-bot Addresses-Coverity-ID: 1530548 ("Control flow issues") Fixes: ec787deb2ddf ("drm/amd: Use `amdg

RE: Coverity: dm_dmub_sw_init(): Incorrect expression

2023-01-12 Thread Limonciello, Mario
[AMD Official Use Only - General] This particular one was fixed already in https://patchwork.freedesktop.org/patch/518050/ which got applied today. > -Original Message- > From: coverity-bot > Sent: Thursday, January 12, 2023 16:25 > To: Limonciello, Mario > Cc: linux-ker...@vger.kernel

Re: [PATCH 1/6] drm/amdgpu: Generalize KFD dmabuf import

2023-01-12 Thread Chen, Xiaogang
On 1/11/2023 7:31 PM, Felix Kuehling wrote: Use proper amdgpu_gem_prime_import function to handle all kinds of imports. Remember the dmabuf reference to enable proper multi-GPU attachment to multiple VMs without erroneously re-exporting the underlying BO multiple times. Signed-off-by: Felix Ku

Re: Coverity: dm_dmub_sw_init(): Incorrect expression

2023-01-12 Thread Kees Cook
On Thu, Jan 12, 2023 at 10:39:20PM +, Limonciello, Mario wrote: > This particular one was fixed already in > https://patchwork.freedesktop.org/patch/518050/ which got applied today. Ah-ha; thanks! -- Kees Cook

Re: [PATCH] drm/amd: fix some dead code in `gfx_v9_0_init_cp_compute_microcode`

2023-01-12 Thread Kees Cook
On Thu, Jan 12, 2023 at 04:37:01PM -0600, Mario Limonciello wrote: > Some dead code was introduced as part of utilizing the `amdgpu_ucode_*` > helpers. Adjust the control flow to make sure that firmware is released > in the appropriate error flows. > > Reported-by: coverity-bot > Addresses-Coveri

Re: [PATCH] drm/amd: fix some dead code in `gfx_v9_0_init_cp_compute_microcode`

2023-01-12 Thread Alex Deucher
On Thu, Jan 12, 2023 at 5:37 PM Mario Limonciello wrote: > > Some dead code was introduced as part of utilizing the `amdgpu_ucode_*` > helpers. Adjust the control flow to make sure that firmware is released > in the appropriate error flows. > > Reported-by: coverity-bot > Addresses-Coverity-ID: 1

[PATCH] drm/amdgpu: Renoir/Cezanne GPU power reporting issue

2023-01-12 Thread Zhang, Jesse(Jie)
[AMD Official Use Only - General] drm/amdgpu: Correct the power calculation for Renoir/Cezanne. The SMU firmware reports the power value in units of watts. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2321 Fixes: 137aac26a2ed ("drm/amdgpu/smu12: fix power report

RE: [PATCH] drm/amdgpu: Renoir/Cezanne GPU power reporting issue

2023-01-12 Thread Liu, Aaron
Reviewed-by: Aaron Liu aaron@amd.com From: amd-gfx On Behalf Of Zhang, Jesse(Jie) Sent: Friday, January 13, 2023 10:07 AM To: Deucher, Alexander Cc: amd-gfx@lists.freedesktop.org Subject: [PATCH] drm/amdgpu: Renoir/Cezanne GPU power reporting issue [AMD Official

[PATCH] drm/amd/pm: Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10

2023-01-12 Thread Candice Li
Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10. Signed-off-by: Candice Li Reviewed-by: Evan Quan --- .../drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 42 +-- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 6 +++ drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|

RE: [PATCH] drm/amd/pm: Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10

2023-01-12 Thread Zhang, Hawking
[AMD Official Use Only - General] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Candice Li Sent: Friday, January 13, 2023 10:40 To: amd-gfx@lists.freedesktop.org Cc: Li, Candice Subject: [PATCH] drm/amd/pm: Support RAS fatal error mode1 reset

[PATCH 1/2] drm/ttm: Check ttm_debugfs_root before creating files under it

2023-01-12 Thread Ma Jun
Check the ttm_debugfs_root before creating files under it. If the ttm_debugfs_root is NULL, all the files created for ttm/ will be placed in /sys/kernel/debug/ instead of /sys/kernel/debug/ttm/ Signed-off-by: Ma Jun --- drivers/gpu/drm/ttm/ttm_device.c | 3 ++- drivers/gpu/drm/ttm/ttm_pool.c

[PATCH 2/2] drm/ttm: Use debugfs_remove_recursive to remove ttm directory

2023-01-12 Thread Ma Jun
Use debugfs_remove_recursive to remove the /sys/kernel/debug/ttm directory for better compatibility, because debugfs_remove fails on older kernels. Signed-off-by: Ma Jun --- drivers/gpu/drm/ttm/ttm_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ttm/tt

Re: Wrong revert commit in stable channel

2023-01-12 Thread Yury Zhuravlev
Yes, this is right in 6.2-rc3 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/amd/pm/powerplay/hwmgr?h=v6.2-rc3&id=f936f535fa70f35ce3369b1418ebae0e657cda6a But somebody reverted it again for the stable stream: https://git.kernel.org/pub/scm/linux/kerne

[PATCH 1/5] drm/amdgpu: Add gfx ras function on gfx v11_0_3

2023-01-12 Thread YiPeng Chai
Add gfx ras function on gfx v11_0_3. V2: 1. Add separate source files for gfx v11_0_3. 2. Create a common function to initialize gfx ras block. V3: 1. Rename amdgpu_gfx_ras_block_init to amdgpu_gfx_ras_sw_init. 2. Adjust the calling position of amdgpu_gfx_ras_sw_init. 3. Remove gfx_v11_0_3_r

[PATCH 2/5] amd/amdgpu: Add RLC_RLCS_FED_STATUS_* to gc v11_0_3 ip headers

2023-01-12 Thread YiPeng Chai
V2: Add RLC_RLCS_FED_STATUS_0 and RLC_RLCS_FED_STATUS_1 register offset and shift masks. Signed-off-by: YiPeng Chai Reviewed-by: Hawking Zhang Reviewed-by: Tao Zhou Reviewed-by: Alex Deucher --- .../include/asic_reg/gc/gc_11_0_3_offset.h| 8 +++ .../include/asic_reg/gc/gc_11_0_3_sh

[PATCH 3/5] drm/amdgpu: Add gfx ras poison consumption irq handling on gfx v11_0_3

2023-01-12 Thread YiPeng Chai
Add gfx ras poison consumption irq handling on gfx v11_0_3. V2: Move ras poison consumption irq handling code of gfx v11_0_3 to gfx_v11_0_3.c. V5: Create dedicated irq handler for RLC_GC_FED_INTERRUPT. V6: Remove invalid function call. Signed-off-by: YiPeng Chai Reviewed-by: Hawking

[PATCH 4/5] drm/amdgpu: Add gfx cp ecc error irq handling on gfx v11_0_3

2023-01-12 Thread YiPeng Chai
V2: Optimize gfx_v11_0_set_cp_ecc_error_state function. V3: Define macro constant for me pipe instance address interval. V5: Register and handle gfx cp ecc error irq on gfx v11_0_3. V6: Remove invalid intermediate function call. Signed-off-by: YiPeng Chai Reviewed-by: Hawking Zhang Re

[PATCH 5/5] drm/amdgpu: Perform gpu reset after gfx finishes processing ras poison consumption on gfx_v11_0_3

2023-01-12 Thread YiPeng Chai
Perform gpu reset after gfx finishes processing ras poison consumption on gfx_v11_0_3. V2: Move gfx poison consumption handler from hw_ops to ip function level. V3: Adjust the calling position of amdgpu_gfx_poison_consumation_handler. V4: Since gfx v11_0_3 does not have .hw_ops instance, t

[PATCH 1/2] drm/amdgpu: Remove unnecessary ras block support check

2023-01-12 Thread YiPeng Chai
[Why]: For special asic with mem ecc enabled but sram ecc not enabled, some ras blocks can register their ras configuration to ras list, but these ras blocks are not enabled on .ras_enabled, so it can not get ras block object using amdgpu_ras_get_ras_block. [How]: Remove ras block support ch

[PATCH 2/2] drm/amdgpu: Adjust ras support check condition for special asic

2023-01-12 Thread YiPeng Chai
[Why]: Amdgpu ras uses amdgpu_ras_is_supported to check whether the ras block supports the ras function. amdgpu_ras_is_supported uses .ras_enabled to determine whether the ras function of the block is enabled. But for special asic with mem ecc enabled but sram ecc not enabled, som