Re: [PATCH 1/2] drm/amdgpu: Remove duplicate fdinfo fields

2023-10-27 Thread Christian König
On 27.10.23 at 00:05, Umio Yasuno wrote: From: Rob Clark Some of the fields that are handled by drm_show_fdinfo() crept back in when rebasing the patch. Remove them again. Fixes: 376c25f8ca47 ("drm/amdgpu: Switch to fdinfo helper") Signed-off-by: Rob Clark Reviewed-by: Co-developed-by: Umi

Re: [PATCH] MAINTAINERS: Update the GPU Scheduler email

2023-10-27 Thread Christian König
On 26.10.23 at 21:32, Alex Deucher wrote: On Thu, Oct 26, 2023 at 1:45 PM Luben Tuikov wrote: Update the GPU Scheduler maintainer email. Cc: Alex Deucher Cc: Christian König Cc: Daniel Vetter Cc: Dave Airlie Cc: AMD Graphics Cc: Direct Rendering Infrastructure - Development Signed-off-

RE: [PATCH] drm/amdgpu: use mode-2 reset for RAS poison consumption

2023-10-27 Thread Yang, Stanley
[AMD Official Use Only - General] Reviewed-by: Stanley.Yang Regards, Stanley > -----Original Message----- > From: amd-gfx On Behalf Of Tao > Zhou > Sent: Friday, October 27, 2023 12:04 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhou1, Tao > Subject: [PATCH] drm/amdgpu: use mode-2 reset for R

Re: [PATCH] drm/amdgpu/gfx10,11: use memcpy_to/fromio for MQDs

2023-10-27 Thread Christian König
On 26.10.23 at 20:56, Alex Deucher wrote: Since they were moved to VRAM, we need to use the IO variants of memcpy. Fixes: 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM") Signed-off-by: Alex Deucher Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 12 ++--

[PATCH] drm/amdgpu set doorbell range when gpu recovery in sriov environment

2023-10-27 Thread Lin . Cao
GFX doorbell range should be set after FLR, otherwise the GFX doorbell range will overlap with MEC. Signed-off-by: Lin.Cao --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_

Re: [PATCH 2/5] drm/amdgpu: Add xcc instance parameter to *REG32_SOC15_IP_NO_KIQ (v2)

2023-10-27 Thread Lazar, Lijo
On 10/26/2023 2:22 AM, Victor Lu wrote: The WREG32/RREG32_SOC15_IP_NO_KIQ call is using XCC0's RLCG interface when programming other XCCs. Add xcc instance parameter to them. v2: rebase Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 drivers

Re: [PATCH 3/5] drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v3)

2023-10-27 Thread Lazar, Lijo
On 10/26/2023 2:22 AM, Victor Lu wrote: amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0. Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter. v3: use W/RREG32_XCC to handle non-kiq case v2: define amdg

[PATCH] drm/amdgpu: fix check order ras->in_recovery is earlier than ras feature

2023-10-27 Thread Bob Zhou
ras->in_recovery is checked before the RAS feature check, which causes the NULL pointer dereference below. Reorder the checks to fix it. BUG: kernel NULL pointer dereference, address: 00e8 RIP: 0010:amdgpu_ras_reset_error_count+0xf6/0x190 [amdgpu] Call Trace: ? show_regs+0x72/0x90 ? __
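The ordering problem described in this patch can be illustrated with a small userspace sketch. The types and the function below (demo_ras, demo_device, demo_reset_error_count) are hypothetical stand-ins, not the real amdgpu structures or the actual body of amdgpu_ras_reset_error_count; the sketch only assumes that the RAS context pointer may be NULL when the feature is unsupported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device RAS context. */
struct demo_ras {
    bool in_recovery;
};

/* Hypothetical stand-in for the device; ras is NULL when RAS is unsupported. */
struct demo_device {
    bool ras_supported;
    struct demo_ras *ras;
};

/* Returns 0 if the reset may proceed, -1 if it must be skipped. */
int demo_reset_error_count(const struct demo_device *dev)
{
    assert(dev != NULL);

    /* Check feature support first: without it, dev->ras may be NULL. */
    if (!dev->ras_supported || dev->ras == NULL)
        return -1;

    /* Only after the feature check is it safe to dereference dev->ras. */
    if (dev->ras->in_recovery)
        return -1;

    return 0;
}
```

With the two checks in the opposite order, a device without RAS support would dereference a NULL pointer at the in_recovery test, which matches the kind of failure reported in the trace.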

[PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check

2023-10-27 Thread Candice Li
Drop checking deferred error which can be handled by poison consumption. Signed-off-by: Candice Li --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c b/drivers/gpu/drm/amd/amdgpu/umc_v12_0.c ind

RE: [PATCH] drm/amdgpu: Drop deferred error in uncorrectable error check

2023-10-27 Thread Zhang, Hawking
[AMD Official Use Only - General] Reviewed-by: Hawking Zhang Regards, Hawking -----Original Message----- From: amd-gfx On Behalf Of Candice Li Sent: Friday, October 27, 2023 18:30 To: amd-gfx@lists.freedesktop.org Cc: Li, Candice Subject: [PATCH] drm/amdgpu: Drop deferred error in uncorrectabl

[PATCH 1/2] drm/amdgpu: set XGMI IP version manually for v6_4

2023-10-27 Thread Tao Zhou
The version can't be queried from discovery table. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 0b711bac2092..d

[PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4

2023-10-27 Thread Tao Zhou
Reset/query RAS error status and count. v2: use XGMI IP version instead of WAFL version. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 46 ++-- 1 file changed, 43 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b

Re: [PATCH] drm/amdgpu: add unmap latency when gfx11 set kiq resources

2023-10-27 Thread Deucher, Alexander
[Public] Acked-by: Alex Deucher From: Tong Liu01 Sent: Thursday, October 26, 2023 11:41 PM To: amd-gfx@lists.freedesktop.org Cc: Evan Quan ; Chen, Horace ; Tuikov, Luben ; Koenig, Christian ; Deucher, Alexander ; Xiao, Jack ; Zhang, Hawking ; Liu, Monk ; Xu,

RE: [PATCH 2/2] drm/amdgpu: add RAS reset/query operations for XGMI v6_4

2023-10-27 Thread Zhang, Hawking
[AMD Official Use Only - General] Series is Reviewed-by: Hawking Zhang Regards, Hawking -----Original Message----- From: amd-gfx On Behalf Of Tao Zhou Sent: Friday, October 27, 2023 19:33 To: amd-gfx@lists.freedesktop.org Cc: Zhou1, Tao Subject: [PATCH 2/2] drm/amdgpu: add RAS reset/query ope

Re: [PATCH 2/2] drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo

2023-10-27 Thread Alex Deucher
Applied the series. Thanks! Alex On Thu, Oct 26, 2023 at 6:43 PM Umio Yasuno wrote: > > Remove unused variables from amdgpu_show_fdinfo > > Signed-off-by: Umio Yasuno > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 6 -- > 1 file changed, 6 deletions(-) > > diff --git a/drivers/gpu/d

[PATCH 2/3] drm/amdgpu: don't use pci_is_thunderbolt_attached()

2023-10-27 Thread Alex Deucher
It's only valid on Intel systems with the Intel VSEC. Use dev_is_removable() instead. This should do the right thing regardless of the platform. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 dr

[PATCH 1/3] drm/amdgpu: don't use ATRM for external devices

2023-10-27 Thread Alex Deucher
The ATRM ACPI method is for fetching the dGPU vbios rom image on laptops and all-in-one systems. It should not be used for external add in cards. If the dGPU is thunderbolt connected, don't try ATRM. v2: pci_is_thunderbolt_attached only works for Intel. Use pdev->external_facing instead. v3

[PATCH 3/3] drm/amdgpu: add a retry for IP discovery init

2023-10-27 Thread Alex Deucher
AMD dGPUs have integrated FW that runs as soon as the device gets power and initializes the board (determines the amount of memory, provides configuration details to the driver, etc.). For direct PCIe attached cards this happens as soon as power is applied and normally completes well before the OS

Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Lakha, Bhawanpreet
[AMD Official Use Only - General] Thanks, Reviewed-by: Bhawanpreet Lakha From: Yuran Pereira Sent: October 26, 2023 5:25 PM To: airl...@gmail.com Cc: Yuran Pereira ; Wentland, Harry ; Li, Sun peng (Leo) ; Siqueira, Rodrigo ; Deucher, Alexander ; Koenig, Chr

Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz
On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_confi

Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz
Also, please write the tagline in present tense. On 10/27/23 11:53, Hamza Mahfooz wrote: On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holdin

Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Lakha, Bhawanpreet
[AMD Official Use Only - General] There was a consensus to use memset instead of {0}. I remember making changes related to that previously. Bhawan From: Mahfooz, Hamza Sent: October 27, 2023 11:53 AM To: Yuran Pereira ; airl...@gmail.com Cc: Li, Sun peng (Le

Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz
On 10/27/23 11:55, Lakha, Bhawanpreet wrote: [AMD Official Use Only - General] There was a consensus to use memset instead of {0}. I remember making changes related to that previously. Hm, seems like it's used rather consistently in the DM and in DC though. Bhawan --
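The two initialization styles discussed in this thread (memset versus a {0} initializer) can be compared in a minimal userspace sketch. The struct demo_replay_config and its fields are illustrative assumptions, not the real pr_config type from the display code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical config struct; field names are illustrative only. */
struct demo_replay_config {
    bool replay_supported;
    unsigned int replay_enable_option;
};

/* Style 1: clear every byte of the struct with memset. */
void demo_config_init_memset(struct demo_replay_config *cfg)
{
    assert(cfg != NULL);
    memset(cfg, 0, sizeof(*cfg));
}

/* Style 2: assign a zero-initialized compound literal ({0} style). */
void demo_config_init_zero(struct demo_replay_config *cfg)
{
    assert(cfg != NULL);
    *cfg = (struct demo_replay_config){0};
}

/* OR-ing a flag into the option field is only meaningful after init. */
void demo_config_enable(struct demo_replay_config *cfg, unsigned int flag)
{
    assert(cfg != NULL);
    cfg->replay_enable_option |= flag;
}
```

Either style gives replay_enable_option a defined starting value before flags are OR-ed in. One difference worth noting: memset also clears padding bytes, while the C standard does not guarantee that for a {0} initializer.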

[PATCH 0/7] drm/msm/gem: drm_exec conversion

2023-10-27 Thread Rob Clark
From: Rob Clark Simplify the exec path (removing a legacy optimization) and convert to drm_exec. One drm_exec patch to allow passing in the expected # of GEM objects to avoid re-allocation. I'd be a bit happier if I could avoid the extra objects table allocation in drm_exec in the first place,

[PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-27 Thread Rob Clark
From: Rob Clark In cases where the # is known ahead of time, it is silly to do the table resize dance. Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 ++-- drivers/gpu
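The idea of passing the expected object count up front, so the table never has to grow, can be sketched in plain C. Everything below (demo_obj_table and its functions) is a hypothetical illustration of a pre-sized table and does not reproduce the drm_exec API or its internals.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable table of object pointers. */
struct demo_obj_table {
    void **objs;
    size_t count;
    size_t capacity;
    unsigned int resizes;   /* how many times the table had to grow */
};

/* initial == 0 falls back to a small default capacity. */
int demo_obj_table_init(struct demo_obj_table *t, size_t initial)
{
    assert(t != NULL);
    t->count = 0;
    t->resizes = 0;
    t->capacity = initial ? initial : 4;
    t->objs = malloc(t->capacity * sizeof(*t->objs));
    return t->objs ? 0 : -1;
}

int demo_obj_table_add(struct demo_obj_table *t, void *obj)
{
    assert(t != NULL);
    if (t->count == t->capacity) {
        /* Table is full: double it (the "resize dance"). */
        size_t new_cap = t->capacity * 2;
        void **tmp = realloc(t->objs, new_cap * sizeof(*tmp));
        if (!tmp)
            return -1;
        t->objs = tmp;
        t->capacity = new_cap;
        t->resizes++;
    }
    t->objs[t->count++] = obj;
    return 0;
}

void demo_obj_table_fini(struct demo_obj_table *t)
{
    assert(t != NULL);
    free(t->objs);
    t->objs = NULL;
    t->count = 0;
    t->capacity = 0;
}
```

When the caller knows it will add 16 objects and passes 16 to the init function, no reallocation happens; starting from the default capacity of 4, the same 16 additions trigger two reallocations.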

RE: [PATCH] drm/radeon: replace 1-element arrays with flexible-array members

2023-10-27 Thread Deucher, Alexander
[Public] > -----Original Message----- > From: José Pekkarinen > Sent: Friday, October 27, 2023 12:59 PM > To: Deucher, Alexander ; Koenig, Christian > ; Pan, Xinhui ; > sk...@linuxfoundation.org > Cc: José Pekkarinen ; airl...@gmail.com; > dan...@ffwll.ch; amd-gfx@lists.freedesktop.org; dri- > de

[PATCH] drm/radeon: replace 1-element arrays with flexible-array members

2023-10-27 Thread José Pekkarinen
As reported by coccinelle, this patch converts the following 1-element arrays to flexible-array members. drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) dri
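The conversion from a 1-element trailing array to a flexible-array member can be sketched as follows. The struct demo_clock_table is a hypothetical example, not one of the atombios.h structures touched by the patch, and the allocation helper assumes a userspace malloc rather than kernel allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical table. The old style would declare `unsigned short entries[1];`
 * and index past it; the flexible-array member states the intent directly.
 */
struct demo_clock_table {
    unsigned char num_entries;
    unsigned short entries[];   /* flexible-array member (C99) */
};

/* Allocate a table with room for n entries, all set to zero. */
struct demo_clock_table *demo_clock_table_alloc(unsigned char n)
{
    struct demo_clock_table *t =
        malloc(sizeof(*t) + (size_t)n * sizeof(t->entries[0]));
    if (!t)
        return NULL;
    t->num_entries = n;
    for (unsigned char i = 0; i < n; i++)
        t->entries[i] = 0;
    return t;
}

/* Bounds-checked read of one entry. */
unsigned short demo_clock_table_get(const struct demo_clock_table *t,
                                    unsigned char idx)
{
    assert(t != NULL && idx < t->num_entries);
    return t->entries[idx];
}
```

Because the array no longer has a declared length of 1, bounds-checking tools such as UBSAN have no fixed array bound to flag when index 1 and above are accessed, and sizeof on the struct no longer counts a phantom first element.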

[PATCHv2 1/2] drm/amdkfd: Populate cache info for GFX 9.4.3

2023-10-27 Thread Mukul Joshi
GFX 9.4.3 uses a new version of the GC info table which contains the cache info. This patch adds a new function to populate the cache info from IP discovery for GFX 9.4.3. Signed-off-by: Mukul Joshi --- v1->v2: - Separate out the original patch into 2 patches. drivers/gpu/drm/amd/amdkfd/kfd_cra

[PATCHv2 2/2] drm/amdkfd: Update cache info for GFX 9.4.3

2023-10-27 Thread Mukul Joshi
Update cache info reporting based on compute and memory partitioning modes. Signed-off-by: Mukul Joshi --- v1->v2: - Separate into a separate patch. - Simplify the if condition to reduce indentation and make it logically more clear. drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 18 +

[pull] amdgpu, amdkfd drm-next-6.7

2023-10-27 Thread Alex Deucher
Hi Dave, Sima, Fixes for 6.7. The following changes since commit 5258dfd4a6adb5f45f046b0dd2e73c680f880d9d: usb: typec: altmodes/displayport: fixup drm internal api change vs new user. (2023-10-27 07:55:41 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/

Re: [PATCHv2 1/2] drm/amdkfd: Populate cache info for GFX 9.4.3

2023-10-27 Thread Felix Kuehling
On 2023-10-27 15:04, Mukul Joshi wrote: GFX 9.4.3 uses a new version of the GC info table which contains the cache info. This patch adds a new function to populate the cache info from IP discovery for GFX 9.4.3. Signed-off-by: Mukul Joshi --- v1->v2: - Separate out the original patch into 2 pat

[PATCH] drm/amdgpu: Add xcc instance parameter to *REG32_SOC15_IP_NO_KIQ (v3)

2023-10-27 Thread Victor Lu
The WREG32/RREG32_SOC15_IP_NO_KIQ call is using XCC0's RLCG interface when programming other XCCs. Add xcc instance parameter to them. v3: xcc not needed for MMHUB v2: rebase Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 16 drivers/gpu/drm/amd/amd

[PATCH] drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v4)

2023-10-27 Thread Victor Lu
amdgpu_kiq_wreg/rreg is hardcoded to use MEC engine 0. Add an xcc_id parameter to amdgpu_kiq_wreg/rreg, define W/RREG32_XCC and amdgpu_device_xcc_wreg/rreg to use the new xcc_id parameter. Using amdgpu_sriov_runtime to determine whether to access via kiq or RLC is sufficient for now. v4: avoi

[PATCH] drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait (v3)

2023-10-27 Thread Victor Lu
amdgpu_virt_kiq_reg_write_reg_wait is hardcoded to use MEC engine 0. Add xcc_inst as a parameter to allow it to use different MEC engines. v3: use first xcc for MMHUB in gmc_v9_0_flush_gpu_tlb v2: rebase Signed-off-by: Victor Lu --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 5 +++-- drivers/

[PATCH] drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2)

2023-10-27 Thread Victor Lu
W/RREG32_RLC is hardcoded to use instance 0. W/RREG32_SOC15_RLC should be used instead when inst != 0. v2: rebase Signed-off-by: Victor Lu --- .../drm/amd/amdgpu/amdgpu_amdkfd_gc_9_4_3.c | 38 -- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 40 +-- drivers

[PATCH] drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headers

2023-10-27 Thread Alex Deucher
For pptable structs that use flexible array sizes, use flexible arrays. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926 Signed-off-by: Alex Deucher --- .../drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 4 ++-- .../amd/pm/powerplay/hwmgr/vega10_pptable.h | 24 +

Re: drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headers

2023-10-27 Thread Mario Limonciello
On 10/27/2023 15:41, Alex Deucher wrote: For pptable structs that use flexible array sizes, use flexible arrays. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926 Signed-off-by: Alex Deucher Reviewed-by: Mario Limonciello --- .../drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h

[RFC PATCH] drm/amdkfd: Run restore_workers on freezable WQs

2023-10-27 Thread Felix Kuehling
Make restore workers freezable so we don't have to explicitly flush them in suspend and GPU reset code paths, and we don't accidentally try to restore BOs while the GPU is suspended. Not having to flush restore_work also helps avoid lock/fence dependencies in the GPU reset case where we're not allo