BUG [RESEND]: kernel NULL pointer dereference, address: 0000000000000008

2024-01-22 Thread Mirsad Todorovac
Hi, The last email did not reach most of the recipients because of the banned .xz attachment. As the .config is too big to send either inline or uncompressed, I will omit it in this attempt. In the meantime, I had some success in decoding the stack trace, but sadly it is not complete. I don't think

Re: BUG [RESEND]: kernel NULL pointer dereference, address: 0000000000000008

2024-01-22 Thread Ma, Jun
Perhaps similar to the problem I encountered earlier, you can try the following patch https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html Regards, Ma Jun On 1/21/2024 3:54 AM, Mirsad Todorovac wrote: > Hi, > > The last email did not pass to the most of the recipients due to

[PATCH 1/2] drm/amdgpu/pm: Add default case for smu IH process func

2024-01-22 Thread Ma Jun
Add default case for smu IH process func. Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 4 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 4 drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 5 - 3 files changed, 12 insertions(+), 1 d

[PATCH 2/2] drm/amdgpu/pm: Use macro definitions in the smu IH process function

2024-01-22 Thread Ma Jun
Replace the hard-coded numbers with macro definition Signed-off-by: Ma Jun --- .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h | 11 +-- .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h | 11 --- drivers/gpu/drm/amd/pm/swsmu/inc/smu_v11_0.h | 5 + drivers/gpu/drm/

[PATCH v2] drm/amdkfd: reserve the BO before validating it

2024-01-22 Thread Lang Yu
Fixes: 410f08516e0f ("drm/amdkfd: Move dma unmapping after TLB flush") v2: Avoid unmapping attachment twice when ERESTARTSYS. [ 41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.708989] Call Trace: [ 41.708992] [ 41.708996]

Re: [PATCH] mm: Remove double faults once write a device pfn

2024-01-22 Thread Christian König
On 22.01.24 04:32, Xianrong Zhou wrote: The vmf_insert_pfn_prot could cause unnecessary double faults on a device pfn. Because currently the vmf_insert_pfn_prot does not make the pfn writable so the pte entry is normally read-only or dirty catching. What? How did you get to this conclusion?

Re: [PATCH v2 1/2] drm/amdgpu: Reset IH OVERFLOW_CLEAR bit

2024-01-22 Thread Christian König
On 19.01.24 15:38, Alex Deucher wrote: On Fri, Jan 19, 2024 at 3:11 AM Christian König wrote: On 18.01.24 19:54, Friedrich Vock wrote: Allows us to detect subsequent IH ring buffer overflows as well. Cc: Joshua Ashton Cc: Alex Deucher Cc: Christian König Cc: sta...@vger.kernel.or

[PATCH] drm/amdgpu: Fix null pointer dereference

2024-01-22 Thread Hawking Zhang
amdgpu_reg_state_sysfs_fini could be invoked when asic_func is not even initialized, i.e., when amdgpu_discovery_init fails for some reason. Signed-off-by: Hawking Zhang --- drivers/gpu/drm/amd/include/amdgpu_reg_state.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/

RE: [PATCH] drm/amdgpu: Fix null pointer dereference

2024-01-22 Thread Lazar, Lijo
[AMD Official Use Only - General] Reviewed-by: Lijo Lazar Thanks, Lijo -Original Message- From: Zhang, Hawking Sent: Monday, January 22, 2024 3:27 PM To: amd-gfx@lists.freedesktop.org; Lazar, Lijo ; Deucher, Alexander ; Ma, Le Cc: Zhang, Hawking Subject: [PATCH] drm/amdgpu: Fix null

Re: [PATCH 1/2] drm/amdgpu: Reset IH OVERFLOW_CLEAR bit after writing rptr

2024-01-22 Thread Christian König
On 19.01.24 20:18, Felix Kuehling wrote: On 2024-01-18 07:07, Christian König wrote: On 18.01.24 00:44, Friedrich Vock wrote: On 18.01.24 00:00, Alex Deucher wrote: [SNIP] Right now, IH overflows, even if they occur repeatedly, only get registered once. If not registering IH overflows

Re: [PATCH 1/2] drm/amdgpu: Reset IH OVERFLOW_CLEAR bit after writing rptr

2024-01-22 Thread Friedrich Vock
On 22.01.24 11:10, Christian König wrote: On 19.01.24 20:18, Felix Kuehling wrote: On 2024-01-18 07:07, Christian König wrote: On 18.01.24 00:44, Friedrich Vock wrote: On 18.01.24 00:00, Alex Deucher wrote: [SNIP] Right now, IH overflows, even if they occur repeatedly, only get regis

Re: [PATCH 1/2] drm/amdgpu: Reset IH OVERFLOW_CLEAR bit after writing rptr

2024-01-22 Thread Friedrich Vock
On 22.01.24 11:21, Friedrich Vock wrote: On 22.01.24 11:10, Christian König wrote: On 19.01.24 20:18, Felix Kuehling wrote: On 2024-01-18 07:07, Christian König wrote: On 18.01.24 00:44, Friedrich Vock wrote: On 18.01.24 00:00, Alex Deucher wrote: [SNIP] Right now, IH overflows, eve

Re: [PATCH] drm/amdgpu: check flag ring->no_scheduler before usage

2024-01-22 Thread Christian König
On 21.01.24 01:19, vitaly.pros...@amd.com wrote: From: Vitaly Prosyak The issue started to appear after the following commit 11b3b9f461c5c4f700f6c8da202fcc2fd6418e1f (scheduler to variable number of run-queues). The scheduler flag ready (ring->sched.ready) could not be used to val

Re: [PATCH] drm/amd/display: Address kdoc for eDP Panel Replay feature in 'amdgpu_dm_crtc_set_panel_sr_feature()'

2024-01-22 Thread Chung, ChiaHsuan (Tom)
Reviewed-by: Tom Chung On 1/22/2024 12:14 PM, Srinivasan Shanmugam wrote: Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_crtc.c:100: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * The DRM vb

Re: [PATCH 2/2] drm/amdgpu/pm: Use macro definitions in the smu IH process function

2024-01-22 Thread Lazar, Lijo
On 1/22/2024 2:12 PM, Ma Jun wrote: > Replace the hard-coded numbers with macro definition > > Signed-off-by: Ma Jun > --- > .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_0.h | 11 +-- > .../pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_7.h | 11 --- > drivers/gpu/drm/amd/pm/sws

Re: [PATCH 1/2] drm/amdgpu: Reset IH OVERFLOW_CLEAR bit after writing rptr

2024-01-22 Thread Christian König
On 22.01.24 11:45, Friedrich Vock wrote: On 22.01.24 11:21, Friedrich Vock wrote: On 22.01.24 11:10, Christian König wrote: On 19.01.24 20:18, Felix Kuehling wrote: On 2024-01-18 07:07, Christian König wrote: On 18.01.24 00:44, Friedrich Vock wrote: On 18.01.24 00:00, Alex Deuche

Re: [PATCH 1/2] drm/amdgpu/pm: Add default case for smu IH process func

2024-01-22 Thread Alex Deucher
On Mon, Jan 22, 2024 at 3:52 AM Ma Jun wrote: > > Add default case for smu IH process func. > > Signed-off-by: Ma Jun > --- > drivers/gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c | 4 > drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 4 > drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v

[PATCH AUTOSEL 6.7 13/88] drm/amd/display: Fix tiled display misalignment

2024-01-22 Thread Sasha Levin
From: Meenakshikumar Somasundaram [ Upstream commit c4b8394e76adba4f50a3c2696c75b214a291e24a ] [Why] When otg workaround is applied during clock update, otgs of tiled display went out of sync. [How] To call dc_trigger_sync() after clock update to sync otgs again. Reviewed-by: Nicholas Kazlausk

[PATCH AUTOSEL 6.7 18/88] drm/amd/display: Fix MST PBN/X.Y value calculations

2024-01-22 Thread Sasha Levin
From: Ilya Bakoulin [ Upstream commit 94bbf802efd0a8f13147d6664af6e653637340a8 ] Changing PBN calculation to be more in line with spec. We don't need to inflate PBN_NATIVE value by the 1.006 margin, since that is already taken care of in the get_pbn_per_slot function. Tested-by: Daniel Wheeler

[PATCH AUTOSEL 6.7 19/88] drm/amd/display: Fix disable_otg_wa logic

2024-01-22 Thread Sasha Levin
From: Nicholas Susanto [ Upstream commit 2ce156482a6fef349d2eba98e5070c412d3af662 ] [Why] When switching to another HDMI mode, we are unnecessarily disabling/enabling FIFO causing both HPO and DIG registers to be set at the same time when only HPO is supposed to be set. This can lead to a syste

[PATCH AUTOSEL 6.7 20/88] drm/amd/display: Fix Replay Desync Error IRQ handler

2024-01-22 Thread Sasha Levin
From: Dennis Chan [ Upstream commit dd5c6362ddcd8bdb07704faff8648593885ecfa1 ] Previously, Replay didn't identify the IRQ type. This commit fixes the issue for the interrupt. Tested-by: Daniel Wheeler Reviewed-by: Robin Chen Acked-by: Rodrigo Siqueira Signed-off-by: Dennis Chan Signe

[PATCH AUTOSEL 6.7 17/88] drm/amd/display: initialize all the dpm level's stutter latency

2024-01-22 Thread Sasha Levin
From: Charlene Liu [ Upstream commit 885c71ad791c1709f668a37f701d33e6872a902f ] Fix an issue when the override level is bigger than the default. Levels 5, 6, and 7 had zero stutter latency because the override level was initialized after the stutter latency inits. Tested-by: Daniel Wheeler Reviewed-by: S

[PATCH AUTOSEL 6.7 21/88] drm/amd/display: add support for DTO genarated dscclk

2024-01-22 Thread Sasha Levin
From: Wenjing Liu [ Upstream commit 08a32addf17317b9fac55be9b31275cbf6e41fb7 ] The current implementation will choose to use refclk as dscclk. This is not recommended by the hardware team, as refclk is a fixed value which could cause unnecessary power consumption or might not be enough for large DSC t

[PATCH AUTOSEL 6.7 22/88] drm/amd/display: Fix writeback_info never got updated

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit c09919e6ea5fefd49d8b7b54aa5b222937163108 ] [WHY] wb_enabled field is set to false before it is used, and the following code will never be executed. [HOW] Setting wb_enable to false after all removal work is completed. Tested-by: Daniel Wheeler Reviewed-by: Ha

[PATCH AUTOSEL 6.7 23/88] drm/amd/display: Fix writeback_info is not removed

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit ab37b88ed9de9de8d582683f7ea17059f1251a7f ] [WHY] Counter j was not updated to present the num of writeback_info when writeback pipes are removed. [HOW] update j (num of writeback info) under the correct condition. Tested-by: Daniel Wheeler Reviewed-by: Harry

[PATCH AUTOSEL 6.7 50/88] drm/amd/display: For prefetch mode > 0, extend prefetch if possible

2024-01-22 Thread Sasha Levin
From: Alvin Lee [ Upstream commit dd4e4bb28843393065eed279e869fac248d03f0f ] [Description] For mode programming we want to extend the prefetch as much as possible (up to oto, or as long as we can for equ) if we're not already applying the 60us prefetch requirement. This is to avoid intermittent

[PATCH AUTOSEL 6.7 52/88] drm/amdkfd: fix mes set shader debugger process management

2024-01-22 Thread Sasha Levin
From: Jonathan Kim [ Upstream commit bd33bb1409b494558a2935f7bbc7842def957fcd ] MES provides the driver a call to explicitly flush stale process memory within the MES to avoid a race condition that results in a fatal memory violation. When SET_SHADER_DEBUGGER is called, the driver passes a memo

[PATCH AUTOSEL 6.7 51/88] drm/amd/display: Force p-state disallow if leaving no plane config

2024-01-22 Thread Sasha Levin
From: Alvin Lee [ Upstream commit 9a902a9073c287353e25913c0761bfed49d75a88 ] [Description] - When we're in a no plane config, DCN is always asserting P-State allow - This creates a scenario where the P-State blackout can start just as VUPDATE takes place and transitions the DCN config to a

[PATCH] drm/amdgpu: covert some variable sized arrays to [] style

2024-01-22 Thread Alex Deucher
Replace [1] with []. Silences UBSAN warnings. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3107 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/include/pptable.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/include/pptable.h b/drive

[PATCH AUTOSEL 6.7 58/88] drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap

2024-01-22 Thread Sasha Levin
From: "Wang, Beyond" [ Upstream commit 94aeb4117343d072e3a35b9595bcbfc0058ee724 ] Issue: when an evict or validate happens on an amdgpu_bo, 'from' and 'to' are always the same in the amdgpu_bo_move ftrace event. Where 'trace_amdgpu_bo_move' is called, the comment says move_notify is called before m

[PATCH AUTOSEL 6.7 67/88] drm/amd/display: fix usb-c connector_type

2024-01-22 Thread Sasha Levin
From: Allen Pan [ Upstream commit 0d26644bc57d8737c8e2fb3145366f7d0b941935 ] [why] BIOS switches to use USB-C connector type 0x18, but the VBIOS's objectInfo table does not support it yet. The driver needs to patch it based on enc_cap from the system integration info table. Reviewed-by: Charlene Liu Acked-by: W

[PATCH AUTOSEL 6.7 68/88] drm/amd/display: Fix lightup regression with DP2 single display configs

2024-01-22 Thread Sasha Levin
From: Michael Strauss [ Upstream commit 5a82b8d6c05f9b30828ede1b103b9ee5cb5c912e ] [WHY] Previous fix for multiple displays downstream of DP2 MST hub caused regression [HOW] Match sink IDs instead of sink struct addresses Reviewed-by: Nicholas Kazlauskas Reviewed-by: Charlene Liu Acked-by: W

[PATCH AUTOSEL 6.7 66/88] drm/amd/display: make flip_timestamp_in_us a 64-bit variable

2024-01-22 Thread Sasha Levin
From: Josip Pavic [ Upstream commit 6fb12518ca58412dc51054e2a7400afb41328d85 ] [Why] This variable currently overflows after about 71 minutes. This doesn't cause any known functional issues but it does make debugging more difficult. [How] Make it a 64-bit variable. Reviewed-by: Aric Cyr Acked
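The "about 71 minutes" figure above is consistent with a microsecond counter held in an unsigned 32-bit variable. The snippet does not state the original type, so that width is an assumption here, and the helper names below are hypothetical. A minimal sketch of the arithmetic:

```python
# Assumption: the timestamp was an unsigned 32-bit microsecond counter.
# The snippet above does not state the original type; names are hypothetical.
US_PER_MINUTE = 60 * 1_000_000


def wrap_minutes(bits: int) -> float:
    """Minutes until an unsigned microsecond counter of `bits` bits wraps."""
    return (2 ** bits) / US_PER_MINUTE


def wrap_u32(value_us: int) -> int:
    """Model storing a microsecond count in an unsigned 32-bit variable."""
    return value_us & 0xFFFFFFFF


if __name__ == "__main__":
    print(f"32-bit counter wraps after {wrap_minutes(32):.1f} minutes")
    # One tick past the 32-bit range reads back as a tiny value again.
    print(f"2**32 + 5 us stored in 32 bits reads as {wrap_u32(2**32 + 5)} us")
```

Under the same assumption, widening the variable to 64 bits moves the wrap point far beyond any realistic uptime, which matches the stated goal of easier debugging.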

[PATCH AUTOSEL 6.7 75/88] drm/amdgpu: Let KFD sync with VM fences

2024-01-22 Thread Sasha Levin
From: Felix Kuehling [ Upstream commit ec9ba4821fa52b5efdbc4cdf0a77497990655231 ] Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tab

[PATCH AUTOSEL 6.7 74/88] drm/amd/display: Fix minor issues in BW Allocation Phase2

2024-01-22 Thread Sasha Levin
From: Meenakshikumar Somasundaram [ Upstream commit aa5dc05340eb97486a631ce6bccb8d020bf6b56b ] [Why] Fix minor issues in BW Allocation Phase2. [How] - In set_usb4_req_bw_req(), link->dpia_bw_alloc_config.response_ready flag should be reset before writing DPCD REQUEST_BW. - Fix the granularity

[PATCH AUTOSEL 6.7 76/88] drm/amd/display: Fixing stream allocation regression

2024-01-22 Thread Sasha Levin
From: Relja Vojvodic [ Upstream commit 292c2116b2ae84c7e799ae340981e60551b18f5e ] For certain dual display configs that had one display using a 1080p mode, the DPM level used to drive the configs regressed from DPM 0 to DPM 3. This was caused by a missing check that should have only limited the

[PATCH AUTOSEL 6.7 69/88] drm/amd/display: Only clear symclk otg flag for HDMI

2024-01-22 Thread Sasha Levin
From: Alvin Lee [ Upstream commit dff45f03f508c92cd8eb2050e27b726726b8ae0b ] [Description] There is a corner case where the symclk otg flag is cleared when disabling the phantom pipe for subvp (because the phantom and main pipe share the same link). This is undesired because we need the maintain

[PATCH AUTOSEL 6.7 78/88] drm/amdgpu: Fix possible NULL dereference in amdgpu_ras_query_error_status_helper()

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit b8d55a90fd55b767c25687747e2b24abd1ef8680 ] Return invalid error code -EINVAL for invalid block id. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1183 amdgpu_ras_query_error_status_helper() error: we previously assumed 'info' could be nu

[PATCH AUTOSEL 6.7 77/88] Re-revert "drm/amd/display: Enable Replay for static screen use cases"

2024-01-22 Thread Sasha Levin
From: Ivan Lipski [ Upstream commit d6398866a6b47e92319ef6efdb0126a4fbb7796a ] This reverts commit 44e60b14d5a72f91fd0bdeae8da59ae37a3ca8e5, since it causes a regression in which eDP displays with PSR support but no Replay support (Sink support <= 0x03) fail to enable PSR and consequently al

[PATCH AUTOSEL 6.7 73/88] drm/amdgpu: Fix ecc irq enable/disable unpaired

2024-01-22 Thread Sasha Levin
From: "Stanley.Yang" [ Upstream commit a32c6f7f5737cc7e31cd7ad5133f0d96fca12ea6 ] The ecc_irq is disabled during the GPU mode2 reset suspend process, but is not enabled again during the GPU mode2 reset resume process. Changed from V1: only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip

[PATCH AUTOSEL 6.7 79/88] drm/amdgpu: Fix variable 'mca_funcs' dereferenced before NULL check in 'amdgpu_mca_smu_get_mca_entry()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit 4f32504a2f85a7b40fe149436881381f48e9c0c0 ] Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c:377 amdgpu_mca_smu_get_mca_entry() warn: variable dereferenced before check 'mca_funcs' (see line 368) 357 int amdgpu_mca_smu_get_mca_entry(struct a

[PATCH AUTOSEL 6.7 80/88] drm/amdgpu: Fix '*fw' from request_firmware() not released in 'amdgpu_ucode_request()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit 13a1851f923d9a7a78a477497295c2dfd16ad4a4 ] Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:1404 amdgpu_ucode_request() warn: '*fw' from request_firmware() not released on lines: 1404. Cc: Mario Limonciello Cc: Lijo Lazar Cc: Christian K

[PATCH AUTOSEL 6.7 82/88] drm/amdkfd: Fix iterator used outside loop in 'kfd_add_peer_prop()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit b1a428b45dc7e47c7acc2ad0d08d8a6dda910c4c ] Fix the following about iterator use: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1456 kfd_add_peer_prop() warn: iterator used outside loop: 'iolink3' Cc: Felix Kuehling Cc: Christian König Cc: Al

[PATCH AUTOSEL 6.7 83/88] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"

2024-01-22 Thread Sasha Levin
From: Kaibo Ma [ Upstream commit 0f35b0a7b8fa402adbffa2565047cdcc4c480153 ] That commit causes NULL pointer dereferences in dmesgs when running applications using ROCm, including clinfo, blender, and PyTorch, since v6.6.1. Revert it to fix blender again. This reverts commit 96c211f1f9ef82183493

[PATCH AUTOSEL 6.7 81/88] drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit bf2ad4fb8adca89374b54b225d494e0b1956dbea ] Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fe

[PATCH AUTOSEL 6.7 84/88] drm/amdgpu: apply the RV2 system aperture fix to RN/CZN as well

2024-01-22 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 16783d8ef08448815e149e40c82fc1e1fc41ddbf ] These chips need the same fix. This was previously not seen on them since the AGP aperture expanded the system aperture, but this showed up again when AGP was disabled. Reviewed-and-tested-by: Jiadong Zhu Signed-

[PATCH AUTOSEL 6.6 11/73] drm/amd/display: Fix tiled display misalignment

2024-01-22 Thread Sasha Levin
From: Meenakshikumar Somasundaram [ Upstream commit c4b8394e76adba4f50a3c2696c75b214a291e24a ] [Why] When otg workaround is applied during clock update, otgs of tiled display went out of sync. [How] To call dc_trigger_sync() after clock update to sync otgs again. Reviewed-by: Nicholas Kazlausk

[PATCH AUTOSEL 6.6 16/73] drm/amd/display: Fix writeback_info never got updated

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit 8a30c36e15f38c9f23778babcd368144c7d8 ] [WHY] wb_enabled field is set to false before it is used, and the following code will never be executed. [HOW] Setting wb_enable to false after all removal work is completed. Reviewed-by: Harry Wentland Signed-off-by

[PATCH AUTOSEL 6.6 15/73] drm/amd/display: Fix MST PBN/X.Y value calculations

2024-01-22 Thread Sasha Levin
From: Ilya Bakoulin [ Upstream commit 94bbf802efd0a8f13147d6664af6e653637340a8 ] Changing PBN calculation to be more in line with spec. We don't need to inflate PBN_NATIVE value by the 1.006 margin, since that is already taken care of in the get_pbn_per_slot function. Tested-by: Daniel Wheeler

[PATCH AUTOSEL 6.6 17/73] drm/amd/display: Fix writeback_info is not removed

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit 5b89d2ccc8466e0445a4994cb288fc009b565de5 ] [WHY] Counter j was not updated to present the num of writeback_info when writeback pipes are removed. [HOW] update j (num of writeback info) under the correct condition. Reviewed-by: Harry Wentland Signed-off-by: Al

[PATCH AUTOSEL 6.6 44/73] drm/amd/display: Force p-state disallow if leaving no plane config

2024-01-22 Thread Sasha Levin
From: Alvin Lee [ Upstream commit 9a902a9073c287353e25913c0761bfed49d75a88 ] [Description] - When we're in a no plane config, DCN is always asserting P-State allow - This creates a scenario where the P-State blackout can start just as VUPDATE takes place and transitions the DCN config to a

[PATCH AUTOSEL 6.6 45/73] drm/amdkfd: fix mes set shader debugger process management

2024-01-22 Thread Sasha Levin
From: Jonathan Kim [ Upstream commit bd33bb1409b494558a2935f7bbc7842def957fcd ] MES provides the driver a call to explicitly flush stale process memory within the MES to avoid a race condition that results in a fatal memory violation. When SET_SHADER_DEBUGGER is called, the driver passes a memo

[PATCH AUTOSEL 6.6 43/73] drm/amd/display: For prefetch mode > 0, extend prefetch if possible

2024-01-22 Thread Sasha Levin
From: Alvin Lee [ Upstream commit dd4e4bb28843393065eed279e869fac248d03f0f ] [Description] For mode programming we want to extend the prefetch as much as possible (up to oto, or as long as we can for equ) if we're not already applying the 60us prefetch requirement. This is to avoid intermittent

[PATCH AUTOSEL 6.6 50/73] drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap

2024-01-22 Thread Sasha Levin
From: "Wang, Beyond" [ Upstream commit 94aeb4117343d072e3a35b9595bcbfc0058ee724 ] Issue: when an evict or validate happens on an amdgpu_bo, 'from' and 'to' are always the same in the amdgpu_bo_move ftrace event. Where 'trace_amdgpu_bo_move' is called, the comment says move_notify is called before m

[PATCH AUTOSEL 6.6 56/73] drm/amd/display: make flip_timestamp_in_us a 64-bit variable

2024-01-22 Thread Sasha Levin
From: Josip Pavic [ Upstream commit 6fb12518ca58412dc51054e2a7400afb41328d85 ] [Why] This variable currently overflows after about 71 minutes. This doesn't cause any known functional issues but it does make debugging more difficult. [How] Make it a 64-bit variable. Reviewed-by: Aric Cyr Acked

[PATCH AUTOSEL 6.6 57/73] drm/amd/display: Only clear symclk otg flag for HDMI

2024-01-22 Thread Sasha Levin
From: Alvin Lee [ Upstream commit dff45f03f508c92cd8eb2050e27b726726b8ae0b ] [Description] There is a corner case where the symclk otg flag is cleared when disabling the phantom pipe for subvp (because the phantom and main pipe share the same link). This is undesired because we need the maintain

[PATCH AUTOSEL 6.6 60/73] drm/amdgpu: Fix ecc irq enable/disable unpaired

2024-01-22 Thread Sasha Levin
From: "Stanley.Yang" [ Upstream commit a32c6f7f5737cc7e31cd7ad5133f0d96fca12ea6 ] The ecc_irq is disabled during the GPU mode2 reset suspend process, but is not enabled again during the GPU mode2 reset resume process. Changed from V1: only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip

[PATCH AUTOSEL 6.6 62/73] drm/amdgpu: Let KFD sync with VM fences

2024-01-22 Thread Sasha Levin
From: Felix Kuehling [ Upstream commit ec9ba4821fa52b5efdbc4cdf0a77497990655231 ] Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tab

[PATCH AUTOSEL 6.6 61/73] drm/amd/display: Fix minor issues in BW Allocation Phase2

2024-01-22 Thread Sasha Levin
From: Meenakshikumar Somasundaram [ Upstream commit aa5dc05340eb97486a631ce6bccb8d020bf6b56b ] [Why] Fix minor issues in BW Allocation Phase2. [How] - In set_usb4_req_bw_req(), link->dpia_bw_alloc_config.response_ready flag should be reset before writing DPCD REQUEST_BW. - Fix the granularity

[PATCH AUTOSEL 6.6 66/73] drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit bf2ad4fb8adca89374b54b225d494e0b1956dbea ] Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fe

[PATCH AUTOSEL 6.6 67/73] drm/amdkfd: Fix iterator used outside loop in 'kfd_add_peer_prop()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit b1a428b45dc7e47c7acc2ad0d08d8a6dda910c4c ] Fix the following about iterator use: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1456 kfd_add_peer_prop() warn: iterator used outside loop: 'iolink3' Cc: Felix Kuehling Cc: Christian König Cc: Al

[PATCH AUTOSEL 6.6 63/73] drm/amd/display: Fixing stream allocation regression

2024-01-22 Thread Sasha Levin
From: Relja Vojvodic [ Upstream commit 292c2116b2ae84c7e799ae340981e60551b18f5e ] For certain dual display configs that had one display using a 1080p mode, the DPM level used to drive the configs regressed from DPM 0 to DPM 3. This was caused by a missing check that should have only limited the

[PATCH AUTOSEL 6.6 65/73] drm/amdgpu: Fix '*fw' from request_firmware() not released in 'amdgpu_ucode_request()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit 13a1851f923d9a7a78a477497295c2dfd16ad4a4 ] Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:1404 amdgpu_ucode_request() warn: '*fw' from request_firmware() not released on lines: 1404. Cc: Mario Limonciello Cc: Lijo Lazar Cc: Christian K

[PATCH AUTOSEL 6.6 64/73] Re-revert "drm/amd/display: Enable Replay for static screen use cases"

2024-01-22 Thread Sasha Levin
From: Ivan Lipski [ Upstream commit d6398866a6b47e92319ef6efdb0126a4fbb7796a ] This reverts commit 44e60b14d5a72f91fd0bdeae8da59ae37a3ca8e5, since it causes a regression in which eDP displays with PSR support but no Replay support (Sink support <= 0x03) fail to enable PSR and consequently al

[PATCH AUTOSEL 6.6 68/73] Revert "drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole"

2024-01-22 Thread Sasha Levin
From: Kaibo Ma [ Upstream commit 0f35b0a7b8fa402adbffa2565047cdcc4c480153 ] That commit causes NULL pointer dereferences in dmesgs when running applications using ROCm, including clinfo, blender, and PyTorch, since v6.6.1. Revert it to fix blender again. This reverts commit 96c211f1f9ef82183493

[PATCH AUTOSEL 6.6 69/73] drm/amdgpu: apply the RV2 system aperture fix to RN/CZN as well

2024-01-22 Thread Sasha Levin
From: Alex Deucher [ Upstream commit 16783d8ef08448815e149e40c82fc1e1fc41ddbf ] These chips need the same fix. This was previously not seen on them since the AGP aperture expanded the system aperture, but this showed up again when AGP was disabled. Reviewed-and-tested-by: Jiadong Zhu Signed-

[PATCH AUTOSEL 6.1 10/53] drm/amd/display: Fix tiled display misalignment

2024-01-22 Thread Sasha Levin
From: Meenakshikumar Somasundaram [ Upstream commit c4b8394e76adba4f50a3c2696c75b214a291e24a ] [Why] When otg workaround is applied during clock update, otgs of tiled display went out of sync. [How] To call dc_trigger_sync() after clock update to sync otgs again. Reviewed-by: Nicholas Kazlausk

[PATCH AUTOSEL 6.1 14/53] drm/amd/display: Fix writeback_info never got updated

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit 8a30c36e15f38c9f23778babcd368144c7d8 ] [WHY] wb_enabled field is set to false before it is used, and the following code will never be executed. [HOW] Setting wb_enable to false after all removal work is completed. Reviewed-by: Harry Wentland Signed-off-by

[PATCH AUTOSEL 6.1 15/53] drm/amd/display: Fix writeback_info is not removed

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit 5b89d2ccc8466e0445a4994cb288fc009b565de5 ] [WHY] Counter j was not updated to present the num of writeback_info when writeback pipes are removed. [HOW] update j (num of writeback info) under the correct condition. Reviewed-by: Harry Wentland Signed-off-by: Al

[PATCH AUTOSEL 6.1 35/53] drm/amd/display: For prefetch mode > 0, extend prefetch if possible

2024-01-22 Thread Sasha Levin
From: Alvin Lee [ Upstream commit dd4e4bb28843393065eed279e869fac248d03f0f ] [Description] For mode programming we want to extend the prefetch as much as possible (up to oto, or as long as we can for equ) if we're not already applying the 60us prefetch requirement. This is to avoid intermittent

[PATCH AUTOSEL 6.1 44/53] drm/amdgpu: Fix ecc irq enable/disable unpaired

2024-01-22 Thread Sasha Levin
From: "Stanley.Yang" [ Upstream commit a32c6f7f5737cc7e31cd7ad5133f0d96fca12ea6 ] The ecc_irq is disabled during the GPU mode2 reset suspend process, but is not enabled again during the GPU mode2 reset resume process. Changed from V1: only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip

[PATCH AUTOSEL 6.1 45/53] drm/amdgpu: Let KFD sync with VM fences

2024-01-22 Thread Sasha Levin
From: Felix Kuehling [ Upstream commit ec9ba4821fa52b5efdbc4cdf0a77497990655231 ] Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tab

[PATCH AUTOSEL 6.1 46/53] drm/amd/display: Fixing stream allocation regression

2024-01-22 Thread Sasha Levin
From: Relja Vojvodic [ Upstream commit 292c2116b2ae84c7e799ae340981e60551b18f5e ] For certain dual display configs that had one display using a 1080p mode, the DPM level used to drive the configs regressed from DPM 0 to DPM 3. This was caused by a missing check that should have only limited the

[PATCH AUTOSEL 6.1 38/53] drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap

2024-01-22 Thread Sasha Levin
From: "Wang, Beyond" [ Upstream commit 94aeb4117343d072e3a35b9595bcbfc0058ee724 ] Issue: when an evict or validate happens on an amdgpu_bo, 'from' and 'to' are always the same in the amdgpu_bo_move ftrace event. Where 'trace_amdgpu_bo_move' is called, the comment says move_notify is called before m

[PATCH AUTOSEL 6.1 42/53] drm/amd/display: make flip_timestamp_in_us a 64-bit variable

2024-01-22 Thread Sasha Levin
From: Josip Pavic [ Upstream commit 6fb12518ca58412dc51054e2a7400afb41328d85 ] [Why] This variable currently overflows after about 71 minutes. This doesn't cause any known functional issues but it does make debugging more difficult. [How] Make it a 64-bit variable. Reviewed-by: Aric Cyr Acked

[PATCH AUTOSEL 6.1 47/53] drm/amdgpu: Fix '*fw' from request_firmware() not released in 'amdgpu_ucode_request()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit 13a1851f923d9a7a78a477497295c2dfd16ad4a4 ] Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c:1404 amdgpu_ucode_request() warn: '*fw' from request_firmware() not released on lines: 1404. Cc: Mario Limonciello Cc: Lijo Lazar Cc: Christian K

[PATCH AUTOSEL 6.1 49/53] drm/amdkfd: Fix iterator used outside loop in 'kfd_add_peer_prop()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit b1a428b45dc7e47c7acc2ad0d08d8a6dda910c4c ] Fix the following about iterator use: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1456 kfd_add_peer_prop() warn: iterator used outside loop: 'iolink3' Cc: Felix Kuehling Cc: Christian König Cc: Al

[PATCH AUTOSEL 6.1 48/53] drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit bf2ad4fb8adca89374b54b225d494e0b1956dbea ] Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fe

[PATCH AUTOSEL 5.15 07/35] drm/amd/display: Fix tiled display misalignment

2024-01-22 Thread Sasha Levin
From: Meenakshikumar Somasundaram [ Upstream commit c4b8394e76adba4f50a3c2696c75b214a291e24a ] [Why] When otg workaround is applied during clock update, otgs of tiled display went out of sync. [How] To call dc_trigger_sync() after clock update to sync otgs again. Reviewed-by: Nicholas Kazlausk

[PATCH AUTOSEL 5.15 09/35] drm/amd/display: Fix writeback_info never got updated

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit 8a30c36e15f38c9f23778babcd368144c7d8 ] [WHY] wb_enabled field is set to false before it is used, and the following code will never be executed. [HOW] Setting wb_enable to false after all removal work is completed. Reviewed-by: Harry Wentland Signed-off-by

[PATCH AUTOSEL 5.15 26/35] drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap

2024-01-22 Thread Sasha Levin
From: "Wang, Beyond" [ Upstream commit 94aeb4117343d072e3a35b9595bcbfc0058ee724 ] Issue: during evict or validate happened on amdgpu_bo, the 'from' and 'to' is always same in ftrace event of amdgpu_bo_move where calling the 'trace_amdgpu_bo_move', the comment says move_notify is called before m

[PATCH AUTOSEL 5.15 30/35] drm/amd/display: make flip_timestamp_in_us a 64-bit variable

2024-01-22 Thread Sasha Levin
From: Josip Pavic [ Upstream commit 6fb12518ca58412dc51054e2a7400afb41328d85 ] [Why] This variable currently overflows after about 71 minutes. This doesn't cause any known functional issues but it does make debugging more difficult. [How] Make it a 64-bit variable. Reviewed-by: Aric Cyr Acked

[PATCH AUTOSEL 5.15 34/35] drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit bf2ad4fb8adca89374b54b225d494e0b1956dbea ] Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fe

[PATCH AUTOSEL 5.15 33/35] drm/amdgpu: Let KFD sync with VM fences

2024-01-22 Thread Sasha Levin
From: Felix Kuehling [ Upstream commit ec9ba4821fa52b5efdbc4cdf0a77497990655231 ] Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tab

[PATCH AUTOSEL 5.10 07/28] drm/amd/display: Fix tiled display misalignment

2024-01-22 Thread Sasha Levin
From: Meenakshikumar Somasundaram [ Upstream commit c4b8394e76adba4f50a3c2696c75b214a291e24a ] [Why] When otg workaround is applied during clock update, otgs of tiled display went out of sync. [How] To call dc_trigger_sync() after clock update to sync otgs again. Reviewed-by: Nicholas Kazlausk

[PATCH AUTOSEL 5.10 09/28] drm/amd/display: Fix writeback_info never got updated

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit 8a30c36e15f38c9f23778babcd368144c7d8 ] [WHY] wb_enabled field is set to false before it is used, and the following code will never be executed. [HOW] Setting wb_enable to false after all removal work is completed. Reviewed-by: Harry Wentland Signed-off-by

[PATCH AUTOSEL 5.10 26/28] drm/amd/display: make flip_timestamp_in_us a 64-bit variable

2024-01-22 Thread Sasha Levin
From: Josip Pavic [ Upstream commit 6fb12518ca58412dc51054e2a7400afb41328d85 ] [Why] This variable currently overflows after about 71 minutes. This doesn't cause any known functional issues but it does make debugging more difficult. [How] Make it a 64-bit variable. Reviewed-by: Aric Cyr Acked

[PATCH AUTOSEL 5.10 27/28] drm/amdgpu: Let KFD sync with VM fences

2024-01-22 Thread Sasha Levin
From: Felix Kuehling [ Upstream commit ec9ba4821fa52b5efdbc4cdf0a77497990655231 ] Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tab

[PATCH AUTOSEL 5.10 28/28] drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit bf2ad4fb8adca89374b54b225d494e0b1956dbea ] Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fe

[PATCH AUTOSEL 5.4 06/24] drm/amd/display: Fix writeback_info never got updated

2024-01-22 Thread Sasha Levin
From: Alex Hung [ Upstream commit 8a30c36e15f38c9f23778babcd368144c7d8 ] [WHY] wb_enabled field is set to false before it is used, and the following code will never be executed. [HOW] Setting wb_enable to false after all removal work is completed. Reviewed-by: Harry Wentland Signed-off-by

[PATCH AUTOSEL 5.4 22/24] drm/amd/display: make flip_timestamp_in_us a 64-bit variable

2024-01-22 Thread Sasha Levin
From: Josip Pavic [ Upstream commit 6fb12518ca58412dc51054e2a7400afb41328d85 ] [Why] This variable currently overflows after about 71 minutes. This doesn't cause any known functional issues but it does make debugging more difficult. [How] Make it a 64-bit variable. Reviewed-by: Aric Cyr Acked

[PATCH AUTOSEL 5.4 23/24] drm/amdgpu: Let KFD sync with VM fences

2024-01-22 Thread Sasha Levin
From: Felix Kuehling [ Upstream commit ec9ba4821fa52b5efdbc4cdf0a77497990655231 ] Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tab

[PATCH AUTOSEL 5.4 24/24] drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit bf2ad4fb8adca89374b54b225d494e0b1956dbea ] Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fe

[PATCH AUTOSEL 4.19 23/23] drm/amdgpu: Drop 'fence' check in 'to_amdgpu_amdkfd_fence()'

2024-01-22 Thread Sasha Levin
From: Srinivasan Shanmugam [ Upstream commit bf2ad4fb8adca89374b54b225d494e0b1956dbea ] Return value of container_of(...) can't be null, so null check is not required for 'fence'. Hence drop its NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_fence.c:93 to_amdgpu_amdkfd_fe

[PATCH AUTOSEL 4.19 21/23] drm/amd/display: make flip_timestamp_in_us a 64-bit variable

2024-01-22 Thread Sasha Levin
From: Josip Pavic [ Upstream commit 6fb12518ca58412dc51054e2a7400afb41328d85 ] [Why] This variable currently overflows after about 71 minutes. This doesn't cause any known functional issues but it does make debugging more difficult. [How] Make it a 64-bit variable. Reviewed-by: Aric Cyr Acked

[PATCH AUTOSEL 4.19 22/23] drm/amdgpu: Let KFD sync with VM fences

2024-01-22 Thread Sasha Levin
From: Felix Kuehling [ Upstream commit ec9ba4821fa52b5efdbc4cdf0a77497990655231 ] Change the rules for amdgpu_sync_resv to let KFD synchronize with VM fences on page table reservations. This fixes intermittent memory corruption after evictions when using amdgpu_vm_handle_moved to update page tab

Re: [PATCH] drm/amdkfd: Add cache line sizes to KFD topology

2024-01-22 Thread Alex Deucher
On Fri, Jan 19, 2024 at 9:46 PM Joseph Greathouse wrote: > > The KFD topology includes cache line size, but we have not been > filling that information out unless we are parsing a CRAT table. > Fill in this information for the devices where we have cache > information structs, and pipe this inform

Re: [PATCH v2 2/2] drm/amdgpu: Implement check_async_props for planes

2024-01-22 Thread Harry Wentland
On 2024-01-19 13:25, Ville Syrjälä wrote: On Fri, Jan 19, 2024 at 03:12:35PM -0300, André Almeida wrote: AMD GPUs can do async flips with changes on more properties than just the FB ID, so implement a custom check_async_props for AMD planes. Allow amdgpu to do async flips with IN_FENCE_ID an

[PATCH] drm/amdgpu/pptable: convert some variable sized arrays to [] style

2024-01-22 Thread Alex Deucher
Replace [1] with []. Silences UBSAN warnings. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/include/pptable.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/include/pptable.h b/dri

Re: [PATCH] drm/amdgpu/pptable: convert some variable sized arrays to [] style

2024-01-22 Thread Christian König
On 22.01.24 at 17:00, Alex Deucher wrote: Replace [1] with []. Silences UBSAN warnings. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926 Signed-off-by: Alex Deucher Acked-by: Christian König --- drivers/gpu/drm/amd/include/pptable.h | 2 +- 1 file changed, 1 insertio
