Re: [PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board

2024-08-29 Thread WangYuli
On 2024/8/28 23:19, Mario Limonciello wrote:
> This is production hardware?
Unfortunately, this device was released quite a while back.
> Have you already checked whether a BIOS upgrade for the device could help this issue?
Sadly, there's no BIOS update to address this problem. It seems to

Re: [PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board

2024-08-29 Thread WangYuli
On 2024/8/28 23:30, Alex Deucher wrote: On Wed, Aug 28, 2024 at 7:28 AM WangYuli wrote: This will disable dpm on all devices that you might install on this platform. If this is specific to a particular platform and board combination, it might be better to check the platform in the dpm_init()

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-29 Thread Jani Nikula
On Wed, 28 Aug 2024, Jani Nikula wrote: > On Wed, 28 Aug 2024, Hamza Mahfooz wrote: >> On 8/12/24 08:23, Jani Nikula wrote: >>> Instead of just smashing jiffies into a GUID, use guid_gen() to generate >>> RFC 4122 compliant GUIDs. >>> >>> Signed-off-by: Jani Nikula >>> >>> --- >> >> Acked-by:

Re: [PATCH v2] resource: limit request_free_mem_region based on arch_get_mappable_range

2024-08-29 Thread Dan Williams
D Scott Phillips wrote: [..] > Hi Dan, sorry for my incredibly delayed response, I lost your message to > a filter on my end :( > > I'm happy to work toward your preferred approach here, though I'm not > sure I know how to achieve it. I think I understand how cxl is keeping > device_private_memory

[PATCH][next] drm/amd/display: Fix spelling mistake "recompte" -> "recompute"

2024-08-29 Thread Colin Ian King
There is a spelling mistake in a DRM_DEBUG_DRIVER message. Fix it. Signed-off-by: Colin Ian King --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/driv

[PATCH -next] drm/amd/display: Remove the redundant else if branch in the function amdgpu_dm_init()

2024-08-29 Thread Jiapeng Chong
The assignments in the else if and else branches are the same, so we remove the redundant branch and add comments here to make the code easier to understand. ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:1871:6-8: WARNING: possible condition with no effect (if == else). Reported-by: Abaci Robot Closes: https

[PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board

2024-08-29 Thread WangYuli
From: wenlunpeng The quirk is for reboot stability. A device reboot stress test has been observed to cause random system hangs when amdgpu_dpm is enabled. Disabling amdgpu_dpm fixes this. However, a boot parameter can still override the quirk and enable amdgpu_dpm. Serial log when the error occurs: ... Co

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-29 Thread Jani Nikula
On Wed, 28 Aug 2024, Hamza Mahfooz wrote: > On 8/12/24 08:23, Jani Nikula wrote: >> Instead of just smashing jiffies into a GUID, use guid_gen() to generate >> RFC 4122 compliant GUIDs. >> >> Signed-off-by: Jani Nikula >> >> --- > > Acked-by: Hamza Mahfooz > > I would prefer to take this serie

[PATCH v4.19-v6.1] drm/amdgpu: Using uninitialized value *size when calling

2024-08-29 Thread Vamsi Krishna Brahmajosyula
From: Jesse Zhang [ Upstream commit 88a9a467c548d0b3c7761b4fd54a68e70f9c0944 ] Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x0301. V2: To really improve the handling we would actually need to have a separate value of 0x.(Christian) Signed-off-by: Jesse Zh

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-29 Thread Jani Nikula
On Wed, 28 Aug 2024, Daniel Vetter wrote: > On Mon, Aug 12, 2024 at 03:23:12PM +0300, Jani Nikula wrote: >> Instead of just smashing jiffies into a GUID, use guid_gen() to generate >> RFC 4122 compliant GUIDs. >> >> Signed-off-by: Jani Nikula >> >> --- >> >> Side note, it baffles me why amdgpu

Re: [PATCH v2] resource: limit request_free_mem_region based on arch_get_mappable_range

2024-08-29 Thread D Scott Phillips
Dan Williams writes: > D Scott Phillips wrote: >> On arm64 prior to commit 32697ff38287 ("arm64: vmemmap: Avoid base2 order >> of struct page size to dimension region"), the amdgpu driver could trip >> over the warning of: >> >> `WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));` >> >> i

Re: [PATCH v5] drm/amdgpu/gfx9.4.3: Implement compute pipe reset

2024-08-29 Thread Lazar, Lijo
On 8/29/2024 9:17 AM, Prike Liang wrote: > Implement the compute pipe reset, and the driver will > fall back to pipe reset when queue reset fails. > The pipe reset only deactivates the queue which is > scheduled in the pipe, and meanwhile the MEC engine > will be reset to the firmware _start poin

RE: [PATCH v5] drm/amdgpu/gfx9.4.3: Implement compute pipe reset

2024-08-29 Thread Liang, Prike
[AMD Official Use Only - AMD Internal Distribution Only] Thanks for the detailed review; I will update those before pushing the commit. Thanks, Prike > -----Original Message----- > From: Lazar, Lijo > Sent: Thursday, August 29, 2024 4:13 PM > To: Liang, Prike; amd-gfx@lists.freedesktop.org >

Re: [PATCH 2/3] drm/amdgpu: sync to KFD fences before clearing PTEs

2024-08-29 Thread Christian König
On 29.08.24 at 00:40, Felix Kuehling wrote: On 2024-08-22 05:07, Christian König wrote: On 21.08.24 at 22:01, Felix Kuehling wrote: On 2024-08-21 08:03, Christian König wrote: This patch tries to solve the basic problem that we also need to sync to the KFD fences of the BO because otherwise it c

[PATCH v2] drm/amd/display: Add missing kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_program_isharp

2024-08-29 Thread Srinivasan Shanmugam
This commit addresses a missing kdoc for the 'bs_coeffs_updated' parameter in the 'dpp401_dscl_program_isharp' function. The 'bs_coeffs_updated' is a flag indicating whether the Blur and Scale Coefficients have been updated. The 'dpp401_dscl_program_isharp' function is responsible for programming

Re: [PATCH] drm/amdkfd: restore_process_worker race with GPU reset

2024-08-29 Thread Philip Yang
On 2024-08-28 18:01, Felix Kuehling wrote: On 2024-08-23 15:49, Philip Yang wrote: If a GPU reset kicks in while the KFD restore_process_worker is running, this may cause different issues, for example below rcu stall warnin

[PATCH] drm/amdgpu: always allocate cleared VRAM for GEM allocations

2024-08-29 Thread Alex Deucher
This adds allocation latency, but aligns better with user expectations. The latency should improve with the drm buddy clearing patches that Arun has been working on. In addition this fixes the high CPU spikes seen when doing wipe on release. v2: always set AMDGPU_GEM_CREATE_VRAM_CLEARED (Christi

Re: [PATCH] drm/amdgpu: always allocate cleared VRAM for GEM allocations

2024-08-29 Thread Paneer Selvam, Arunpravin
This will fix performance issues. Acked-by: Arunpravin Paneer Selvam On 8/29/2024 10:56 PM, Alex Deucher wrote: This adds allocation latency, but aligns better with user expectations. The latency should improve with the drm buddy clearing patches that

Re: [PATCH] drm/amdkfd: restore_process_worker race with GPU reset

2024-08-29 Thread Felix Kuehling
On 2024-08-23 15:49, Philip Yang wrote: If a GPU reset kicks in while the KFD restore_process_worker is running, this may cause different issues, for example the RCU stall warning below, because the restore work may move BOs and evict queues under VRAM pressure. Fix this race by taking adev reset_domain read sem

Re: [PATCH -next] drm/amd/display: Remove the redundant else if branch in the function amdgpu_dm_init()

2024-08-29 Thread Alex Deucher
On Wed, Aug 28, 2024 at 10:37 PM Jiapeng Chong wrote: > > The assignments in the else if and else branches are the same, so we > remove the redundant branch and add comments here to make the code easier to understand. I think the code is clearer as is. If you force IPS on, you want to make sure it's enabled, regard

RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-29 Thread Errabolu, Ramesh
Responses inline. Will post new patch after testing Regards, Ramesh From: Kasiviswanathan, Harish Sent: Wednesday, August 28, 2024 1:37 PM To: Yang, Philip ; Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH v2] drm/am

RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-29 Thread Errabolu, Ramesh
Responses inline; will post an updated patch shortly. See my response regarding the sysfs permission. Regards, Ramesh -----Original Message----- From: Kuehling, Felix Sent: Wednesday, August 28, 2024 3:59 PM To: amd-gfx@lists.freedesktop.org; Errabolu, Ramesh Subject: Re: [PATCH v2] drm/amdgpu: Su

[PATCH V3] drm/amdgpu: Surface svm_default_granularity, a RW module parameter

2024-08-29 Thread Ramesh Errabolu
Enables users to update SVM's default granularity, used in buffer migration and handling of recoverable page faults. Param value is set in terms of log2(numPages(buffer)), e.g. 9 for a 2 MiB buffer. Signed-off-by: Ramesh Errabolu --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/
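
For readers checking the "9 for a 2 MiB buffer" example above, here is a minimal sketch of the arithmetic. It assumes a 4 KiB page size and reads the parameter as a base-2 logarithm of the page count. The helper name granularity_to_bytes is hypothetical and does not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: convert a granularity
 * given as log2(number of pages) into a buffer size in bytes.
 * page_size is an assumption supplied by the caller (4096 on many systems).
 */
static uint64_t granularity_to_bytes(unsigned int log2_pages, uint64_t page_size)
{
    assert(log2_pages < 32);                 /* keep the shift well defined */
    return ((uint64_t)1 << log2_pages) * page_size;
}
```

With 4 KiB pages, granularity_to_bytes(9, 4096) gives 512 pages * 4096 bytes = 2097152 bytes, i.e. 2 MiB. On a system with a different page size the same parameter value would map to a different buffer size.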

Re: [PATCH] drm/amdkfd: restore_process_worker race with GPU reset

2024-08-29 Thread Philip Yang
On 2024-08-29 17:15, Felix Kuehling wrote: On 2024-08-23 15:49, Philip Yang wrote: If a GPU reset kicks in while the KFD restore_process_worker is running, this may cause different issues, for example the RCU stall warning below,

Re: [PATCH V3] drm/amdgpu: Surface svm_default_granularity, a RW module parameter

2024-08-29 Thread Chen, Xiaogang
On 8/29/2024 5:13 PM, Ramesh Errabolu wrote: Enables users to update SVM's default granularity, used in buffer migration and handling of recoverable page fau