[PATCH v5] drm/amdgpu/gfx9.4.3: Implement compute pipe reset

2024-08-28 Thread Prike Liang
Implement the compute pipe reset, and the driver will fall back to pipe reset when queue reset fails. The pipe reset only deactivates the queue that is scheduled in the pipe, and meanwhile the MEC engine will be reset to the firmware _start pointer. So, it seems pipe reset will cost more cycles tha

Re: [PATCH] drm/amd/display: Add missing kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_program_isharp

2024-08-28 Thread Chung, ChiaHsuan (Tom)
On 8/28/2024 7:25 PM, Srinivasan Shanmugam wrote: Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961: warning: Function parameter or struct member 'bs_coeffs_updated' not described in 'dpp401_dscl_program_isharp' Cc: Tom Chung Cc: Rodrigo S

RE: [PATCH] drm/amdgpu: revert "use CPU for page table update if SDMA is unavailable"

2024-08-28 Thread Zhang, Yifan
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Yifan Zhang Best Regards, Yifan -Original Message- From: Christian König Sent: Tuesday, August 27, 2024 10:16 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Yifan Subject: [PATCH] drm/amdgpu: revert "use CPU for p

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Felix Kuehling
On 2024-08-28 17:38, Chen, Xiaogang wrote: On 8/28/2024 4:05 PM, Felix Kuehling wrote: On 2024-08-28 16:34, Chen, Xiaogang wrote: On 8/28/2024 3:26 PM, Errabolu, Ramesh wrote: Responses inline Regards, Ramesh *From:*Chen, Xiaogang *Sent:* Wednesday, August 28, 2024 3:01 PM *To:* Er

Re: [PATCH 2/3] drm/amdgpu: sync to KFD fences before clearing PTEs

2024-08-28 Thread Felix Kuehling
On 2024-08-22 05:07, Christian König wrote: Am 21.08.24 um 22:01 schrieb Felix Kuehling: On 2024-08-21 08:03, Christian König wrote: This patch tries to solve the basic problem that we also need to sync to the KFD fences of the BO, because otherwise we may clear PTEs while the KFD queues

Re: [PATCH 1/3] drm/amdgpu: re-work VM syncing

2024-08-28 Thread Felix Kuehling
On 2024-08-22 03:28, Friedrich Vock wrote: On 21.08.24 22:46, Felix Kuehling wrote: On 2024-08-21 08:03, Christian König wrote: Rework how VM operations synchronize to submissions. Provide an amdgpu_sync container to the backends instead of a reservation object and fill in the amdgpu_sync o

Re: [PATCH] drm/amdkfd: restore_process_worker race with GPU reset

2024-08-28 Thread Felix Kuehling
On 2024-08-23 15:49, Philip Yang wrote: If a GPU reset kicks in while the KFD restore_process_worker is running, this may cause different issues, for example the RCU stall warning below, because restore work may move BOs and evict queues under VRAM pressure. Fix this race by taking adev reset_domain read s

RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Errabolu, Ramesh
From: Chen, Xiaogang Sent: Wednesday, August 28, 2024 4:38 PM To: Kuehling, Felix ; Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter On 8/28/2024 4:05 PM, Felix Kuehling wrote: On 2024-08-28 16:34, Chen, Xiaogan

Re: [PATCH] drm/amdgpu: revert "use CPU for page table update if SDMA is unavailable"

2024-08-28 Thread Felix Kuehling
On 2024-08-27 10:16, Christian König wrote: That is clearly not something we should do upstream. The SDMA is mandatory for the driver to work correctly. We could do this for emulation and bringup, but in those cases the engineer should probably enable CPU based updates manually. This reverts

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Chen, Xiaogang
On 8/28/2024 4:05 PM, Felix Kuehling wrote: On 2024-08-28 16:34, Chen, Xiaogang wrote: On 8/28/2024 3:26 PM, Errabolu, Ramesh wrote: Responses inline Regards, Ramesh *From:*Chen, Xiaogang *Sent:* Wednesday, August 28, 2024 3:01 PM *To:* Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org

Re: [PATCH] drm/amdkfd: fix missed queue reset on queue destroy

2024-08-28 Thread Felix Kuehling
On 2024-08-22 11:17, Jonathan Kim wrote: If a queue is being destroyed but causes a HWS hang on removal, the KFD may issue an unnecessary gpu reset if the destroyed queue can be fixed by a queue reset. This is because the queue has been removed from the KFD's queue list prior to the preemption

Re: [PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board

2024-08-28 Thread Mario Limonciello
On 8/28/2024 11:14, Alex Deucher wrote: On Wed, Aug 28, 2024 at 11:47 AM WangYuli wrote: On 2024/8/28 23:30, Alex Deucher wrote: On Wed, Aug 28, 2024 at 7:28 AM WangYuli wrote: This will disable dpm on all devices that you might install on this platform. If this is specific to a particula

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Felix Kuehling
On 2024-08-28 16:34, Chen, Xiaogang wrote: On 8/28/2024 3:26 PM, Errabolu, Ramesh wrote: Responses inline Regards, Ramesh *From:*Chen, Xiaogang *Sent:* Wednesday, August 28, 2024 3:01 PM *To:* Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org *Subject:* Re: [PATCH v2] drm/amdgpu: Surfac

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Felix Kuehling
On 2024-08-26 15:34, Ramesh Errabolu wrote: Enables users to update the default size of buffer used in migration either from Sysmem to VRAM or vice versa. The param GOBM refers to granularity of buffer migration, and is specified in terms of log(numPages(buffer)). It facilitates users of unregi
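As a rough illustration of the "log(numPages(buffer))" encoding described in the patch summary above, here is a minimal sketch of how such a granularity value could map to a buffer size. It assumes a base-2 logarithm and 4 KiB pages, neither of which is stated in the excerpt; the helper names and the SKETCH_PAGE_SHIFT constant are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages (shift of 12); the excerpt does not state the page size. */
#define SKETCH_PAGE_SHIFT 12u

/* Hypothetical helper: pages covered by granularity g, assuming g = log2(pages). */
static inline uint64_t sketch_gobm_pages(unsigned int g)
{
    return (uint64_t)1 << g;
}

/* Hypothetical helper: the same buffer expressed in bytes. */
static inline uint64_t sketch_gobm_bytes(unsigned int g)
{
    return sketch_gobm_pages(g) << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions a value of 9 would correspond to 512 pages, or 2 MiB, and each increment doubles the migration buffer.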

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Chen, Xiaogang
On 8/28/2024 3:26 PM, Errabolu, Ramesh wrote: Responses inline Regards, Ramesh *From:*Chen, Xiaogang *Sent:* Wednesday, August 28, 2024 3:01 PM *To:* Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org *Subject:* Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter On 8/

RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Errabolu, Ramesh
Responses inline Regards, Ramesh From: Chen, Xiaogang Sent: Wednesday, August 28, 2024 3:01 PM To: Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter On 8/28/2024 2:52 PM, Errabolu, Ramesh wrote: Response inline

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Chen, Xiaogang
On 8/28/2024 2:52 PM, Errabolu, Ramesh wrote: Response inline Regards, Ramesh -Original Message- From: Chen, Xiaogang Sent: Wednesday, August 28, 2024 2:43 PM To: Errabolu, Ramesh;amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module

RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Errabolu, Ramesh
Response inline Regards, Ramesh   -Original Message- From: Chen, Xiaogang Sent: Wednesday, August 28, 2024 2:43 PM To: Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter Why need this driver parameter? kfd h

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Chen, Xiaogang
Why is this driver parameter needed? KFD has the KFD_IOCTL_SVM_ATTR_GRANULARITY API, which allows user space to set migration granularity per prange. If both are set, which will take precedence? Regards Xiaogang On 8/26/2024 2:34 PM, Ramesh Errabolu wrote: Caution: This message originated from an Ext

[pull] amdgpu drm-fixes-6.11

2024-08-28 Thread Alex Deucher
Hi Dave, Sima, Fixes for 6.11. The following changes since commit 5be63fc19fcaa4c236b307420483578a56986a37: Linux 6.11-rc5 (2024-08-25 19:07:11 +1200) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.11-2024-08-28 for you to fetc

RE: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Kasiviswanathan, Harish
Some comments inline. From: amd-gfx On Behalf Of Philip Yang Sent: Wednesday, August 28, 2024 11:49 AM To: Errabolu, Ramesh ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

Re: [PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board

2024-08-28 Thread Alex Deucher
On Wed, Aug 28, 2024 at 11:47 AM WangYuli wrote: > > > On 2024/8/28 23:30, Alex Deucher wrote: > > On Wed, Aug 28, 2024 at 7:28 AM WangYuli wrote: > > > > This will disable dpm on all devices that you might install on this > > platform. If this is specific to a particular platform and board > >

Re: [PATCH v2] drm/amdgpu: Surface svm_attr_gobm, a RW module parameter

2024-08-28 Thread Philip Yang
On 2024-08-26 15:34, Ramesh Errabolu wrote: Enables users to update the default size of buffer used in migration either from Sysmem to VRAM or vice versa. The param GOBM refers to granularity of buffer migration, and is specified in terms of log(numPages(buff

Re: [PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board

2024-08-28 Thread Alex Deucher
On Wed, Aug 28, 2024 at 7:28 AM WangYuli wrote: > > From: wenlunpeng > > The quirk is for reboot-stability. > > A device reboot stress test has been observed to cause > random system hangs when amdgpu_dpm is enabled. > > Disabling amdgpu_dpm can fix this. > > However, a boot-param can still overw

Re: [PATCH v1] drm/ci: increase timeout for all jobs

2024-08-28 Thread Helen Mae Koike Fornazier
On Tue, 27 Aug 2024 19:04:42 -0300 Rob Clark wrote --- > On Tue, Aug 20, 2024 at 12:09 AM Vignesh Raman > vignesh.ra...@collabora.com> wrote: > > > > Set the timeout of all drm-ci jobs to 1h30m since > > some jobs take more than 1 hour to complete. > > > > Signed-off-by: V

Re: [PATCH] amdgpu: disable amdgpu_dpm on THTF-SW831-1W-DS25_MB board

2024-08-28 Thread Mario Limonciello
On 8/28/2024 05:59, WangYuli wrote: From: wenlunpeng The quirk is for reboot-stability. A device reboot stress test has been observed to cause random system hangs when amdgpu_dpm is enabled. Disabling amdgpu_dpm can fix this. However, a boot-param can still overwrite it to enable amdgpu_dpm.

Re: [PATCH v2] drm/amdgpu/: Add missing kdoc entry in amdgpu_vm_handle_fault function

2024-08-28 Thread Chen, Xiaogang
Reviewed-by: Xiaogang Chen On 8/27/2024 10:09 PM, Srinivasan Shanmugam wrote: This commit adds a description for the 'ts' parameter in the amdgpu_vm_handle_

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-28 Thread Hamza Mahfooz
On 8/28/24 10:06, Jani Nikula wrote: On Wed, 28 Aug 2024, Hamza Mahfooz wrote: On 8/12/24 08:23, Jani Nikula wrote: Instead of just smashing jiffies into a GUID, use guid_gen() to generate RFC 4122 compliant GUIDs. Signed-off-by: Jani Nikula --- Acked-by: Hamza Mahfooz I would prefer to

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-28 Thread Harry Wentland
On 2024-08-28 09:58, Alex Deucher wrote: > On Wed, Aug 28, 2024 at 9:53 AM Jani Nikula wrote: >> >> On Wed, 28 Aug 2024, Daniel Vetter wrote: >>> On Mon, Aug 12, 2024 at 03:23:12PM +0300, Jani Nikula wrote: Instead of just smashing jiffies into a GUID, use guid_gen() to generate RFC

Re: [PATCH] drm/amd/display: Determine IPS mode by ASIC and PMFW versions

2024-08-28 Thread Leo Li
On 2024-08-27 16:38, Harry Wentland wrote: On 2024-08-27 15:53, sunpeng...@amd.com wrote: From: Leo Li [Why] DCN IPS interoperates with other system idle power features, such as Zstates. On DCN35, there is a known issue where system Z8 + DCN IPS2 causes a hard hang. We observe this on s

Re: [PATCH v4] drm/amdgpu/gfx9.4.3: Implement compute pipe reset

2024-08-28 Thread Alex Deucher
On Wed, Aug 28, 2024 at 3:17 AM Liang, Prike wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > > From: Lazar, Lijo > > Sent: Wednesday, August 28, 2024 2:45 PM > > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > > Cc: Deucher, Alexander ; Ma, Le > > > > Subject: Re: [PAT

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-28 Thread Alex Deucher
On Wed, Aug 28, 2024 at 9:53 AM Jani Nikula wrote: > > On Wed, 28 Aug 2024, Daniel Vetter wrote: > > On Mon, Aug 12, 2024 at 03:23:12PM +0300, Jani Nikula wrote: > >> Instead of just smashing jiffies into a GUID, use guid_gen() to generate > >> RFC 4122 compliant GUIDs. > >> > >> Signed-off-by: J

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-28 Thread Hamza Mahfooz
On 8/12/24 08:23, Jani Nikula wrote: Instead of just smashing jiffies into a GUID, use guid_gen() to generate RFC 4122 compliant GUIDs. Signed-off-by: Jani Nikula --- Acked-by: Hamza Mahfooz I would prefer to take this series through the amdgpu tree though, assuming nobody minds. Side n

Re: [PATCH v2] drm/amdgpu: Normalize reg offsets on JPEG v4.0.3

2024-08-28 Thread Sundararaju, Sathishkumar
On 8/28/2024 11:41 AM, Lijo Lazar wrote: On VFs and SOCs with GC 9.4.4, VCN RRMT is disabled. Only local register offsets should be used on JPEG v4.0.3 as they cannot handle remote access to other AIDs. Since only local offsets are used, the special write to MCM_ADDR register is no longer neede

Re: [PATCH v5 03/44] drm/vkms: Add kunit tests for VKMS LUT handling

2024-08-28 Thread Harry Wentland
On 2024-08-27 13:49, Louis Chauvet wrote: > Le 19/08/24 - 16:56, Harry Wentland a écrit : > > [...] > >> diff --git a/drivers/gpu/drm/vkms/vkms_composer.c >> b/drivers/gpu/drm/vkms/vkms_composer.c >> index 3d6785d081f2..3ecda70c2b55 100644 >> --- a/drivers/gpu/drm/vkms/vkms_composer.c >> +++

Re: [RESEND 3/3] drm/amd/display: switch to guid_gen() to generate valid GUIDs

2024-08-28 Thread Daniel Vetter
On Mon, Aug 12, 2024 at 03:23:12PM +0300, Jani Nikula wrote: > Instead of just smashing jiffies into a GUID, use guid_gen() to generate > RFC 4122 compliant GUIDs. > > Signed-off-by: Jani Nikula > > --- > > Side note, it baffles me why amdgpu has a copy of this instead of > plumbing it into drm

Re: [RESEND 2/3] drm/mst: switch to guid_gen() to generate valid GUIDs

2024-08-28 Thread Daniel Vetter
On Mon, Aug 12, 2024 at 03:23:11PM +0300, Jani Nikula wrote: > Instead of just smashing jiffies into a GUID, use guid_gen() to generate > RFC 4122 compliant GUIDs. > > Signed-off-by: Jani Nikula Read a bit of the RFC, definitely sounds better than stuffing jiffies into the guid ... Reviewed-by: Da
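For readers unfamiliar with what "RFC 4122 compliant" means for a randomly generated GUID, the following is a minimal userspace sketch, not the kernel's guid_gen() itself. It assumes the caller has already filled the 16 bytes from a random source, and it assumes the little-endian GUID layout in which the version nibble lives in b[7]; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for a 16-byte GUID container. */
struct sketch_guid {
    uint8_t b[16];
};

/*
 * Stamp the RFC 4122 version (4 = random) and variant (binary 10x) bits
 * onto bytes that were previously filled with random data.
 * Assumption: GUID byte order, so the version nibble is the high half of
 * b[7]; a big-endian UUID layout would use b[6] instead.
 */
static inline void sketch_guid_stamp_v4(struct sketch_guid *g)
{
    g->b[7] = (uint8_t)((g->b[7] & 0x0F) | 0x40); /* version 4 */
    g->b[8] = (uint8_t)((g->b[8] & 0x3F) | 0x80); /* RFC 4122 variant */
}
```

The point of the series, as summarized above, is that a value built from jiffies carries neither these marker bits nor meaningful randomness, whereas a stamped random value is a valid version 4 GUID.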

Re: [RESEND 1/3] drm/mst: switch to guid_t type for GUID

2024-08-28 Thread Daniel Vetter
On Mon, Aug 12, 2024 at 03:23:10PM +0300, Jani Nikula wrote: > The kernel has a guid_t type for GUIDs. Switch to using it, but avoid > any functional changes here. > > Signed-off-by: Jani Nikula I didn't cross-check everything, I'll trust the compiler on this. But functionally lgtm Reviewed-by:

[PATCH] drm/amd/display: Add missing kdoc entry for 'bs_coeffs_updated' in dpp401_dscl_program_isharp

2024-08-28 Thread Srinivasan Shanmugam
Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961: warning: Function parameter or struct member 'bs_coeffs_updated' not described in 'dpp401_dscl_program_isharp' Cc: Tom Chung Cc: Rodrigo Siqueira Cc: Roman Li Cc: Alex Hung Cc: Aurabindo

RE: [PATCH] drm/amdgpu: Move the dumping log out of for loop

2024-08-28 Thread Huang, Trigger
Acked-by: Trigger Huang > -Original Message- > From: Sunil Khatri > Sent: Wednesday, August 28, 2024 4:09 PM > To: Deucher, Alexander ; Huang, Trigger > > Cc: amd-gfx@lists.freedesktop.org; Khatri, Sunil > Subject: [PATCH] drm/a

[PATCH] drm/amdgpu: Move the dumping log out of for loop

2024-08-28 Thread Sunil Khatri
The log message "Dumping IP State Completed" needs to be logged only once, when state dumping is complete. Hence, move it out of the for loop. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu
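The shape of the change described in this patch summary can be sketched in plain userspace C. This is only an illustration of emitting a completion message once after a loop instead of once per iteration; the function names and the counter are hypothetical and do not come from amdgpu_job.c.

```c
#include <stdio.h>

/* Hypothetical counter so the number of completion messages can be checked. */
static int sketch_completed_logs;

/* Hypothetical stand-in for dumping the state of one IP block. */
static inline void sketch_dump_one(int ip_block)
{
    printf("dumping IP block %d\n", ip_block);
}

static inline void sketch_dump_all(int num_ip_blocks)
{
    for (int i = 0; i < num_ip_blocks; i++)
        sketch_dump_one(i);

    /* Emitted once, after every block has been dumped, not inside the loop. */
    sketch_completed_logs++;
    printf("Dumping IP State Completed\n");
}
```

With the message inside the loop, the counter would instead equal the number of IP blocks.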

Re: [PATCH AUTOSEL 6.1 05/61] drm/amd/pm: Fix negative array index read

2024-08-28 Thread Pavel Machek
Hi! > From: Jesse Zhang > > [ Upstream commit c8c19ebf7c0b202a6a2d37a52ca112432723db5f ] > > Avoid using negative values > for clk_index as an index into the array pptable->DpmDescriptor. > > V2: fix clk_index return check (Tim Huang) > dpm_desc = &pptable->DpmDescriptor[clk_index]; >
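The class of bug discussed here (a lookup helper returning a negative value that is then used as an array index) can be guarded against with a range check before the access. The sketch below is a generic userspace illustration under that assumption; sketch_lookup and its parameters are hypothetical and are not the driver's code.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy table[index] into *out only when index is
 * within [0, count); otherwise return -EINVAL and leave *out untouched.
 */
static inline int sketch_lookup(const int *table, size_t count, int index, int *out)
{
    if (index < 0 || (size_t)index >= count)
        return -EINVAL;

    *out = table[index];
    return 0;
}
```

Checking the upper bound as well as the sign matters: a check that only rejects negative values still allows a read past the end of the array.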

Re: [PATCH AUTOSEL 6.10 034/121] drm/amdgpu: Fix out-of-bounds read of df_v1_7_channel_number

2024-08-28 Thread Pavel Machek
Hi! > [ Upstream commit d768394fa99467bcf2703bde74ddc96eeb0b71fa ] > > Check the fb_channel_number range to avoid the array out-of-bounds > read error We can still have array out-of-bounds, right? As soon as that function returns 0x8000 . drivers/gpu/drm/amd/amdgpu/amdgpu_df.h: u32 (*get_fb

RE: [PATCH v4] drm/amdgpu/gfx9.4.3: Implement compute pipe reset

2024-08-28 Thread Liang, Prike
> From: Lazar, Lijo > Sent: Wednesday, August 28, 2024 2:45 PM > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Ma, Le > > Subject: Re: [PATCH v4] drm/amdgpu/gfx9.4.3: Implement compute pipe reset > > > > On 8/