Re: [PATCH v5 1/6] drm/sched: Add internal job peek/pop API

2025-02-18 Thread Matthew Brost
On Tue, Feb 18, 2025 at 06:26:15PM +, Tvrtko Ursulin wrote: > > On 18/02/2025 12:26, Philipp Stanner wrote: > > Thx for the updated version. Overlooked it, I was out on Friday. See > > below > > > > On Fri, 2025-02-14 at 10:19 +, Tvrtko Ursulin wrote: > > > Idea is to add helpers for peek

[PATCH v2] drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd

2025-02-18 Thread David Yat Sin
When userspace applications call AMDKFD_IOC_UPDATE_QUEUE, preserve bitfields that do not need to be modified, as they contain flags to track queue states that are used by CP FW. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 3 ++- drivers/gpu/drm/amd/amdkfd/k
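The idea of preserving bitfields on update can be illustrated with a small read-modify-write sketch. This is not the patch itself: the mask, the bit layout, and the function name `update_control` are hypothetical and chosen only for illustration. The sketch rewrites the bits the driver owns and leaves every other bit of the old register value untouched.

```c
#include <stdint.h>

/* Hypothetical layout (assumption): the low 6 bits are recomputed by the
 * driver on every update; all other bits may carry state owned by firmware. */
#define TOY_PQ_CTRL_QUEUE_SIZE_MASK 0x0000003Fu

/* Return the new control value: driver-owned bits come from new_queue_size,
 * every other bit is carried over unchanged from old_value. */
static uint32_t update_control(uint32_t old_value, uint32_t new_queue_size)
{
    uint32_t preserved = old_value & ~TOY_PQ_CTRL_QUEUE_SIZE_MASK;
    uint32_t updated = new_queue_size & TOY_PQ_CTRL_QUEUE_SIZE_MASK;

    return preserved | updated;
}
```

Assigning a freshly built value would clear the bits outside the mask; masking first keeps them.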

[PATCH] drm/amdgpu: Remove redundant check of adev

2025-02-18 Thread Xiang Liu
There is no need to check adev for sure. Signed-off-by: Xiang Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c index c0da9096a7fa..d11593cd1922 10

RE: [PATCH] drm/amdgpu: Remove redundant check of adev

2025-02-18 Thread Wang, Yang(Kevin)
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Yang Wang Best Regards, Kevin -Original Message- From: amd-gfx On Behalf Of Xiang Liu Sent: Wednesday, February 19, 2025 12:37 To: amd-gfx@lists.freedesktop.org Cc: k...@ijzerbout.nl; Zhang, Hawking ; Zhou1, Tao ; L

RE: [PATCH] drm/amdgpu: Check aca enabled inside cper init/fini func

2025-02-18 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Tao Zhou > -Original Message- > From: Liu, Xiang(Dean) > Sent: Wednesday, February 19, 2025 12:29 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhang, Hawking ; Zhou1, Tao > ; Liu, Xiang(Dean) > Subject: [PATCH] drm/a

Re: [PATCH 2/2] drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush

2025-02-18 Thread Lazar, Lijo
On 2/19/2025 11:50 AM, jesse.zh...@amd.com wrote: > From: "jesse.zh...@amd.com" > > - Modify the VM invalidation engine allocation logic to handle SDMA page > rings. > SDMA page rings now share the VM invalidation engine with SDMA gfx rings > instead of > allocating a separate engine. Th

Re: [PATCH 2/2] drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush

2025-02-18 Thread Lazar, Lijo
On 2/19/2025 12:21 PM, Lazar, Lijo wrote: > > > On 2/19/2025 11:50 AM, jesse.zh...@amd.com wrote: >> From: "jesse.zh...@amd.com" >> >> - Modify the VM invalidation engine allocation logic to handle SDMA page >> rings. >> SDMA page rings now share the VM invalidation engine with SDMA gfx ri

Re: [PATCH] drm/amdgpu: Remove redundant logic in GC v9.4.3

2025-02-18 Thread Lazar, Lijo
On 2/17/2025 10:44 AM, Lijo Lazar wrote: > GFXOFF check is not needed for GC v9.4.3. Also, save/restore list is > available by default. > > Signed-off-by: Lijo Lazar > --- > drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 17 + > 1 file changed, 1 insertion(+), 16 deletions(-) > > dif

[PATCH] drm/amdgpu: Check aca enabled inside cper init/fini func

2025-02-18 Thread Xiang Liu
Move code about checking aca enabled to the cper init/fini function to make code clean. Signed-off-by: Xiang Liu --- drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 ++ 2 files changed, 8 insertions(+), 4 deletions(-) diff --git a/driver

[PATCH 2/2] drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush

2025-02-18 Thread jesse.zhang
From: "jesse.zh...@amd.com" - Modify the VM invalidation engine allocation logic to handle SDMA page rings. SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of allocating a separate engine. This change ensures efficient resource management and avoids the is

[PATCH 1/2] drm/amd/amdgpu: Increase max rings to enable SDMA page ring

2025-02-18 Thread jesse.zhang
From: "jesse.zh...@amd.com" Increase the maximum number of rings supported by the AMDGPU driver from 132 to 148. This change is necessary to enable support for the SDMA page ring. Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +- 1 file changed, 1 insertion(+), 1

RE: [PATCH] drm/amdgpu: Remove redundant logic in GC v9.4.3

2025-02-18 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Lazar, Lijo Sent: Monday, February 17, 2025 13:15 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Ma, Le Subject: [PATCH] drm/amdgp

[RFC PATCH] drm/amd/display: fix page fault on dpms off with MST

2025-02-18 Thread Melissa Wen
A page fault occurs when running the IGT test: amdgpu@amd_vrr_range@freesync-parsing-suspend with MST. Fix that by skipping handling a stream state when stream is NULL when setting DPMS off. [ +7.435304] [drm] DM_MST: stopping TM on aconnector: 951db0f4 [id: 101] [ +0.535828] BUG: unabl

RE: [PATCH 00/16] DC Patches February 14, 2025

2025-02-18 Thread Wheeler, Daniel
[Public] Hi all, This week this patchset was tested on 4 systems, two dGPU and two APU based, and tested across multiple display and connection types. APU * Single Display eDP -> 1080p 60hz, 2560x1600 120hz, 1920x1200 165hz * Single Display DP (SST DSC) -> 4k144hz, 4k240hz

[PATCH 1/8] drm/amdgpu: grab an additional reference on the gang fence

2025-02-18 Thread Christian König
We keep the gang submission fence around in adev, make sure that it stays alive. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dev
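Why a stored pointer needs its own reference can be shown with a toy reference count. This is a minimal sketch under stated assumptions: `toy_fence`, `toy_device`, and the helper names are hypothetical stand-ins, not the kernel's dma_fence API or the code in the patch.

```c
#include <stddef.h>

/* Toy refcounted object (assumption: single-threaded, no atomics). */
struct toy_fence {
    int refcount;
};

/* Take one reference; NULL is passed through unchanged. */
static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
    if (f)
        f->refcount++;
    return f;
}

/* Drop one reference; returns 1 when the last reference went away. */
static int toy_fence_put(struct toy_fence *f)
{
    if (!f)
        return 0;
    return --f->refcount == 0;
}

/* Hypothetical device that remembers the current gang fence. */
struct toy_device {
    struct toy_fence *gang_submit;
};

/* Store a fence in the device. The device holds its own reference, so the
 * fence stays alive even after the caller drops the one it came in with. */
static void toy_device_set_gang(struct toy_device *dev, struct toy_fence *f)
{
    struct toy_fence *old = dev->gang_submit;

    dev->gang_submit = toy_fence_get(f);
    toy_fence_put(old);
}
```

Without the extra `toy_fence_get()`, the stored pointer would dangle as soon as the submitter released its reference.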

[PATCH 6/8] drm/amdgpu: stop reserving VMIDs to enforce isolation

2025-02-18 Thread Christian König
That was quite troublesome for gang submit. Completely drop this approach and enforce the isolation separately. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 9 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 11

[PATCH 4/8] drm/amdgpu: rework how isolation is enforced v2

2025-02-18 Thread Christian König
Limiting the number of available VMIDs to enforce isolation causes some issues with gang submit and applying certain HW workarounds which require multiple VMIDs to work correctly. So instead start to track all submissions to the relevant engines in a per partition data structure and use the dma_fe

[PATCH 2/8] drm/amdgpu: use GFP_NOWAIT for memory allocations

2025-02-18 Thread Christian König
In the critical submission path memory allocations can't wait for reclaim since that can potentially wait for submissions to finish. Finally clean that up and mark most memory allocations in the critical path with GFP_NOWAIT. The only exception left is the dma_fence_array() used when no VMID is av

[PATCH 7/8] drm/amdgpu: add isolation trace point

2025-02-18 Thread Christian König
Note when we switch from one isolation owner to another. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 17 + 2 files changed, 18 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_devi

[PATCH 3/8] drm/amdgpu: overwrite signaled fence in amdgpu_sync

2025-02-18 Thread Christian König
This allows using amdgpu_sync even without peeking into the fences for a long time. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gp

[PATCH 8/8] drm/amdgpu: add cleaner shader trace point

2025-02-18 Thread Christian König
Note when the cleaner shader is executed. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 15 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 1 + 2 files changed, 16 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/

[PATCH 5/8] drm/amdgpu: rework how the cleaner shader is emitted v3

2025-02-18 Thread Christian König
Instead of emitting the cleaner shader for every job which has the enforce_isolation flag set only emit it for the first submission from every client. v2: add missing NULL check v3: fix another NULL pointer deref Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 27 +++

[PATCH] drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd

2025-02-18 Thread David Yat Sin
When userspace applications call AMDKFD_IOC_UPDATE_QUEUE, preserve bitfields that do not need to be modified, as they contain flags to track queue states that are used by CP FW. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 4 +++- drivers/gpu/drm/amd/amdkfd/

Re: [PATCH] drm/amdgpu: disable BAR resize on Dell G5 SE

2025-02-18 Thread Lazar, Lijo
On 2/18/2025 8:38 PM, Alex Deucher wrote: > There was a quirk added to add a workaround for a Sapphire > RX 5600 XT Pulse that didn't allow BAR resizing. However, > the quirk caused a regression on Dell laptops using those > chips, rather than narrowing the scope of the resizing > quirk, add a

Re: [PATCH v6 0/9] Add jump table support for objtool on LoongArch

2025-02-18 Thread Josh Poimboeuf
On Mon, Feb 17, 2025 at 11:13:43AM +0800, Huacai Chen wrote: > On Thu, Feb 13, 2025 at 10:51 AM Josh Poimboeuf wrote: > > > > On Wed, Feb 12, 2025 at 03:22:45PM +0800, Huacai Chen wrote: > > > > The new series now has 7 patches: > > > > > > > > Tiezhu Yang (7): > > > > objtool: Handle various sy

RE: [PATCH V7 3/9] drm/amdgpu: Add common lock and reset caller parameter for SDMA reset synchronization

2025-02-18 Thread Kim, Jonathan
[Public] > -Original Message- > From: Lazar, Lijo > Sent: Monday, February 17, 2025 10:36 PM > To: Zhang, Jesse(Jie) ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Kuehling, Felix > ; Kim, Jonathan ; Zhu, > Jiadong > Subject: Re: [PATCH V7 3/9] drm/amdgpu: Add common lock and

RE: [PATCH 1/4] drm/amdkfd: Rename grace_period to wait_times

2025-02-18 Thread Kim, Jonathan
[Public] > -Original Message- > From: amd-gfx On Behalf Of Harish > Kasiviswanathan > Sent: Wednesday, February 12, 2025 5:04 PM > To: amd-gfx@lists.freedesktop.org > Cc: Kasiviswanathan, Harish > Subject: [PATCH 1/4] drm/amdkfd: Rename grace_period to wait_times > > Rename .set_grace_pe

RE: [PATCH 3/4] drm/amdgpu: Don't modify grace_period in helper function

2025-02-18 Thread Kim, Jonathan
[Public] > -Original Message- > From: amd-gfx On Behalf Of Harish > Kasiviswanathan > Sent: Wednesday, February 12, 2025 5:04 PM > To: amd-gfx@lists.freedesktop.org > Cc: Kasiviswanathan, Harish > Subject: [PATCH 3/4] drm/amdgpu: Don't modify grace_period in helper function > > build_gra

RE: [PATCH 2/4] drm/amdkfd: Use asic specifc function to set CP wait time

2025-02-18 Thread Kim, Jonathan
[Public] > -Original Message- > From: amd-gfx On Behalf Of Harish > Kasiviswanathan > Sent: Wednesday, February 12, 2025 5:04 PM > To: amd-gfx@lists.freedesktop.org > Cc: Kasiviswanathan, Harish > Subject: [PATCH 2/4] drm/amdkfd: Use asic specifc function to set CP wait time > > Currentl

RE: [PATCH 4/4] drm/amdgpu: Set lower queue retry timeout for gfx9 family

2025-02-18 Thread Kim, Jonathan
[Public] > -Original Message- > From: amd-gfx On Behalf Of Harish > Kasiviswanathan > Sent: Wednesday, February 12, 2025 5:04 PM > To: amd-gfx@lists.freedesktop.org > Cc: Kasiviswanathan, Harish > Subject: [PATCH 4/4] drm/amdgpu: Set lower queue retry timeout for gfx9 family > > Set more

Re: [PATCH] drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'

2025-02-18 Thread Philip Yang
On 2025-02-18 11:01, Srinivasan Shanmugam wrote: This commit addresses a circular locking dependency in the svm_range_cpu_invalidate_pagetables function. The function previously held a lock while determining whether to perform an unmap or eviction operation, which could lead to deadlocks. To r

Re: [PATCH] drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd

2025-02-18 Thread Jay Cornwall
On 2/18/2025 11:24, David Yat Sin wrote: --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c @@ -167,7 +167,9 @@ static void update_mqd(struct mqd_manager *mm, void *mqd, m = get_mqd(mqd); - m->cp_hqd_pq_control = 5 << CP_HQD_PQ_

Re: [PATCH] drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd

2025-02-18 Thread Philip Yang
On 2025-02-18 12:24, David Yat Sin wrote: When userspace applications call AMDKFD_IOC_UPDATE_QUEUE, preserve bitfields that do not need to be modified, as they contain flags to track queue states that are used by CP FW. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_mqd_mana

[PATCH] drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables'

2025-02-18 Thread Srinivasan Shanmugam
This commit addresses a circular locking dependency in the svm_range_cpu_invalidate_pagetables function. The function previously held a lock while determining whether to perform an unmap or eviction operation, which could lead to deadlocks. To resolve this issue, a flag named `needs_unmap_or_evict
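The general pattern described here (decide under the lock, act after releasing it) can be sketched as follows. This is only an illustration under stated assumptions: the toy lock, the `toy_range` structure, and the local flag `needs_action` are hypothetical and are not the names or the code used in the patch.

```c
#include <stdbool.h>

/* Toy lock that only records whether it is held (assumption: one thread). */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

struct toy_range {
    struct toy_lock lock;
    bool mapped;              /* state protected by the lock */
    bool action_ran_unlocked; /* records how the action was invoked */
};

/* Stand-in for the heavy operation that may take other locks itself. */
static void toy_unmap_or_evict(struct toy_range *r)
{
    r->action_ran_unlocked = !r->lock.held;
    r->mapped = false;
}

/* Decide under the lock, remember the decision in a local flag, and only
 * perform the operation once the lock has been released. */
static void toy_invalidate(struct toy_range *r)
{
    bool needs_action;

    toy_lock_acquire(&r->lock);
    needs_action = r->mapped;
    toy_lock_release(&r->lock);

    if (needs_action)
        toy_unmap_or_evict(r);
}
```

Because the heavy operation never runs with the first lock held, it cannot create a lock-order cycle with locks it takes internally.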

Re: [PATCH] drm/amdkfd: Fix error handling for missing PASID in 'kfd_process_device_init_vm'

2025-02-18 Thread Felix Kuehling
On 2025-02-17 01:37, Srinivasan Shanmugam wrote: In the kfd_process_device_init_vm function, a valid error code is now returned when the associated Process Address Space ID (PASID) is not present. If the address space virtual memory (avm) does not have an associated PASID, the function sets t

Re: [PATCH] PCI: fix Sapphire PCI rebar quirk

2025-02-18 Thread Christian König
Am 18.02.25 um 10:58 schrieb Lazar, Lijo: > On 2/18/2025 1:33 PM, Christian König wrote: >> Am 17.02.25 um 17:04 schrieb Mario Limonciello: >>> On 2/17/2025 10:00, Alex Deucher wrote: On Mon, Feb 17, 2025 at 10:45 AM Alex Deucher wrote: > On Mon, Feb 17, 2025 at 10:38 AM Christian K

[PATCH] drm/amd/pm: Fetch current power limit from PMFW

2025-02-18 Thread Lijo Lazar
On SMU v13.0.12, always query the firmware to get the current power limit as it could be updated through other means also. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/driv

Re: [PATCH] amdgpu/pm/legacy: fix suspend/resume issues

2025-02-18 Thread chr[]
On 17.02.25 16:26, Alex Deucher wrote: From: "chr[]" resume and irq handler happily races in set_power_state() * amdgpu_legacy_dpm_compute_clocks() needs lock * protect irq work handler * fix dpm_enabled usage v2: fix clang build, integrate Lijo's comments (Alex) Closes: https://gitlab.freed

RE: [PATCH] drm/amd/pm: Fetch current power limit from PMFW

2025-02-18 Thread Kamal, Asad
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Asad Kamal Thanks & Regards Asad -Original Message- From: Lazar, Lijo Sent: Tuesday, February 18, 2025 5:47 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Kamal, Asad Subject: [PATCH

Re: [PATCH v5 1/6] drm/sched: Add internal job peek/pop API

2025-02-18 Thread Philipp Stanner
Thx for the updated version. Overlooked it, I was out on Friday. See below On Fri, 2025-02-14 at 10:19 +, Tvrtko Ursulin wrote: > Idea is to add helpers for peeking and popping jobs from entities > with > the goal of decoupling the hidden assumption in the code that > queue_node > is the first

Re: [PATCH] PCI: fix Sapphire PCI rebar quirk

2025-02-18 Thread Lazar, Lijo
On 2/18/2025 1:33 PM, Christian König wrote: > Am 17.02.25 um 17:04 schrieb Mario Limonciello: >> On 2/17/2025 10:00, Alex Deucher wrote: >>> On Mon, Feb 17, 2025 at 10:45 AM Alex Deucher wrote: On Mon, Feb 17, 2025 at 10:38 AM Christian König wrote: > > Am 17.02.25 um 1

[PATCH v3] drm/amdgpu: fix the memleak caused by fence not released

2025-02-18 Thread Arvind Yadav
Encountering a taint issue during the unloading of gpu_sched due to the fence not being released/put. In this context, amdgpu_vm_clear_freed is responsible for creating a job to update the page table (PT). It allocates kmem_cache for drm_sched_fence and returns the finished fence associated with jo

[PATCH] drm/amdgpu: disable BAR resize on Dell G5 SE

2025-02-18 Thread Alex Deucher
There was a quirk added to add a workaround for a Sapphire RX 5600 XT Pulse that didn't allow BAR resizing. However, the quirk caused a regression on Dell laptops using those chips, rather than narrowing the scope of the resizing quirk, add a quirk to prevent amdgpu from resizing the BAR on those

Re: [PATCH] PCI: fix Sapphire PCI rebar quirk

2025-02-18 Thread Christian König
Am 17.02.25 um 17:04 schrieb Mario Limonciello: > On 2/17/2025 10:00, Alex Deucher wrote: >> On Mon, Feb 17, 2025 at 10:45 AM Alex Deucher wrote: >>> >>> On Mon, Feb 17, 2025 at 10:38 AM Christian König >>> wrote: Am 17.02.25 um 16:10 schrieb Alex Deucher: > There was a quirk added

Re: [PATCH 1/6] drm/sched: Add internal job peek/pop API

2025-02-18 Thread Tvrtko Ursulin
On 18/02/2025 08:12, Philipp Stanner wrote: On Thu, 2025-02-13 at 14:05 -0800, Matthew Brost wrote: On Wed, Feb 12, 2025 at 01:36:58PM +0100, Philipp Stanner wrote: On Wed, 2025-02-12 at 12:30 +, Tvrtko Ursulin wrote: On 12/02/2025 10:40, Philipp Stanner wrote: On Wed, 2025-02-12 at 09

Re: [PATCH 1/6] drm/sched: Add internal job peek/pop API

2025-02-18 Thread Philipp Stanner
On Thu, 2025-02-13 at 14:05 -0800, Matthew Brost wrote: > On Wed, Feb 12, 2025 at 01:36:58PM +0100, Philipp Stanner wrote: > > On Wed, 2025-02-12 at 12:30 +, Tvrtko Ursulin wrote: > > > > > > On 12/02/2025 10:40, Philipp Stanner wrote: > > > > On Wed, 2025-02-12 at 09:32 +, Tvrtko Ursulin