[PATCH] drm/amdkfd: Identical code for different branches

2025-05-23 Thread Sunday Clement
This patch removes the if/else statement in the cik_event_interrupt_wq function because it is redundant: both branches result in identical outcomes. This improves code readability. BUG:SWDEV-534537 Signed-off-by: Sunday Clement --- drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c | 6 +--
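
For readers unfamiliar with this kind of cleanup, the following is a minimal sketch of the general pattern the description refers to. It is not the code from cik_event_interrupt.c; `handle_event`, `process_before`, `process_after`, and the `is_special` flag are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter so the effect of each call can be observed. */
static int handled_events;

/* Hypothetical stand-in for the work both branches perform. */
static void handle_event(unsigned int id)
{
	(void)id;
	handled_events++;
}

/* Before: both branches are identical, so the condition decides nothing. */
static void process_before(bool is_special, unsigned int id)
{
	if (is_special)
		handle_event(id);
	else
		handle_event(id);
}

/* After: the branch is dropped and the behaviour is unchanged. */
static void process_after(bool is_special, unsigned int id)
{
	(void)is_special; /* kept only so both versions share a signature */
	handle_event(id);
}
```

Both versions perform the same call for every input, which is why removing the branch is a readability change rather than a functional one.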

Re: 6.15-rc6/regression/bisected - after commit f1c6be3999d2 error appeared: *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error

2025-05-23 Thread Mikhail Gavrilov
On Wed, May 21, 2025 at 10:13 PM Pillai, Aurabindo wrote: > > [AMD Official Use Only - AMD Internal Distribution Only] > > > Hi Mike, > > Thanks for the details. We tried to repro the issue at our end on 9000 and > 7000 series dgpu, but we're not seeing the dmub errors. We were on Ubuntu, so > w

Re: [PATCH v4] PCI: Prevent power state transition of erroneous device

2025-05-23 Thread Rafael J. Wysocki
On Wed, May 21, 2025 at 1:27 PM Rafael J. Wysocki wrote: > > On Wed, May 21, 2025 at 10:54 AM Raag Jadav wrote: > > > > On Tue, May 20, 2025 at 01:56:28PM -0500, Mario Limonciello wrote: > > > On 5/20/2025 1:42 PM, Raag Jadav wrote: > > > > On Tue, May 20, 2025 at 12:39:12PM -0500, Mario Limoncie

Re: [PATCH V2 00/10] Reset improvements for GC10+

2025-05-23 Thread Alex Deucher
On Fri, May 23, 2025 at 10:12 AM Alex Deucher wrote: > > On Fri, May 23, 2025 at 10:03 AM Christian König > wrote: > > > > On 5/23/25 15:58, Alex Deucher wrote: > > > I think that's probably the best option. I was thinking we could > > > mirror the ring frames for each gang and after a reset, we

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-23 Thread Danilo Krummrich
On Fri, May 23, 2025 at 04:11:39PM +0200, Danilo Krummrich wrote: > On Fri, May 23, 2025 at 02:56:40PM +0200, Christian König wrote: > > It turned out that we can actually massively optimize here. > > > > The previous code was horribly inefficient since it constantly released > > and re-acquired t

Re: [PATCH V2 00/10] Reset improvements for GC10+

2025-05-23 Thread Alex Deucher
On Fri, May 23, 2025 at 10:03 AM Christian König wrote: > > On 5/23/25 15:58, Alex Deucher wrote: > > I think that's probably the best option. I was thinking we could > > mirror the ring frames for each gang and after a reset, we submit the > > unprocessed frames again. That way we can still do

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-23 Thread Danilo Krummrich
On Fri, May 23, 2025 at 02:56:40PM +0200, Christian König wrote: > It turned out that we can actually massively optimize here. > > The previous code was horribly inefficient since it constantly released > and re-acquired the lock of the xarray and started each iteration from the > base of the arra

Re: [PATCH V2 00/10] Reset improvements for GC10+

2025-05-23 Thread Alex Deucher
On Fri, May 23, 2025 at 9:27 AM Christian König wrote: > > On 5/23/25 05:04, Alex Deucher wrote: > > On Thu, May 22, 2025 at 5:57 PM Alex Deucher > > wrote: > >> > >> This set improves per queue reset support for GC10+. > >> This uses vmid resets for GFX. GFX resets all state > >> associated wi

Re: [PATCH V2 00/10] Reset improvements for GC10+

2025-05-23 Thread Christian König
On 5/23/25 15:58, Alex Deucher wrote: > I think that's probably the best option. I was thinking we could > mirror the ring frames for each gang and after a reset, we submit the > unprocessed frames again. That way we can still do a ring test to > make sure the ring is functional after the reset a

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-23 Thread Tvrtko Ursulin
On 23/05/2025 13:56, Christian König wrote: It turned out that we can actually massively optimize here. The previous code was horribly inefficient since it constantly released and re-acquired the lock of the xarray and started each iteration from the base of the array to avoid concurrent modif

Re: [PATCH V2 00/10] Reset improvements for GC10+

2025-05-23 Thread Christian König
On 5/23/25 05:04, Alex Deucher wrote: > On Thu, May 22, 2025 at 5:57 PM Alex Deucher > wrote: >> >> This set improves per queue reset support for GC10+. >> This uses vmid resets for GFX. GFX resets all state >> associated with a vmid and then continues where it >> left off. Since once the IB us

Fixing AMDGPUs gang submit error handling

2025-05-23 Thread Christian König
Hi guys, fifth try at those patches. I think I finally managed to understand how xarray works. There is a high-level and a lower-level API, and we can actually save tons of CPU cycles when we switch to the lower-level API for adding the fences to the xarray. Looks like this is working now, but I

[PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency

2025-05-23 Thread Christian König
It turned out that we can actually massively optimize here. The previous code was horribly inefficient since it constantly released and re-acquired the lock of the xarray and started each iteration from the base of the array to avoid concurrent modification which in our case doesn't exist. Additi

[PATCH 4/4] drm/amdgpu: fix gang submission error handling

2025-05-23 Thread Christian König
For the unlikely case that we ran into an ENOMEM while fixing up the gang submission dependencies we can't clean up any more since the gang members are already armed. Fix this by using pre-allocated dependency slots and re-ordering the code, also fix a double unref since the fence reference is als

[PATCH 2/4] drm/sched: add drm_sched_prealloc_dependency_slots

2025-05-23 Thread Christian König
Sometimes drivers need to be able to submit multiple jobs which depend on each other to different schedulers at the same time, but drm_sched_job_add_dependency() must not fail any more after the first job is initialized. This function preallocates memory for dependency slots so that no ENOMEM ca
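
The reserve-then-commit idea behind this description can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the scheduler's implementation: `struct dep_list`, `dep_list_reserve`, `dep_list_add_reserved`, and `dep_list_free` are hypothetical names, a plain array stands in for the xarray, and a return value of -1 stands in for -ENOMEM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical dependency container; a plain array stands in for the xarray. */
struct dep_list {
	const void **slots;
	size_t used;
	size_t capacity;
};

/*
 * Reserve room for `extra` more dependencies. This is the only step that
 * allocates, so it is the only step that can fail (-1 stands in for -ENOMEM).
 */
static int dep_list_reserve(struct dep_list *l, size_t extra)
{
	size_t needed = l->used + extra;
	const void **p;

	if (needed <= l->capacity)
		return 0;

	p = realloc(l->slots, needed * sizeof(*p));
	if (!p)
		return -1;

	l->slots = p;
	l->capacity = needed;
	return 0;
}

/* Fill a previously reserved slot. No allocation happens, so no failure path. */
static void dep_list_add_reserved(struct dep_list *l, const void *dep)
{
	assert(l->used < l->capacity); /* caller must have reserved a slot */
	l->slots[l->used++] = dep;
}

/* Release the slot storage and reset the list to its empty state. */
static void dep_list_free(struct dep_list *l)
{
	free(l->slots);
	l->slots = NULL;
	l->used = 0;
	l->capacity = 0;
}
```

A caller would reserve all slots while it can still back out cleanly, and only then move past the point of no return, after which adding dependencies involves no allocation.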

[PATCH 3/4] drm/sched: Add a test for prealloced fence slots

2025-05-23 Thread Christian König
Just to exercise the functionality. Signed-off-by: Christian König --- drivers/gpu/drm/scheduler/tests/tests_basic.c | 56 ++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/tests/tests_basic.c b/drivers/gpu/drm/scheduler/tests/tests_basi

Re: [RFC PATCH 1/2] drm/amdgpu: amdgpu_vram_mgr_new(): Clamp lpfn to total vram

2025-05-23 Thread Paneer Selvam, Arunpravin
On 5/15/2025 9:19 PM, Paneer Selvam, Arunpravin wrote: On 5/12/2025 12:41 PM, Paneer Selvam, Arunpravin wrote: On 5/12/2025 12:39 PM, Christian König wrote: On 5/11/25 22:37, Paneer Selvam, Arunpravin wrote: On 5/12/2025 2:03 AM, Paneer Selvam, Arunpravin wrote: On 5/3/2025 5:53 PM,

Re: [PATCH v6 1/3] drm: Create a task info option for wedge events

2025-05-23 Thread Raag Jadav
On Wed, May 21, 2025 at 12:33:21PM -0300, André Almeida wrote: > When a device get wedged, it might be caused by a guilty application. > For userspace, knowing which task was the cause can be useful for some s/cause/involved > situations, like for implementing a policy, logs or for giving a chanc

Re: [PATCH 1/4] drm/sched: optimize drm_sched_job_add_dependency a bit

2025-05-23 Thread Tvrtko Ursulin
On 22/05/2025 17:19, Christian König wrote: On 5/22/25 16:27, Tvrtko Ursulin wrote: On 22/05/2025 14:41, Christian König wrote: Since we already iterated over the xarray we know at which index the new entry should be stored. So instead of using xa_alloc use xa_store and write into the index

[PATCH] drm/amdkfd: Map wptr BO to GART unconditionally

2025-05-23 Thread Lang Yu
This avoids a potential reference count imbalance in amdgpu_amdkfd_free_gtt_mem(dev->adev, (void **)&pqn->q->wptr_bo_gart) and a warning on an unpinned BO in amdgpu_bo_gpu_offset(q->properties.wptr_bo). Compared with adding version checks here and there, this simplifies things. Signed-off-by: Lang Yu --