In the dev core dump, dump the full header FIFO for
each queue. Each FIFO has 8 entries.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 52 ++---
1 file changed, 37 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
b/
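As a rough standalone sketch of what "dump the full header FIFO" means here (the register helper, queue count, and names below are illustrative, not the actual gfx_v9_4_3.c code), the core-dump path walks all 8 FIFO entries per queue instead of reading a single slot:

#include <stdio.h>

#define HQD_HDR_FIFO_DEPTH 8   /* each FIFO has 8 entries */

/* stand-in for the per-queue register read the real driver performs */
static unsigned int read_hdr_fifo_entry(int queue, int idx)
{
        return (unsigned int)((queue << 8) | idx);   /* placeholder value */
}

static void dump_queue_header_fifo(int queue)
{
        for (int i = 0; i < HQD_HDR_FIFO_DEPTH; i++)
                printf("queue %d hdr_fifo[%d] = 0x%08x\n",
                       queue, i, read_hdr_fifo_entry(queue, i));
}

int main(void)
{
        for (int q = 0; q < 4; q++)   /* 4 queues picked only for the example */
                dump_queue_header_fifo(q);
        return 0;
}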
On 3/21/25 10:43, Bert Karwatzki wrote:
> I did some monitoring using this patch (on top of 6.12.18):
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
> index 0760e70402ec..ccd0c9058cee 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt
Add support for DPC recovery based on the refactored code
Signed-off-by: Ce Sun
Reviewed-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 5 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 172 ++---
drivers/gpu/drm/amd/amdgpu/soc15.c | 5 +
3 files chang
Add link reset implementation
Signed-off-by: Ce Sun
Reviewed-by: Hawking Zhang
---
drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 28 +++
drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 2 ++
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 26 +
drivers/gpu/drm
This section describes the DPC high-level workflow for Multi-GPU configurations.
The GPUs are connected to root ports or switches using PCIe, and there are xGMI
links between the GPUs.
Multi-GPU DPC Workflow:
1. When an uncorrectable AER error occurs (assuming GPU-a encountered the error),
the err
Document the 'flags' parameter, which specifies memory allocation behavior
while creating a sync entry.
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c:162: warning: Function parameter or
struct member 'flags' not described in 'amdgpu_sync_fence'
Cc: Christian König
Cc: Alex Deuch
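For illustration only, a minimal kernel-doc block showing the kind of '@flags:' line that silences a W=1 "parameter not described" warning; the function name, parameter names, and wording below are stand-ins, not the actual amdgpu_sync.c text:

struct sync_obj { int unused; };
struct fence    { int unused; };

/**
 * example_sync_fence - remember to sync to this fence
 * @sync:  sync object to add the fence to
 * @f:     fence to add
 * @flags: memory allocation flags used while creating the sync entry
 *
 * Every parameter needs an @name: line, including 'flags', otherwise
 * scripts/kernel-doc emits the warning quoted above.
 */
int example_sync_fence(struct sync_obj *sync, struct fence *f, unsigned int flags)
{
        (void)sync; (void)f; (void)flags;
        return 0;
}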
On 3/20/25 20:01, Ingo Molnar wrote:
>
> * Balbir Singh wrote:
>
>> On 3/17/25 00:09, Bert Karwatzki wrote:
>>> This is related to the amdgpu.gttsize. My laptop has the maximum amount
>>> of memory (64G) and usually gttsize is half of main memory size. I just
>>> tested with cmdline="nokaslr a
Follow the same logic as the other IP types.
Reviewed-by: Prike Liang
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 8 +++-
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_
Hi Dave, Simona,
Fixes for 6.14.
The following changes since commit 6cc30748e17ea2a64051ceaf83a8372484e597f1:
drm/amdgpu: NULL-check BO's backing store when determining GFX12 PTE flags
(2025-03-12 14:59:21 -0400)
are available in the Git repository at:
https://gitlab.freedesktop.org/agd5f
The driver currently sets up one kgq per pipe. As such
adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere.
This is fine for kernel queues, but when we enable user queues
we need to know the actual number of queues per pipe. Decouple
the kgq setup from the actual hardware count. For de
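A small sketch of the decoupling being described (the field names and counts are assumptions, not the real adev->gfx layout): the hardware queue count and the number of kernel queues actually set up become two separate values:

#include <stdio.h>

struct pipe_info {
        int hw_queues_per_pipe;   /* what the hardware really provides */
        int kgq_per_pipe;         /* what the kernel driver sets up */
};

int main(void)
{
        struct pipe_info pipe = {
                .hw_queues_per_pipe = 8,   /* example value; user queues need the real count */
                .kgq_per_pipe       = 1,   /* kernel queues keep using one per pipe */
        };

        printf("hw queues per pipe: %d, kernel queues per pipe: %d\n",
               pipe.hw_queues_per_pipe, pipe.kgq_per_pipe);
        return 0;
}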
On Thu, 20 Mar 2025 at 03:38, Feng, Kenneth wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Thanks Tomasz.
> I confirmed that this change is not in the latest driver-if file.
> However, this is a fw interface provided by the firmware team, so we cannot
> change it.
> That means
In the dev core dump, dump the full header FIFO for
each queue. Each FIFO has 8 entries.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 64 --
1 file changed, 51 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/d
In the dev core dump, dump the full header FIFO for
each queue. Each FIFO has 8 entries.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 41 ++
1 file changed, 35 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/dr
So we can iterate across them when we need to manage
all user queues.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 3 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 15 ++-
drivers/gpu/dr
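A self-contained sketch of the idea (a plain linked list stands in for the kernel's list helpers; all names are illustrative): queues are added to one device-global list so later code can walk every user queue at once:

#include <stdio.h>
#include <stdlib.h>

struct user_queue {
        int id;
        struct user_queue *next;       /* device-global list linkage */
};

struct device_state {
        struct user_queue *userq_list; /* head of the global list */
};

static void track_user_queue(struct device_state *dev, int id)
{
        struct user_queue *q = malloc(sizeof(*q));

        if (!q)
                return;
        q->id = id;
        q->next = dev->userq_list;
        dev->userq_list = q;
}

int main(void)
{
        struct device_state dev = { 0 };

        track_user_queue(&dev, 1);
        track_user_queue(&dev, 2);

        /* e.g. on suspend: iterate across every user queue on the device */
        for (struct user_queue *q = dev.userq_list; q; q = q->next)
                printf("managing user queue %d\n", q->id);

        while (dev.userq_list) {
                struct user_queue *q = dev.userq_list;

                dev.userq_list = q->next;
                free(q);
        }
        return 0;
}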
Similar to KFD, prevent runtime pm while user queues are active.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 30 +
1 file changed, 30 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv
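Roughly, the pattern is a reference held while any user queue exists; the sketch below uses plain counters in place of the real pm_runtime_get()/pm_runtime_put() calls, so the numbers are only illustrative:

#include <stdio.h>

static int rpm_refcount;    /* stand-in for runtime-PM reference accounting */
static int active_userq;    /* number of live user queues */

static void userq_create(void)
{
        if (active_userq++ == 0)
                rpm_refcount++;   /* first queue: block runtime suspend */
}

static void userq_destroy(void)
{
        if (--active_userq == 0)
                rpm_refcount--;   /* last queue gone: allow runtime suspend again */
}

int main(void)
{
        userq_create();
        userq_create();
        userq_destroy();
        userq_destroy();
        printf("rpm refs: %d, active user queues: %d\n", rpm_refcount, active_userq);
        return 0;
}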
GC12 only has 1 MEC.
Fixes: 52cb80c12e8a ("drm/amdgpu: Add gfx v12_0 ip block support (v6)")
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/drivers/gpu/drm/amd/am
On 3/20/2025 10:39 AM, jesse.zh...@amd.com wrote:
> From: "jesse.zh...@amd.com"
>
> - Modify the VM invalidation engine allocation logic to handle SDMA page
> rings.
> SDMA page rings now share the VM invalidation engine with SDMA gfx rings
> instead of
> allocating a separate engine. Th
[AMD Official Use Only - AMD Internal Distribution Only]
Should the SDMA mask be ASIC-specific? I think some ASICs might need it set
to 0x3fc.
Shaoyun.liu
-Original Message-
From: amd-gfx On Behalf Of Alex Deucher
Sent: Thursday, March 20, 2025 10:23 AM
To: amd-gfx@lists.freedeskt
On Wed, 19 Mar 2025 10:28:53 +0100, Jiri Slaby (SUSE) wrote:
> tl;dr if patches are agreed upon, I ask subsys maintainers to take the
> respective ones via their trees (as they are split per subsys), so that
> the IRQ tree can take only the rest. That would minimize churn/conflicts
> during merges.
Warn if the number of pipes exceeds what the MES supports.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 23 +++
1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
b/drivers/gpu/drm/amd/amdgpu/am
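A toy sketch of the check (the limit constant and message are made up for illustration): warn and clamp instead of silently using more pipes than the MES can address:

#include <stdio.h>

#define MES_MAX_PIPES 4   /* illustrative limit, not the real firmware value */

static int check_pipe_count(int hw_pipes)
{
        if (hw_pipes > MES_MAX_PIPES) {
                fprintf(stderr, "warning: %d pipes but MES supports only %d\n",
                        hw_pipes, MES_MAX_PIPES);
                return MES_MAX_PIPES;
        }
        return hw_pipes;
}

int main(void)
{
        printf("using %d pipes\n", check_pipe_count(6));
        return 0;
}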
On 3/20/2025 7:23 PM, Alex Deucher wrote:
> On Thu, Mar 20, 2025 at 9:44 AM Lazar, Lijo wrote:
>>
>>
>>
>> On 3/20/2025 6:21 PM, Alex Deucher wrote:
>>> On Thu, Mar 20, 2025 at 7:14 AM Lazar, Lijo wrote:
On 3/20/2025 12:38 AM, Alex Deucher wrote:
> Break when we get to
On Thu, Mar 20, 2025 at 9:44 AM Lazar, Lijo wrote:
>
>
>
> On 3/20/2025 6:21 PM, Alex Deucher wrote:
> > On Thu, Mar 20, 2025 at 7:14 AM Lazar, Lijo wrote:
> >>
> >>
> >>
> >> On 3/20/2025 12:38 AM, Alex Deucher wrote:
> >>> Break when we get to the end of the supported pipes
> >>> rather than co
On Fri, Mar 14, 2025 at 6:03 AM Flora Cui wrote:
>
> From: Alex Deucher
>
> On chips without native IP discovery support, use the fw binary
> if available, otherwise we can continue without it.
>
> Signed-off-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 38 ++
On Thu, Mar 20, 2025 at 4:29 AM Feng, Kenneth wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Hi Alex,
> The call trace is generated when the gdm is launched, as below.
> I tried running on a standalone workqueue but still see the workqueue is
> flushed.
I think that shou
On Thu, Mar 20, 2025 at 9:02 AM Christian König
wrote:
>
> This looks unnecessary and actually extremely harmful since using kmap()
> is not possible while inside the ring reset.
>
> Remove all the extra mapping and unmapping of the MQDs.
>
> v2: also fix debugfs
>
> Signed-off-by: Christian König
This looks unnecessary and actually extremely harmful since using kmap()
is not possible while inside the ring reset.
Remove all the extra mapping and unmapping of the MQDs.
v2: also fix debugfs
Signed-off-by: Christian König
Reviewed-by: Alex Deucher (v1)
---
drivers/gpu/drm/amd/amdgpu/amdgp
On Thu, Mar 20, 2025 at 7:15 AM Lazar, Lijo wrote:
>
>
>
> On 3/20/2025 12:38 AM, Alex Deucher wrote:
> > Enable pipes on both MECs for MES.
> >
> > Fixes: 745f46b6a99f ("drm/amdgpu: enable mes v12 self test")
> > Signed-off-by: Alex Deucher
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |
On Thu, Mar 20, 2025 at 7:14 AM Lazar, Lijo wrote:
>
>
>
> On 3/20/2025 12:38 AM, Alex Deucher wrote:
> > Break when we get to the end of the supported pipes
> > rather than continuing the loop.
> >
> > Reviewed-by: Shaoyun.liu
> > Signed-off-by: Alex Deucher
> > ---
> > drivers/gpu/drm/amd/amd
[Public]
> From: amd-gfx On Behalf Of Lazar,
> Lijo
> Sent: Thursday, March 20, 2025 7:14 PM
> To: Deucher, Alexander ; amd-
> g...@lists.freedesktop.org
> Cc: Liu, Shaoyun
> Subject: Re: [PATCH 1/4] drm/amdgpu/mes: optimize compute loop handling
>
>
>
> On 3/20/2025 12:38 AM, Alex Deucher wrote
On 19/03/2025 11:23, Christian König wrote:
+ *
+ * Return:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_prealloc_dependency_slots(struct drm_sched_job *job,
+                                            unsigned int num_deps)
+{
+ struct dma_fence *
[Public]
Reviewed-by: Prike Liang
Regards,
Prike
> -Original Message-
> From: amd-gfx On Behalf Of Alex
> Deucher
> Sent: Thursday, March 20, 2025 3:09 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander
> Subject: [PATCH 4/4] drm/amdgpu/mes: clean up SDMA HQD loop
>
On 3/20/2025 12:38 AM, Alex Deucher wrote:
> Enable pipes on both MECs for MES.
>
> Fixes: 745f46b6a99f ("drm/amdgpu: enable mes v12 self test")
> Signed-off-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --
On 3/20/2025 12:38 AM, Alex Deucher wrote:
> Break when we get to the end of the supported pipes
> rather than continuing the loop.
>
> Reviewed-by: Shaoyun.liu
> Signed-off-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(
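The change under discussion is essentially the difference sketched below (bounds and cap are example numbers, not the real MES limits): once the supported pipe count is reached, stop assigning rather than keep going through the rest of the loop with 'continue':

#include <stdio.h>

#define NUM_MEC         2
#define PIPES_PER_MEC   4
#define SUPPORTED_PIPES 6   /* example cap */

int main(void)
{
        int assigned = 0;

        for (int mec = 0; mec < NUM_MEC; mec++) {
                for (int pipe = 0; pipe < PIPES_PER_MEC; pipe++) {
                        if (assigned >= SUPPORTED_PIPES)
                                break;   /* stop here; the old code kept looping */
                        printf("enable mec %d pipe %d\n", mec, pipe);
                        assigned++;
                }
        }
        return 0;
}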
On 20/03/2025 09:58, Pierre-Eric Pelloux-Prayer wrote:
Its only purpose was for trace events, but jobs can already be
uniquely identified using their fence.
The downside of using the fence is that it's only available
after 'drm_sched_job_arm' has been called, which is true for all trace
events that
* Balbir Singh wrote:
> On 3/17/25 00:09, Bert Karwatzki wrote:
> > This is related to the amdgpu.gttsize. My laptop has the maximum amount
> > of memory (64G) and usually gttsize is half of main memory size. I just
> > tested with cmdline="nokaslr amdgpu.gttsize=2048" and the problem does
>
[Public]
Acked-by: Prike Liang
Regards,
Prike
> -Original Message-
> From: amd-gfx On Behalf Of Alex
> Deucher
> Sent: Thursday, March 20, 2025 3:09 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander
> Subject: [PATCH 3/4] drm/amdgpu/mes: drop MES 10.x leftovers
>
>
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Prike Liang
Regards,
Prike
> -Original Message-
> From: amd-gfx On Behalf Of Alex
> Deucher
> Sent: Thursday, March 20, 2025 3:09 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander
> Subject: [PAT
[Public]
Reviewed-by: Prike Liang
Regards,
Prike
> -Original Message-
> From: amd-gfx On Behalf Of Alex
> Deucher
> Sent: Thursday, March 20, 2025 3:09 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Liu, Shaoyun
>
> Subject: [PATCH 1/4] drm/amdgpu/mes: optimiz
On 20.03.25 at 10:58, Pierre-Eric Pelloux-Prayer wrote:
> Its only purpose was for trace events, but jobs can already be
> uniquely identified using their fence.
>
> The downside of using the fence is that it's only available
> after 'drm_sched_job_arm' has been called, which is true for all trace
> eve
On 20.03.25 at 10:58, Pierre-Eric Pelloux-Prayer wrote:
> Log fences using the same format for coherency.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer
Oh, good catch! It's been about a decade since we switched to 64-bit sequence
numbers :)
Reviewed-by: Christian König
> ---
> drivers/gpu/drm
Log fences using the same format for coherency.
Signed-off-by: Pierre-Eric Pelloux-Prayer
---
drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 22 ++
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
b/drivers/gpu/drm/amd/am
Its only purpose was for trace events, but jobs can already be
uniquely identified using their fence.
The downside of using the fence is that it's only available
after 'drm_sched_job_arm' has been called, which is true for all trace
events that used job.id, so they can safely switch to using it.
Suggest
This will be used in a later commit to trace the drm client_id in
some of the gpu_scheduler trace events.
This requires changing all the users of drm_sched_job_init to
add an extra parameter.
The newly added drm_client_id field in the drm_sched_fence is a bit
of a duplicate of the owner one. One
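A heavily simplified, self-contained sketch of the interface change being described (stub types; the real drm_sched_job_init takes more arguments than shown): every caller passes the drm client id, which is stored in the scheduler fence so trace events can report it:

#include <stdint.h>
#include <stdio.h>

struct sched_fence { uint64_t drm_client_id; };
struct sched_job   { struct sched_fence fence; };

/* before: init(job, owner); after: init(job, owner, drm_client_id) */
static int sched_job_init(struct sched_job *job, void *owner,
                          uint64_t drm_client_id)
{
        (void)owner;
        job->fence.drm_client_id = drm_client_id;   /* carried into trace events */
        return 0;
}

int main(void)
{
        struct sched_job job;

        sched_job_init(&job, NULL, 42);
        printf("job traced with drm client id %llu\n",
               (unsigned long long)job.fence.drm_client_id);
        return 0;
}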
Hi,
The initial goal of this series was to improve the drm and amdgpu
trace events to be able to expose more of the inner workings of
the scheduler and drivers to developers via tools.
Then, the series evolved to become focused only on gpu_scheduler.
The changes around vblank events will be part
On Wed, 19 Mar 2025 at 10:24, Vignesh Raman
wrote:
>
> Hi Helen,
>
> On 19/03/25 00:22, Helen Mae Koike Fornazier wrote:
> > On Fri, 14 Mar 2025 at 05:59, Vignesh Raman
> > wrote:
> >>
> >> LAVA was recently patched [1] with a fix on how parameters are parsed in
> >> `lava-t
In the case of injecting an uncorrected error with a background workload,
the deferred error among the uncorrected errors needs to be identified
by checking the deferred and poison bits of the status register.
v2: refine checking for deferred errors
v2: log possible DEs among CEs
v2: generate CPER records for DE
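As a sketch of the classification rule described above (the bit positions below are assumptions, not the real MCA status layout): an uncorrected error is treated as deferred when both the deferred and poison bits are set in the status register:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define STATUS_DEFERRED (1ULL << 44)   /* assumed bit position */
#define STATUS_POISON   (1ULL << 43)   /* assumed bit position */

static bool is_deferred_error(uint64_t status)
{
        return (status & STATUS_DEFERRED) && (status & STATUS_POISON);
}

int main(void)
{
        uint64_t status = STATUS_DEFERRED | STATUS_POISON;

        printf("deferred error: %s\n", is_deferred_error(status) ? "yes" : "no");
        return 0;
}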
[AMD Official Use Only - AMD Internal Distribution Only]
Hi Alex,
The call trace is generated when the gdm is launched, as below.
I tried running on a standalone workqueue but still see the workqueue is
flushed.
Thanks.
[ 21.558439] [ cut here ]
[ 21.558443] workqueue
[AMD Official Use Only - AMD Internal Distribution Only]
Thanks, as discussed offline, I will improve it.
Best Regards,
Dean
From: Wang, Yang(Kevin)
Sent: Thursday, March 20, 2025 3:05 PM
To: Liu, Xiang(Dean) ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only]
Series is
Reviewed-by: Hawking Zhang
Regards,
Hawking
-Original Message-
From: Kamal, Asad
Sent: Thursday, March 20, 2025 15:04
To: amd-gfx@lists.freedesktop.org; Lazar, Lijo
Cc: Zhang, Hawking ; Ma, Le ; Zhang,
Morris ; Kamal,
[AMD Official Use Only - AMD Internal Distribution Only]
+       if (type == ACA_ERROR_TYPE_UE)
+               aca_log_aca_error(handle, ACA_ERROR_TYPE_DEFERRED, err_data);
+
        return aca_log_aca_error(handle, type, err_data);
}
It seems that the above code is incorrect, which may lead to
A few of the metrics for smu_v13_0_12 are not reported
in Q10 format; remove the UQ10-to-UINT conversion for those.
Signed-off-by: Asad Kamal
Reviewed-by: Lijo Lazar
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_12_ppt.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --gi
A few of the metrics for smu_v13_0_6 are not reported
in Q10 format; remove the UQ10-to-UINT conversion for those.
v2: Move smu_v13_0_12 changes to a separate patch (Kevin)
Signed-off-by: Asad Kamal
Reviewed-by: Lijo Lazar
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 4 ++--
1 fi
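For context, UQ10 is unsigned fixed point with 10 fractional bits, so a UQ10-to-UINT conversion is just a shift by 10; the sketch below (illustrative values only) shows why applying it to a metric the firmware already reports as a plain integer would scale it down incorrectly:

#include <stdint.h>
#include <stdio.h>

static uint32_t uq10_to_uint(uint32_t q10)
{
        return q10 >> 10;   /* drop the 10 fractional bits (divide by 1024) */
}

int main(void)
{
        uint32_t q10_metric   = 51200;   /* 50.0 encoded in UQ10 */
        uint32_t plain_metric = 37;      /* already a plain integer */

        printf("UQ10 metric: %u -> %u\n", q10_metric, uq10_to_uint(q10_metric));
        printf("plain metric wrongly converted: %u -> %u\n",
               plain_metric, uq10_to_uint(plain_metric));   /* 37 -> 0 */
        return 0;
}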