Some items were defined in both the general and DC glossaries.
Remove the duplicate entries.
Fixes: 2df30ae0ba0b ("Documentation/gpu: Add acronyms for some firmware
components")
Reported-by: Stephen Rothwell
Cc: Rodrigo Siqueira
Signed-off-by: Alex Deucher
---
Documentation/gpu/amdgpu/display
For cores 1 through 9 repair the core reset sequence by
adjusting offsets to access the expected registers.
Signed-off-by: Sathishkumar S
---
drivers/gpu/drm/amd/amdgpu/jpeg_v5_0_1.c | 14 +-
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/jpe
On 2/26/25 04:10, Vitaliy Shevtsov wrote:
There is a type mismatch between what CalculateDynamicMetadataParameters()
takes and what is passed to it. Currently this function accepts several
args as signed long but it's called with unsigned integers. On some systems
where long is 32 bits and one
Reviewed-by: Alex Hung
On 2/26/25 01:37, Ma Ke wrote:
A null pointer dereference could occur when pipe_ctx->plane_state
is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not
null before accessing it. This prevents a null pointer dereference.
Found by code review.
Cc: sta...@
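The kind of guard this patch describes can be sketched as follows. This is a minimal illustration, not the actual display code: the struct layouts and the helper example_pipe_has_visible_plane() are hypothetical stand-ins, and only the member name plane_state comes from the message above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct example_plane_state {
	int visible;
};

struct example_pipe_ctx {
	/* May be NULL when no plane is attached to the pipe. */
	struct example_plane_state *plane_state;
};

/*
 * Returns 1 if the pipe has a visible plane, 0 otherwise.
 * The pointer is checked before it is dereferenced, so a pipe
 * without a plane is reported as "not visible" instead of crashing.
 */
int example_pipe_has_visible_plane(const struct example_pipe_ctx *pipe_ctx)
{
	if (!pipe_ctx || !pipe_ctx->plane_state)
		return 0;

	return pipe_ctx->plane_state->visible != 0;
}
```

Callers can then pass a pipe whose plane_state is NULL and get a well-defined result rather than a fault.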
Applied. Thanks!
Alex
On Wed, Feb 26, 2025 at 2:04 PM Alex Hung wrote:
>
> Reviewed-by: Alex Hung
>
> On 2/26/25 01:37, Ma Ke wrote:
> > Null pointer dereference issue could occur when pipe_ctx->plane_state
> > is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not
> > null bef
Applied. Thanks!
On Wed, Feb 26, 2025 at 8:11 AM André Almeida wrote:
>
> Prior to the addition of ring reset, the debug option
> `debug_disable_soft_recovery` could be used to force a full device
> reset. Now that we have ring reset, create a debug option to disable
> them in amdgpu, forcing th
Rework pm_update_grace_period() into the cleaner
pm_config_dequeue_wait_counts(). Previously, grace_period was
overloaded as both a variable and a macro, making it inflexible to configure
additional dequeue wait times.
pm_config_dequeue_wait_counts() now takes in a cmd / variable. This
allows f
Dequeue retry timeout controls the interval between checks for unmet
conditions. On MI series, reduce this from 0x40 to 0x1 (~1 us). The
cost of additional bandwidth consumed by CP when polling memory
shouldn't be substantial.
Signed-off-by: Harish Kasiviswanathan
---
.../drm/amd/amdgpu/amdgpu_
Hi Dave, Simona,
Fixes for 6.14.
The following changes since commit d082ecbc71e9e0bf49883ee4afd435a77a5101b6:
Linux 6.14-rc4 (2025-02-23 12:32:57 -0800)
are available in the Git repository at:
https://gitlab.freedesktop.org/agd5f/linux.git
tags/amd-drm-fixes-6.14-2025-02-26
for you to fe
Prior to the addition of ring reset, the debug option
`debug_disable_soft_recovery` could be used to force a full device
reset. Now that we have ring reset, create a debug option to disable
them in amdgpu, forcing the driver to go with the full device
reset path again when both options are combined
There is a spelling mistake and a grammatical error in a dev_err
message. Fix it.
Signed-off-by: Colin Ian King
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
b/drivers/gpu/drm/amd/amdgpu/
Applied. Thanks!
On Wed, Feb 26, 2025 at 4:13 AM Zhou1, Tao wrote:
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> Reviewed-by: Tao Zhou
>
> > -----Original Message-----
> > From: Colin Ian King
> > Sent: Wednesday, February 26, 2025 4:58 PM
> > To: Deucher, Alexander ; Koenig
On 02/26, Alex Deucher wrote:
> Some items were defined in both the general and DC glossaries.
> Remove the duplicate entries.
>
> Fixes: 2df30ae0ba0b ("Documentation/gpu: Add acronyms for some firmware
> components")
> Reported-by: Stephen Rothwell
> Cc: Rodrigo Siqueira
> Signed-off-by: Alex
Add fan abnormal detection on SMU v14.0.2 and SMU v14.0.3.
Signed-off-by: Kenneth Feng
---
.../gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c| 52 +++
1 file changed, 52 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0.c
b/drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v1
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Yang Wang
Best Regards,
Kevin
-----Original Message-----
From: Kenneth Feng
Sent: Thursday, February 27, 2025 10:16
To: amd-gfx@lists.freedesktop.org
Cc: Wang, Yang(Kevin) ; Feng, Kenneth
Subject: [PATCH] drm/amd/pm: add f
For cores 1 through 7 repair the core reset sequence by
adjusting offsets to access the expected registers.
Signed-off-by: Sathishkumar S
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 14 +-
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/jpe
If the GPU is in reset, destroy_queue returns -EIO; pqm_destroy_queue should
still delete the queue from process_queue_list and free the resource.
Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +-
1 file changed, 1 insertion(+), 1 deleti
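The cleanup rule described in this patch can be sketched in isolation. Everything below is a hypothetical user-space model, not the KFD code: the list type, example_hw_destroy(), and the choice to propagate -EIO to the caller after cleanup are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical singly linked list standing in for process_queue_list. */
struct example_queue_node {
	int qid;
	struct example_queue_node *next;
};

/* Hypothetical hardware teardown: fails with -EIO while the GPU is in reset. */
int example_hw_destroy(int gpu_in_reset)
{
	return gpu_in_reset ? -EIO : 0;
}

int example_add_queue(struct example_queue_node **head, int qid)
{
	struct example_queue_node *node = malloc(sizeof(*node));

	if (!node)
		return -ENOMEM;
	node->qid = qid;
	node->next = *head;
	*head = node;
	return 0;
}

/*
 * Remove the queue from the list and free it even when the hardware
 * teardown reports -EIO; any other error leaves the queue in place.
 */
int example_destroy_queue(struct example_queue_node **head, int qid,
			  int gpu_in_reset)
{
	struct example_queue_node **pp = head;
	struct example_queue_node *node;
	int ret;

	while (*pp && (*pp)->qid != qid)
		pp = &(*pp)->next;
	if (!*pp)
		return -EINVAL;

	ret = example_hw_destroy(gpu_in_reset);
	if (ret && ret != -EIO)
		return ret;

	node = *pp;
	*pp = node->next;
	free(node);
	return ret;
}
```

The point of the sketch is only that an -EIO from the hardware path does not leave a stale list entry behind; whether the real function returns the error afterwards is not stated in the message.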
If the HW scheduler hangs and mode1 reset is used to recover the GPU, KFD signals
user space to abort the processes. After the aborted processes exit, user queues
still use the GPU to access system memory before the h/w is reset, while the KFD
cleanup worker frees system memory and VRAM.
There is use-after-free race bu
debugfs hang_hws is used by the GPU reset test with HWS; for MES this crashes
the kernel with a NULL pointer access because dqm->packet_mgr is not set up
for the MES path.
Skip GPUs with MES for now; the MES hang_hws debugfs interface will be
supported later.
Signed-off-by: Philip Yang
Reviewed-by: Kent Russell
If we wait for GPU reset to finish in KFD release_work, there is a WARNING:
possible circular locking dependency detected
#2 kfd_create_process
kfd_process_mutex
flush kfd release work
#1 kfd release work
wait for amdgpu reset work
#0 amdgpu_device_gpu_reset
k
With the GPU reset-domain worker implemented, the KFD hw_exception worker is no
longer needed; just call amdgpu_amdkfd_gpu_reset directly from
kfd_hws_hang.
Suggested-by: Felix Kuehling
Signed-off-by: Philip Yang
Reviewed-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 11
A mode1 reset test running with compute applications triggers many different
failures, such as machine reboot, kernel crash with general protection fault,
NULL pointer access, or CPU page fault, from random call backtraces.
With KASAN and slub_debug enabled kernel, we capture slub left-redzone
ov
There is a type mismatch between what CalculateDynamicMetadataParameters()
takes and what is passed to it. Currently this function accepts several
args as signed long but it's called with unsigned integers and integer. On
some systems where long is 32 bits and one of these unsigned int params is
gr
The series is
Reviewed-by: Felix Kuehling
On 2025-02-26 12:14, Philip Yang wrote:
A mode1 reset test running with compute applications triggers many different
failures, such as machine reboot, kernel crash with general protection fault,
NULL pointer access or cpu page fault etc from random calli
Similar to compute queue reset, flag SDMA queue reset capabilities to
user space for safe testing.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 5 +
drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 +
include/uapi/linux/kfd_sysfs.h| 3 +++
3 files chan
To reset hung SDMA queues on GFX 9.4+ for the GFX9 family, a soft reset
must be issued through SMU. Since soft resets will reset an entire SDMA
engine, use a common KGD call to do the reset as the KGD will handle
avoiding a reset of in flight GFX and paging queues on that engine.
In addition, cre
[AMD Official Use Only - AMD Internal Distribution Only]
A minor comment. With that Reviewed-by: Harish Kasiviswanathan
Please add pr_debug() comment stating that the size is incorrect.
-----Original Message-----
From: Cornwall, Jay
Sent: Wednesday, February 26, 2025 3:55 PM
To: Yat Sin, Davi
On 2/25/2025 20:41, David Yat Sin wrote:
If queue size is less than minimum, clamp it to minimum to prevent
underflow when writing queue mqd.
Signed-off-by: David Yat Sin
Reviewed-by: Jay Cornwall
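Why clamping avoids the underflow can be illustrated with a small sketch. The minimum value, the encoding formula, and both function names are hypothetical; the message does not say how the queue size is written into the mqd, so the size / 4 - 1 encoding here is only an assumed example of an unsigned computation that wraps for tiny sizes.

```c
#include <stdint.h>

/* Hypothetical minimum; the real limit is defined by the driver. */
#define EXAMPLE_MIN_QUEUE_SIZE 1024u

/* Clamp a user-requested queue size up to the minimum. */
uint32_t example_clamp_queue_size(uint32_t requested)
{
	return requested < EXAMPLE_MIN_QUEUE_SIZE ?
	       EXAMPLE_MIN_QUEUE_SIZE : requested;
}

/*
 * Hypothetical register encoding: number of dwords minus one.
 * For sizes below 4 bytes the unsigned subtraction wraps around
 * to UINT32_MAX, which is the kind of underflow a clamp prevents.
 */
uint32_t example_encode_queue_size(uint32_t size_bytes)
{
	return size_bytes / 4u - 1u;
}
```

With the clamp applied first, example_encode_queue_size() never sees a value small enough to wrap.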
Reviewed-by: Alex Hung
On 2/26/25 13:28, Vitaliy Shevtsov wrote:
There is a type mismatch between what CalculateDynamicMetadataParameters()
takes and what is passed to it. Currently this function accepts several
args as signed long but it's called with unsigned integers and integer. On
some sys
[AMD Official Use Only - AMD Internal Distribution Only]
ping
-----Original Message-----
From: Xiaogang.Chen
Sent: Monday, February 24, 2025 5:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Chen, Xiaogang
Subject: [PATCH] drm/amdkfd: remove kfd_pasid.c from amdgpu driver build
On 2025-02-25 21:41, David Yat Sin wrote:
If queue size is less than minimum, clamp it to minimum to prevent
underflow when writing queue mqd.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 4
include/uapi/linux/kfd_ioctl.h | 2 ++
2 files chang
On 2025-02-24 17:59, Xiaogang.Chen wrote:
From: Xiaogang Chen
Since kfd now uses pasid values from the graphics driver, the kfd pasid
functions are no longer needed.
Signed-off-by: Xiaogang Chen
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/Makefile| 1 -
drivers/gpu/drm/amd/amdkf
Disable gfxoff on the specific SKU based on the requirement.
Signed-off-by: Kenneth Feng
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu
On 2/26/2025 1:37 PM, Kenneth Feng wrote:
> disable gfxoff on the specific sku based on the requirement
>
> Signed-off-by: Kenneth Feng
Reviewed-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c | 5 +
> 1 file changed, 5 insertions(+)
>
> diff
There is a type mismatch between what CalculateDynamicMetadataParameters()
takes and what is passed to it. Currently this function accepts several
args as signed long but it's called with unsigned integers. On some systems
where long is 32 bits and one of these input params is greater than INT_MAX
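The hazard described here can be modelled portably by using int32_t as a stand-in for a 32-bit long. The functions below are hypothetical and unrelated to CalculateDynamicMetadataParameters(); they only show that an unsigned value above INT32_MAX has no representation in the signed parameter type, while a matching unsigned parameter keeps it intact.

```c
#include <stdint.h>

/*
 * Hypothetical callee with a signed 32-bit parameter, standing in for
 * a `long` parameter on an ABI where long is 32 bits wide.
 */
int64_t example_double_s32(int32_t value)
{
	return (int64_t)value * 2;
}

/* Hypothetical fixed callee: the parameter type matches the caller's. */
uint64_t example_double_u32(uint32_t value)
{
	return (uint64_t)value * 2u;
}

/*
 * Returns 1 if an unsigned value can be passed to the signed callee
 * without changing its meaning, 0 if it exceeds INT32_MAX.
 */
int example_fits_in_s32(uint32_t value)
{
	return value <= (uint32_t)INT32_MAX;
}
```

Passing a value such as 3000000000u to example_double_s32() relies on an implementation-defined conversion, which on common compilers yields a negative number; example_double_u32() handles the same input without any conversion.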
A lot of the workloads create jobs with just one to two IBs, and if we re-order
some struct members and shrink some others we can stop those allocations
spilling into the 1k SLAB bucket.
Before:
sizeof(struct amdgpu_job) + 2 * sizeof(struct amdgpu_ib) = 480 + 80 = 560
After:
sizeof(struct a
Let's group same-width types closer together to reduce the number and size
of the holes in the struct.
Before:
/* size: 480, cachelines: 8, members: 30 */
/* sum members: 469, holes: 3, sum holes: 11 */
/* forced alignments: 1 */
/* last cacheline: 32 bytes */
Afte
Group the 32-bit and 64-bit members together to remove the hole from the struct.
Before:
/* size: 40, cachelines: 1, members: 5 */
/* sum members: 32, holes: 1, sum holes: 4 */
/* padding: 4 */
/* last cacheline: 40 bytes */
After:
/* size: 32, cachelines: 1, members
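The effect of grouping members by width can be reproduced with a hypothetical five-member struct. The member names and order below are invented to match the reported numbers (32 bytes of members, one 4-byte hole, 4 bytes of tail padding); they are not the layout of the struct touched by the patch, and the 40-byte figure assumes an ABI where uint64_t is 8-byte aligned, such as x86-64.

```c
#include <stddef.h>
#include <stdint.h>

/* Mixed widths: a 4-byte hole after `b` and 4 bytes of tail padding. */
struct example_before {
	uint64_t a;
	uint32_t b;	/* followed by a hole so that `c` is 8-byte aligned */
	uint64_t c;
	uint64_t d;
	uint32_t e;	/* followed by tail padding up to a multiple of 8 */
};

/* Same members, 64-bit fields first: no hole and no tail padding. */
struct example_after {
	uint64_t a;
	uint64_t c;
	uint64_t d;
	uint32_t b;
	uint32_t e;
};
```

Comparing sizeof(struct example_before) with sizeof(struct example_after) shows the saving; a tool such as pahole reports the same holes and padding from debug info.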
By moving the booleans to flags and shrinking some fields we can stop
spilling job allocation into the 1k SLAB even with two appended indirect
buffers.
End result for struct amdgpu_job:
/* size: 448, cachelines: 7, members: 24 */
/* forced alignments: 1 */
So appending two IB buf
A null pointer dereference could occur when pipe_ctx->plane_state
is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not
null before accessing it. This prevents a null pointer dereference.
Found by code review.
Cc: sta...@vger.kernel.org
Fixes: 3be5262e353b ("drm/amd/display: Ren
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Tao Zhou
> -----Original Message-----
> From: Colin Ian King
> Sent: Wednesday, February 26, 2025 4:58 PM
> To: Deucher, Alexander ; Koenig, Christian
> ; David Airlie ; Simona Vetter
> ; Zhou1, Tao ; amd-
> g...@lists.freede
Leftover from the MES self tests that were removed previously.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 800
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h | 41 --
2 files changed, 841 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdg
Just use the default values. There's no need to
get the value from hardware and it could cause problems
if we do that at runtime and gfxoff is active.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 47 ++
1 file changed, 32 insertions(+), 15 de
Make sure these are set properly to ensure compatibility if
we ever update the IOCTL interface.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 14 ++
1 file changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
b/drive
Take a reference when we create a queue and drop it
when we destroy the queue. We need to keep the device
active while user queues are active.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | 14 +-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --
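The create/destroy pairing described above can be sketched with a plain counter. This is a hypothetical model: the structs, the usage_count field, and example_device_can_suspend() are invented for illustration, and the real driver would use the kernel's own reference or runtime power management helpers rather than a bare integer.

```c
#include <stddef.h>

/* Hypothetical device with a count of active users. */
struct example_device {
	int usage_count;
};

struct example_queue {
	struct example_device *dev;
};

/* Creating a queue takes a reference on the device. */
void example_queue_create(struct example_queue *q, struct example_device *dev)
{
	dev->usage_count++;
	q->dev = dev;
}

/* Destroying a queue drops the reference taken at creation. */
void example_queue_destroy(struct example_queue *q)
{
	if (q->dev) {
		q->dev->usage_count--;
		q->dev = NULL;
	}
}

/* The device may only power down when no queue holds a reference. */
int example_device_can_suspend(const struct example_device *dev)
{
	return dev->usage_count == 0;
}
```

Because every create is balanced by exactly one destroy, the device stays active for as long as any user queue exists and becomes eligible to suspend once the last one is gone.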
Just use the default values. There's no need to
get the value from hardware and it could cause problems
if we do that at runtime and gfxoff is active.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 48 ++
1 file changed, 33 insertions(+), 15 de
Found some typos while exploring amdgpu code.
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 +-
drivers/gpu/drm/amd/amdgpu/uvd_v3_1.c | 6 +++---
drivers/gpu/drm/amd/amdgpu/uvd_v4_2.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/vce_v2_0.c | 2 +-
4 files changed,
Fix typos
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/si.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c
index d1c06d0d6a2d..68f6f4ec8a47 100644
--- a/drivers/gpu/drm/amd/amdgpu/si.c
Found some typos while exploring radeon code.
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/radeon/radeon_device.c | 6 +++---
drivers/gpu/drm/radeon/radeon_fence.c | 2 +-
drivers/gpu/drm/radeon/si.c| 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/driver
dce_v6_0_set_crtc_vline_interrupt_state() was empty without any info to
inform the user.
Based on DCE8 and DCE10 code.
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 44 +++
1 file changed, 44 insertions(+)
diff --git a/drivers/gpu/drm/amd/a
This series unifies some value definitions between DCE6, 8 and 10.
It also adds missing code for dce_v6_0_soft_reset() and
dce_v6_0_set_crtc_vline_interrupt_state().
Alexandre Demers (6):
drm/amdgpu: add or move defines for DCE6 in sid.h
drm/amdgpu: add dce_v6_0_soft_reset() to DCE6
drm/
For coherence with DCE8 and DCE10, add or move some values under sid.h.
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 63 ++-
drivers/gpu/drm/amd/amdgpu/si_enums.h | 7 ---
drivers/gpu/drm/amd/amdgpu/sid.h | 29 +---
3 files chan
A few returns are not where they should be.
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 14 +-
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
index fd2eb454a5d8..
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 18 --
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
index e805c4f9222c..fd2eb454a5d8 100644
--- a/drivers/gp
DCE6 was missing soft reset, but it was easily identifiable under radeon.
This should be it, pretty much as it is done under DCE8 and DCE10.
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/dce_v6_0.c | 62 ---
1 file changed, 57 insertions(+), 5 deletions(-
Define pin_offsets values in the same way it is done in DCE8
Signed-off-by: Alexandre Demers
---
drivers/gpu/drm/amd/amdgpu/cikd.h | 9 +
drivers/gpu/drm/amd/amdgpu/dce_v8_0.c | 14 +++---
2 files changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/am