On 2/13/2025 12:16 PM, Lazar, Lijo wrote:
On 2/13/2025 8:24 AM, Sathishkumar S wrote:
Add helper functions to handle per-instance and per-core
initialization and deinitialization in JPEG4_0_3.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/d
On 2/13/2025 8:24 AM, Sathishkumar S wrote:
> Add helper functions to handle per-instance and per-core
> initialization and deinitialization in JPEG4_0_3.
>
> Signed-off-by: Sathishkumar S
> Acked-by: Christian König
> Reviewed-by: Leo Liu
> ---
> drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c |
On 2/12/2025 9:27 PM, Jonathan Kim wrote:
> Deprecate KFD XGMI peer info calls in favour of calling directly from
> simplified XGMI peer info functions.
>
> Signed-off-by: Jonathan Kim
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 42 --
> drivers/gpu/drm/amd/amdgpu/amd
From: "jesse.zh...@amd.com"
This patch adds a reset function pointer to the SDMA v4.4.2 page ring
functionality. The new function pointer `reset` is set to
`sdma_v4_4_2_reset_queue`, which is responsible for resetting the SDMA queue.
Changes:
- Add `reset` function pointer to `sdma_v4_4_2_page_r
From: "jesse.zh...@amd.com"
This patch updates the SDMA scheduler mask handling to include the page queue
if it exists. The scheduler mask is calculated based on the number of SDMA
instances and the presence of the page queue. The mask is updated to reflect
the state of both the SDMA gfx ring and
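A rough illustration of the idea in this description, not the patch itself: the mask needs one bit per schedulable ring, so its width depends on the instance count and on whether each instance also has a page queue. The function name `toy_sched_mask` and the bit layout (contiguous low bits, one per ring) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a mask with one bit per ring.
 * Assumed layout: each instance contributes one bit (gfx ring only)
 * or two bits (gfx ring plus page ring).
 */
static uint64_t toy_sched_mask(unsigned int num_instances, bool has_page_queue)
{
	unsigned int rings_per_instance = has_page_queue ? 2 : 1;
	unsigned int num_rings = num_instances * rings_per_instance;

	assert(num_rings <= 64);
	if (num_rings == 64)
		return UINT64_MAX;	/* avoid shifting by the full width */
	return (UINT64_C(1) << num_rings) - 1;
}
```

With four instances this gives a 4-bit mask without a page queue and an 8-bit mask with one.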
From: "jesse.zh...@amd.com"
This patch includes the remaining improvements to the SDMA reset logic:
- Added `gfx_guilty` and `page_guilty` flags to track guilty queues.
- Updated the reset and resume functions to handle the guilty state.
- Cached the `rptr` before reset.
v2:
1.replace the cal
From: "jesse.zh...@amd.com"
This patch introduces the `is_guilty` callbacks for the GFX and PAGE rings.
These callbacks check if a ring is guilty of causing a timeout or error.
Suggested-by: Alex Deucher
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 30
From: "jesse.zh...@amd.com"
This patch updates the `amdgpu_job_timedout` function to check if
the ring is actually guilty of causing the timeout. If not, it
skips error handling and fence completion.
Suggested-by: Alex Deucher
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_j
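The control flow described above can be sketched in user-space C roughly as follows. This is only a model under stated assumptions: `struct ring`, `struct ring_funcs`, `handle_job_timeout` and `toy_is_guilty` are hypothetical stand-ins, not the driver's actual types or the body of `amdgpu_job_timedout`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct ring;

struct ring_funcs {
	/* Optional callback: returns true if this ring caused the hang. */
	bool (*is_guilty)(struct ring *ring);
};

struct ring {
	const struct ring_funcs *funcs;
	bool hung;		/* toy hardware state */
	int errors_handled;	/* counts how often recovery ran */
};

/* Returns true if recovery ran, false if the timeout was skipped. */
static bool handle_job_timeout(struct ring *ring)
{
	assert(ring && ring->funcs);

	/* Rings without the callback keep the old behaviour: always recover. */
	if (ring->funcs->is_guilty && !ring->funcs->is_guilty(ring))
		return false;

	ring->errors_handled++;
	return true;
}

/* Toy callback: the ring is guilty exactly when it is marked hung. */
static bool toy_is_guilty(struct ring *ring)
{
	return ring->hung;
}
```

Keeping the callback optional means rings that never register `is_guilty` behave exactly as before.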
From: "jesse.zh...@amd.com"
This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
function to differentiate
between reset requests originating from the KGD and KFD.
This change ensures proper synchronization between KGD and KFD during SDMA
resets.
If the caller is KFD, th
From: "jesse.zh...@amd.com"
This patch introduces the following changes:
- Add `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset.
- Add `is_guilty` callback to the `amdgpu_ring_funcs` structure to check if a
ring is guilty of causing a timeout.
Suggested-by:
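As a minimal sketch of why a cached read pointer is useful: if a reset clears the live read pointer, a snapshot taken just before the reset is the only record of how far the ring had progressed. The struct and function names below are invented for illustration, and the assumption that reset zeroes the pointer is part of the toy model, not something the preview states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring model; field names mirror the description only loosely. */
struct toy_ring {
	uint64_t rptr;		/* live read pointer, assumed lost on reset */
	uint64_t cached_rptr;	/* snapshot taken just before reset */
};

static void toy_ring_reset(struct toy_ring *ring)
{
	assert(ring);
	ring->cached_rptr = ring->rptr;	/* save first */
	ring->rptr = 0;			/* model the reset clearing the pointer */
}
```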
From: "jesse.zh...@amd.com"
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:
1. **Refactored SDMA Reset Logic**:
- Split the `sdma_v4_4_2_reset_queue` function into two sep
From: "jesse.zh...@amd.com"
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:
1. Added `amdgpu_sdma_reset_queue`:
- Resets a specific SDMA queue by instance ID.
- Invokes registered pre-reset and post-reset
[AMD Official Use Only - AMD Internal Distribution Only]
From: Yang, Philip
Sent: Wednesday, February 12, 2025 10:31 PM
To: Deng, Emily ; Yang, Philip ; Chen,
Xiaogang ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: Fix the deadlock in svm_range_restore_work
On 2025-02-12 0
Add helper functions to handle per-instance and per-core
initialization and deinitialization in JPEG4_0_3.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 190 ---
1 file changed, 98 insertions(+),
Add ring reset function callback for JPEG2_5_0 to
recover from job timeouts without a full gpu reset.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 16 +++-
1 file changed, 15 insertions(+), 1 deletion(-)
Add helper functions to handle per-instance initialization
and deinitialization in JPEG2_5_0.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 102 +
1 file changed, 55 insertions(+), 47 deletions(
Add ring reset function callback for JPEG2_0_0 to
recover from job timeouts without a full gpu reset.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v2_0.c | 15 ++-
1 file changed, 14 insertions(+), 1 deletion(-)
d
Add helper functions to handle per-instance initialization
and deinitialization in JPEG2_5_0.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v2_5.c | 102 +
1 file changed, 55 insertions(+), 47 deletions(
Add ring reset function callback for JPEG3_0_0 to
recover from job timeouts without a full gpu reset.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v3_0.c | 15 ++-
1 file changed, 14 insertions(+), 1 deletion(-)
d
Add ring reset function callback for JPEG4_0_0 to
recover from job timeouts without a full gpu reset.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0.c | 19 +--
1 file changed, 13 insertions(+), 6 deletions(
Add helper functions to handle per-instance and per-core
initialization and deinitialization in JPEG4_0_3.
Signed-off-by: Sathishkumar S
Acked-by: Christian König
Reviewed-by: Leo Liu
---
drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 190 ---
1 file changed, 98 insertions(+),
This patch series enables jpeg ring reset callback to recover
from job timeouts without having to do a full gpu reset.
V2:
- sched->ready flag shouldn't be modified by HW backend (Christian)
V3:
- Don't modify sched/job-submission state from HW backend (Christian)
Sathishkumar S (6):
drm/a
On Wed, Feb 12, 2025 at 03:22:45PM +0800, Huacai Chen wrote:
> > The new series now has 7 patches:
> >
> > Tiezhu Yang (7):
> > objtool: Handle various symbol types of rodata
> > objtool: Handle different entry size of rodata
> > objtool: Handle PC relative relocation type
> > objtool/Loong
[AMD Official Use Only - AMD Internal Distribution Only]
Reviewed-by: Hawking Zhang
When a dynamic GECC platform is detected and default mem ECC is disabled, let's
add a kernel message to remind users to explicitly set amdgpu_ras_enable=1 before
loading the driver to enable GECC if needed.
Regards,
Hawk
On Wed, Jan 29, 2025 at 7:12 PM Philip Yang wrote:
>
> To work around the queue full h/w issue on Gfx7/8, when an application creates
> AQL queues, the ring buffer bo is allocated with size queue_size/2 and
> mapped to the GPU twice using 2 attachments with the same ring_bo backing
> memory.
>
> For this case, user queu
On 2025-02-12 17:42, Uwe Kleine-König
wrote:
#regzbot introduced: 68e599db7a549f010a329515f3508d8a8c3467a4
#regzbot monitor: https://bugs.debian.org/1093124
Hello,
On Thu, Jul 18, 2024 at 05:05:53PM -0400, Philip Yang wrote:
Find user queue
ping...
On 2025-01-29 19:04, Philip Yang wrote:
To work around the queue full h/w issue on Gfx7/8, when an application creates
AQL queues, the ring buffer bo is allocated with size queue_size/2 and
mapped to the GPU twice using 2 attachments with the same ring_bo backing
memory.
For this
Currently, grace period (SCH_WAVE) is set only for gfx943 APU. This
could change as other wait times also need to be set. Move ASIC
specific settings to an ASIC specific function.
Signed-off-by: Harish Kasiviswanathan
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 28 ---
.../
Rename .set_grace_period() to .set_compute_queue_wait_counts(). The
function not only sets grace_period but also sets other compute queue
wait times. Up until now only grace_period was set/updated, however
other wait times also need to be set/updated. Change function name to reflect
this.
No functional
Set more optimized queue retry timeout for gfx9 family starting with
arcturus.
Signed-off-by: Harish Kasiviswanathan
---
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 7 ++
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.h| 1 +
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 8 +-
...
build_grace_period_packet_info is an ASIC helper function that fetches the
correct format. It is the responsibility of the caller to validate the
value.
Signed-off-by: Harish Kasiviswanathan
---
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 18 +--
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd
Return an error if the IP version doesn't match otherwise
we end up passing a NULL string to amdgpu_ucode_request.
We should never hit this in practice today since we only
enable the umsch code on the supported IP versions, but
add a check to be safe.
Reported-by: kernel test robot
Closes:
https
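The defensive pattern described here can be sketched as below: the lookup reports an error for an unknown version so that a NULL name never reaches the loader. This is an illustration only; `toy_pick_fw_name`, the version numbers and the file name are made up and do not reflect the real firmware table.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical lookup: map a (major, minor) IP version to a firmware name.
 * Unknown versions return -EINVAL instead of leaving *name NULL for the
 * caller to pass on.
 */
static int toy_pick_fw_name(unsigned int major, unsigned int minor,
			    const char **name)
{
	assert(name);

	if (major == 4 && minor == 0) {
		*name = "toy/fw_4_0.bin";	/* made-up file name */
		return 0;
	}

	*name = NULL;
	return -EINVAL;
}
```

A caller would check the return value before handing the name to whatever loads the firmware.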
Needed to be properly picked up for the initrd, etc.
Signed-off-by: Alex Deucher
Cc: Lang Yu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c
index
Make the constant parts of the name part of the string
we pass to amdgpu_ucode_request(). Only the version
number varies from IP to IP.
Signed-off-by: Alex Deucher
Cc: Lang Yu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git
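A small sketch of the idea, assuming a printf-style interface: the directory, prefix and suffix live in the format string, and only the version is passed as an argument. The helper `toy_fw_path` and the path layout are hypothetical; the real `amdgpu_ucode_request()` signature is not shown in this preview.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: the constant parts of the path are in the format
 * string, so callers only supply the part that varies per IP version.
 * Returns 0 on success, -1 if the result did not fit in buf.
 */
static int toy_fw_path(char *buf, size_t len, const char *version)
{
	int n;

	assert(buf && version && strlen(version) > 0);
	n = snprintf(buf, len, "toy/fw_%s.bin", version);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```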
On Tue, Feb 11, 2025 at 12:22 AM jesse.zh...@amd.com
wrote:
>
> From: "jesse.zh...@amd.com"
>
> This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
> The implementation includes the following key changes:
>
> 1. Added `amdgpu_sdma_reset_queue`:
>- Resets a specific S
On Tue, Feb 11, 2025 at 9:42 AM jesse.zh...@amd.com wrote:
>
> From: "jesse.zh...@amd.com"
>
> This commit introduces several improvements to the SDMA reset logic:
>
> 1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read
> pointer
>before a reset, ensuring proper state res
On Tue, Feb 11, 2025 at 12:22 AM jesse.zh...@amd.com
wrote:
>
> From: "jesse.zh...@amd.com"
>
> This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
> function to differentiate
> between reset requests originating from the KGD and KFD.
> This change ensures proper synchron
On Tue, Feb 11, 2025 at 12:22 AM jesse.zh...@amd.com
wrote:
>
> From: "jesse.zh...@amd.com"
>
> This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
> to improve modularity and support shared usage between AMDGPU and KFD. The
> changes include:
>
> 1. **Refactored SDMA Re
As far as the number of XCCs, the number of compute partitions, and the
number of memory partitions qualify, CPX is valid.
Change-Id: I65696f25e2afd75f2f4a177dabc0991b15293d9a
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 5 -
1 file changed, 4 insertions(+), 1 de
On Tue, Feb 11, 2025 at 4:02 AM jesse.zh...@amd.com wrote:
>
> This patch updates the sdma engine to support scheduling for
> the page queue. The main changes include:
>
> - Introduce a new variable `page` to handle the page queue if it exists.
> - Update the scheduling logic to conditionally set
On Tue, Feb 11, 2025 at 11:42 PM Candice Li wrote:
>
> Enable GECC only when the default memory ECC mode or
> the module parameter amdgpu_ras_enable is activated.
>
> Signed-off-by: Candice Li
Acked-by: Alex Deucher
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
> .../gpu/drm/a
On Wed, Feb 12, 2025 at 08:57:10AM +0200, Raag Jadav wrote:
> On Tue, Feb 04, 2025 at 12:35:23PM +0530, Raag Jadav wrote:
> > This series introduces device wedged event in DRM subsystem and uses it
> > in xe, i915 and amdgpu drivers. Detailed description in commit message.
> >
> > This was earlier
[Public]
Greetings, sending peace.
Still seeking a review on this. These changes are simple and should not take
much of your time, so please send me a "Reviewed By:" or your comments.
One love!
> -----Original Message-----
> From: amd-gfx On Behalf Of Martin,
> Andrew
> Sent: Monday, Decembe
[Public]
Greetings, sending peace.
Still seeking a review on this. These changes are simple and should not take
much of your time, so please send me a "Reviewed By:" or your comments.
@Russell, Kent, for the record Coverity is only flagging variable as
Uninitialized if and only if that are se
On Wed, Feb 12, 2025 at 10:22 AM Alex Deucher wrote:
>
> VCN 2.5 doesn't support powergating so there is
> no need to call these.
>
> Signed-off-by: Alex Deucher
Dropping this one. VCN 2.5 doesn't support powergating, but this
function gets used for other stuff in the smu code.
Alex
> ---
>
From: "chr[]"
resume and irq handler happily race in set_power_state()
* amdgpu_legacy_dpm_compute_clocks() needs lock
* protect irq work handler
* fix dpm_enabled usage
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2524
Fixes: 3712e7a49459 ("drm/amd/pm: unified lock protections in a
Deprecate KFD XGMI peer info calls in favour of calling directly from
simplified XGMI peer info functions.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 42 --
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 5 ---
drivers/gpu/drm/amd/amdgpu/amdgpu_xg
On 2/12/2025 8:45 PM, Alex Deucher wrote:
> VCN 4.0.3 doesn't support powergating so there is
> no need to call these.
>
> Signed-off-by: Alex Deucher
Patches 1, 2 & 4
Reviewed-by: Lijo Lazar
Patch 3
Acked-by: Lijo Lazar
Thanks,
Lijo
> ---
> drivers/gpu/drm/amd/amdgpu/vc
VCN 4.0.3 doesn't support powergating so there is
no need to call these.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 10 --
1 file changed, 10 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index
The VCN and UVD helpers were split in
commit ff69bba05f08 ("drm/amd/pm: add inst to dpm_set_powergating_by_smu")
However, this happened in parallel to the vcn 5.0.1
development so it was missed there.
Fixes: 346492f30ce3 ("drm/amdgpu: Add VCN_5_0_1 support")
Signed-off-by: Alex Deucher
Cc: Sonny
VCN 2.5 doesn't support powergating so there is
no need to call these.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
index b9be304aa294b
VCN 5.0.1 doesn't support powergating so there is
no need to call these.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c | 10 --
1 file changed, 10 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c
b/drivers/gpu/drm/amd/amdgpu/vcn_v5_0_1.c
index
Acked-by: Alex Deucher for the series.
On Thu, Feb 6, 2025 at 5:37 PM Harry Wentland wrote:
>
>
>
> On 2025-01-27 14:59, André Almeida wrote:
> > amdgpu can handle async flips on overlay planes, so allow it for atomic
> > async checks.
> >
> > Signed-off-by: André Almeida
>
> Reviewed-by: Harry
Xinhui's email is no longer valid.
Signed-off-by: Alex Deucher
---
MAINTAINERS | 1 -
1 file changed, 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index c8b35ca294a02..d39b272a6a751 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -19213,7 +19213,6 @@ F: drivers/net/wireless/quantenna
On 2025-02-12 03:54, Deng, Emily wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
Ping……
Emily Deng
Best Wishes
On Wed, 2025-02-12 at 09:32 +, Tvrtko Ursulin wrote:
>
> On 12/02/2025 09:02, Philipp Stanner wrote:
> > On Fri, 2025-02-07 at 14:50 +, Tvrtko Ursulin wrote:
> > > Idea is to add helpers for peeking and popping jobs from entities
> > > with
> > > the goal of decoupling the hidden assumptio
On Tue, 2025-02-11 at 12:14 +0100, Philipp Stanner wrote:
> drm_sched_init() has a great many parameters and upcoming new
> functionality for the scheduler might add even more. Generally, the
> great number of parameters reduces readability and has already caused
> one misnaming, addressed in:
>
On Fri, 2025-02-07 at 14:50 +, Tvrtko Ursulin wrote:
> Idea is to add helpers for peeking and popping jobs from entities
> with
> the goal of decoupling the hidden assumption in the code that
> queue_node
> is the first element in struct drm_sched_job.
>
> That assumption usually comes in the
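For readers unfamiliar with the hidden assumption mentioned above, a minimal user-space sketch: casting a node pointer straight to the job pointer only works when the node is the first member, whereas a `container_of`-style conversion works wherever the member sits. All names here (`toy_node`, `toy_job`, `toy_container_of`, `toy_node_to_job`) are invented for illustration and are not the helpers the series adds.

```c
#include <assert.h>
#include <stddef.h>

struct toy_node {
	struct toy_node *next;
};

struct toy_job {
	int id;				/* deliberately placed before the node */
	struct toy_node queue_node;
};

/* The node is intentionally not the first member in this toy layout. */
static_assert(offsetof(struct toy_job, queue_node) != 0,
	      "queue_node must not be first for this demonstration");

/* User-space equivalent of the kernel's container_of() idea. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Converts a node back to its job without assuming the member offset. */
static struct toy_job *toy_node_to_job(struct toy_node *node)
{
	return node ? toy_container_of(node, struct toy_job, queue_node) : NULL;
}
```

A plain cast such as `(struct toy_job *)node` would return the wrong address for this layout, which is the kind of breakage the helpers are meant to rule out.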
://download.01.org/0day-ci/archive/20250212/202502121000.ebcedoo9-...@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project
ab51eccf88f5321e7c60591c5546b254b6afab99)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit
On 12/02/2025 09:02, Philipp Stanner wrote:
On Fri, 2025-02-07 at 14:50 +, Tvrtko Ursulin wrote:
Idea is to add helpers for peeking and popping jobs from entities
with
the goal of decoupling the hidden assumption in the code that
queue_node
is the first element in struct drm_sched_job.
Th
btw. I still believe that it would be helpful (and congruent with the
established norm) to have the version in all patch titles. I do use
threaded view, but inboxes are huge, and everything that helps you
orient yourself is welcome
On Fri, 2025-02-07 at 14:51 +, Tvrtko Ursulin wrote:
> Helper
> … This patch changes its return type …
See also:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.14-rc2#n94
> Signed-off-by: Wentao Liang
How good does such an email address fit to the Developer's Certificate of
Origi
On Wed, 2025-02-12 at 12:30 +, Tvrtko Ursulin wrote:
>
> On 12/02/2025 10:40, Philipp Stanner wrote:
> > On Wed, 2025-02-12 at 09:32 +, Tvrtko Ursulin wrote:
> > >
> > > On 12/02/2025 09:02, Philipp Stanner wrote:
> > > > On Fri, 2025-02-07 at 14:50 +, Tvrtko Ursulin wrote:
> > > > >
On 12/02/2025 10:40, Philipp Stanner wrote:
On Wed, 2025-02-12 at 09:32 +, Tvrtko Ursulin wrote:
On 12/02/2025 09:02, Philipp Stanner wrote:
On Fri, 2025-02-07 at 14:50 +, Tvrtko Ursulin wrote:
Idea is to add helpers for peeking and popping jobs from entities
with
the goal of decoup
On 12/02/2025 09:02, Philipp Stanner wrote:
btw. I still believe that it would be helpful (and congruent with the
established norm) to have the version in all patch titles. I do use
threaded view, but inboxes are huge, and everything that helps you
orient yourself is welcome
On Fri, 2025-02-07
Am 12.02.25 um 12:30 schrieb Le Ma:
On systems with CONFIG_SLUB_DEBUG enabled, a memleak like the one below
will show up explicitly during driver unloading if a bo was created without
a drm_timeline object before.
BUG drm_sched_fence (Tainted: G OE ): Objects remaining in
drm_sched_fence o
On systems with CONFIG_SLUB_DEBUG enabled, a memleak like the one below
will show up explicitly during driver unloading if a bo was created without
a drm_timeline object before.
BUG drm_sched_fence (Tainted: G OE ): Objects remaining in
drm_sched_fence on __kmem_cache_shutdown()
[AMD Official Use Only - AMD Internal Distribution Only]
Ping……
Emily Deng
Best Wishes
From: Deng, Emily
Sent: Tuesday, February 11, 2025 8:21 PM
To: Deng, Emily ; Yang, Philip ; Chen,
Xiaogang ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdkfd: Fix the deadlock in svm_range_restor
current test is more intrusive for user queue test
Signed-off-by: Saleemkhan Jamadar
Suggested-by: Christian Koenig
Reviewed-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_umsch_mm.c | 459 +--
1 file changed, 1 insertion(+), 458 deletions(-)
diff --git a/drivers/gp
The function amdgpu_ras_error_data_init() always returns 0, making its
return value checks redundant. This patch changes its return type to
void and removes all unnecessary checks in the callers.
This simplifies the code and avoids confusion about the function's
behavior. Additionally, this change
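The shape of such a cleanup, sketched with made-up names: once an init function that cannot fail returns void, callers no longer need an error branch that could never be taken. `toy_err_data`, `toy_err_data_init` and `toy_query_errors` are hypothetical and only mirror the pattern described in the preview.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical error-counter struct for illustration. */
struct toy_err_data {
	unsigned int ce_count;
	unsigned int ue_count;
};

/* Cannot fail, so it returns void rather than a constant 0. */
static void toy_err_data_init(struct toy_err_data *data)
{
	assert(data);
	memset(data, 0, sizeof(*data));
}

static unsigned int toy_query_errors(void)
{
	struct toy_err_data data;

	toy_err_data_init(&data);	/* no "if (ret) return ret;" needed */
	data.ce_count += 1;		/* pretend one correctable error was seen */
	return data.ce_count + data.ue_count;
}
```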
On Tue, Feb 04, 2025 at 12:35:23PM +0530, Raag Jadav wrote:
> This series introduces device wedged event in DRM subsystem and uses it
> in xe, i915 and amdgpu drivers. Detailed description in commit message.
>
> This was earlier attempted as xe specific uevent in v1 and v2 on [1].
> Similar work b