From: "jesse.zh...@amd.com"
- Added `sdma_v4_4_2_update_reset_mask` function to update the reset mask.
- Move the sysfs reset mask update to the `late_init` stage to ensure that SMU
initialization and capability setup are complete before checking the SDMA reset
capability.
- For IP versio
From: "jesse.zh...@amd.com"
- Added `sdma_v4_4_2_update_reset_mask` function to update the reset mask.
- Move the sysfs reset mask update to the `late_init` stage to ensure that SMU
initialization and capability setup are complete before checking the SDMA reset
capability.
- For IP versio
From: "jesse.zh...@amd.com"
This patch introduces a new function to check if the SMU supports resetting the
SDMA engine. This capability check ensures that the driver does not attempt to
reset the SDMA engine on hardware that does not support it.
The following changes are included:
- New funct
From: "jesse.zh...@amd.com"
- Introduce a new function `sdma_v4_4_2_init_sysfs_reset_mask` to initialize
the sysfs reset mask for SDMA.
- Move the initialization of the sysfs reset mask to the `late_init` stage to
ensure that the SMU initialization and capability setup are completed befor
From: "jesse.zh...@amd.com"
- Modify the `sdma_v4_4_2_sw_init` function to conditionally enable per-queue
reset support.
- For IP versions 9.4.3 and 9.4.4, enable per-queue reset if the MEC firmware
version is at least 0xb0 and PMFW supports queue reset.
- Add a TODO comment for future support
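
The gating condition described in this entry can be sketched as a small predicate. This is only an illustration under stated assumptions, not the driver's code: the helper name `sdma_per_queue_reset_supported` and the macro are hypothetical, the 0xb0 threshold is taken from the description above, and the IP version check (9.4.3 / 9.4.4) is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold quoted in the patch description: MEC firmware version 0xb0.
 * The macro name is hypothetical. */
#define MEC_FW_MIN_VERSION_FOR_QUEUE_RESET 0xb0u

/* Hypothetical helper: per-queue reset is enabled only when the MEC firmware
 * is new enough AND the PMFW reports queue reset support. */
static bool sdma_per_queue_reset_supported(uint32_t mec_fw_version,
                                           bool pmfw_supports_queue_reset)
{
    return mec_fw_version >= MEC_FW_MIN_VERSION_FOR_QUEUE_RESET &&
           pmfw_supports_queue_reset;
}
```

Keeping both conditions in one predicate means a firmware that is new enough but lacks PMFW support (or the reverse) leaves per-queue reset disabled.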
From: "jesse.zh...@amd.com"
This patch introduces a new function to check if the SMU supports resetting the
SDMA engine. This capability check ensures that the driver does not attempt to
reset the SDMA engine on hardware that does not support it.
The following changes are included:
- New funct
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead
of allocating a separate engine. This change ensures efficient resource
management and avoids the is
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 132 to
148.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1 insertion(+), 1
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead
of allocating a separate engine. This change ensures efficient resource
management and avoids the is
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 132 to
148.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1 insertion(+), 1
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 132 to
148.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1 insertion(+), 1
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead
of allocating a separate engine. This change ensures efficient resource
management and avoids the is
From: "jesse.zh...@amd.com"
This patch updates the SDMA v4.4.2 software initialization to enable per-queue
reset support when the MEC firmware version is 0xb0 or higher and the PMFW
supports SDMA reset.
The following changes are included:
- Added a condition to check if the MEC firmware version
From: "jesse.zh...@amd.com"
This patch introduces a new function to check if the SMU supports resetting the
SDMA engine. This capability check ensures that the driver does not attempt to
reset the SDMA engine on hardware that does not support it.
The following changes are included:
- New funct
From: "jesse.zh...@amd.com"
This patch adds a reset function pointer to the SDMA v4.4.2 page ring
functionality. The new function pointer `reset` is set to
`sdma_v4_4_2_reset_queue`, which is responsible for resetting the SDMA queue.
Changes:
- Add `reset` function pointer to `sdma_v4_4_2_page_r
From: "jesse.zh...@amd.com"
This patch updates the SDMA scheduler mask handling to include the page queue
if it exists. The scheduler mask is calculated based on the number of SDMA
instances and the presence of the page queue. The mask is updated to reflect
the state of both the SDMA gfx ring and
From: "jesse.zh...@amd.com"
This patch includes the remaining improvements to the SDMA reset logic:
- Added `gfx_guilty` and `page_guilty` flags to track guilty queues.
- Updated the reset and resume functions to handle the guilty state.
- Cached the `rptr` before reset.
v2:
1. replace the cal
From: "jesse.zh...@amd.com"
This patch introduces the `is_guilty` callbacks for the GFX and PAGE rings.
These callbacks check if a ring is guilty of causing a timeout or error.
Suggested-by: Alex Deucher
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 30
From: "jesse.zh...@amd.com"
This patch updates the `amdgpu_job_timedout` function to check if
the ring is actually guilty of causing the timeout. If not, it
skips error handling and fence completion.
Suggested-by: Alex Deucher
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_j
From: "jesse.zh...@amd.com"
This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
function to differentiate between reset requests originating from the KGD and
KFD. This change ensures proper synchronization between KGD and KFD during SDMA
resets.
If the caller is KFD, th
From: "jesse.zh...@amd.com"
This patch introduces the following changes:
- Add `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset.
- Add `is_guilty` callback to the `amdgpu_ring_funcs` structure to check if a
ring is guilty of causing a timeout.
Suggested-by:
From: "jesse.zh...@amd.com"
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:
1. **Refactored SDMA Reset Logic**:
- Split the `sdma_v4_4_2_reset_queue` function into two sep
From: "jesse.zh...@amd.com"
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:
1. Added `amdgpu_sdma_reset_queue`:
- Resets a specific SDMA queue by instance ID.
- Invokes registered pre-reset and post-reset
From: "jesse.zh...@amd.com"
This commit introduces several improvements to the SDMA reset logic:
1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset, ensuring proper state restoration after reset.
2. Introduced `gfx_guilty` and `page_guilty` flags i
From: "jesse.zh...@amd.com"
This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
function to differentiate between reset requests originating from the KGD and
KFD. This change ensures proper synchronization between KGD and KFD during SDMA
resets.
If the caller is KFD, th
From: "jesse.zh...@amd.com"
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:
1. **Refactored SDMA Reset Logic**:
- Split the `sdma_v4_4_2_reset_queue` function into two sep
From: "jesse.zh...@amd.com"
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:
1. Added `amdgpu_sdma_reset_queue`:
- Resets a specific SDMA queue by instance ID.
- Invokes registered pre-reset and post-reset
From: "jesse.zh...@amd.com"
Add ring ID scheduling to switch the scheduling policy at CS submission.
A specific ring is scheduled by setting the ring ID.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 9 +++--
drivers/gpu/drm/amd/amd
From: "jesse.zh...@amd.com"
Add ring ID scheduling.
In some cases, userspace needs to run a job on a specific ring
instead of having the best ring selected based on the ring score.
For example, the user wants to run a bad job on a specific ring to check
whether the ring can recover from a queu
From: "jesse.zh...@amd.com"
To avoid memory leaks, release q_extra_data when exiting the restore queue.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager
From: Jesse Zhang
Initialize size before calling amdgpu_vce_cs_reloc, for example in case 0x0301.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
b/drivers/gpu/d
From: Jesse Zhang
Initialize new_state.jpeg before it is used.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 677eb141554e..
From: Jesse Zhang
The parameter "last_jump_jiffies" should be initialized before being used in
the function atom_op_jump.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/atom.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c
b/drivers/gpu/drm/a
From: Jesse Zhang
Check that the ring is not a MES queue before freeing the wb entry.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 3 ++-
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 3 ++-
3 files changed, 6 insertions(+), 3 deletions(-)
From: Jesse Zhang
Remove the unused parameter in the functions
ttm_bo_bounce_temp_buffer and ttm_bo_add_move_fence.
V2: rebase the patch on top of drm-misc-next (Christian)
Signed-off-by: Jesse Zhang
Reviewed-by: Christian König
---
drivers/gpu/drm/ttm/ttm_bo.c | 8 +++-
1 file changed,
From: Jesse Zhang
Remove the unused function amdgpu_vm_pt_is_root_clean
and remove the impossible condition.
v1: entries == 0 is not possible any more,
so this condition could probably be removed (Felix)
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 -
driv
From: Jesse Zhang
[ 3810.410040] UBSAN: shift-out-of-bounds in
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_int_process_v10.c:345:5
[ 3810.410044] shift exponent 4294967295 is too large for 64-bit type 'long
long unsigned int'
[ 3810.410047] CPU: 6 PID: 331 Comm: kworker/6:1H Not tainted 6.5.0+ #50
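
The exponent 4294967295 in the report above is the value of an unsigned 32-bit -1, which is far outside the valid range for a 64-bit shift. A generic way to guard such a shift is sketched below. It is not the fix from this patch, and the helper name `checked_bit64` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build a single-bit 64-bit mask only when the bit index
 * is in range. On an out-of-range index it returns false and leaves *mask
 * untouched instead of performing an undefined shift. */
static bool checked_bit64(uint32_t bit, uint64_t *mask)
{
    if (bit >= 64)
        return false;
    *mask = UINT64_C(1) << bit;
    return true;
}
```

A caller would check the return value and skip the operation when the index is invalid.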
From: "Jesse.Zhang"
Fix the issue:
"amdgpu: Failed to create process VM object".
[Why] When amdgpu is initialized, seq64 does its mapping and updates the BO
mapping in the VM page table.
But when clinfo runs, it also initializes a VM for a process device through the
function kfd_process_devic
38 matches