Initialize the queue type before resetting the queue using mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index f7d5d4f08a53..10b61
Variable hub_inst is unused.
Related to commit "bde7ae79ca40":
"drm/amdkfd: Drop poison hanlding from gfx v10"
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 13 -
1 file changed, 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_pro
To avoid memory leaks, release q_extra_data when exiting the restore queue.
v2: Correct the proto (Alex)
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.
Not all ASICs support the queue reset feature.
Therefore, userspace can query this feature
via AMDGPU_INFO_QUEUE_RESET before validating a queue reset.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 27 +
include/uapi/drm/amdgpu_drm.h |
compute/gfx may have multiple rings on some hardware.
In some cases, userspace wants to run jobs on a specific ring for validation
purposes.
This debugfs entry helps to disable or enable submitting jobs to a specific
ring.
This entry is populated only if there are at least two cores in th
Userspace wants to run jobs on a specific sdma ring for verification purposes.
This debugfs entry helps to disable or enable submitting jobs to a specific
ring.
This entry is populated only if there are at least two cores in the
sdma IP.
Signed-off-by: Jesse Zhang
Suggested-by: Alex Deuc
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
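As a rough illustration of how userspace might consume the text string, here is a minimal sketch. It assumes the node returns whitespace-separated reset names; the exact output format, the helper name `reset_mask_has`, and any token names passed to it (such as "queue") are assumptions for illustration, not something this snippet confirms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `token` appears as a whole whitespace-separated word in
 * `text` (the contents read from a *_reset_mask node).
 * Hypothetical helper: the word-list format is an assumption.
 */
static bool reset_mask_has(const char *text, const char *token)
{
	size_t len;

	assert(text != NULL && token != NULL);
	len = strlen(token);
	if (len == 0)
		return false;

	while (*text) {
		/* Skip separators, then measure the next word. */
		text += strspn(text, " \t\n");
		size_t word = strcspn(text, " \t\n");

		/* Match only on whole words, not on prefixes. */
		if (word == len && strncmp(text, token, len) == 0)
			return true;
		text += word;
	}
	return false;
}
```

A caller would read the node into a buffer and test for the reset type it needs before attempting a validation run.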
Sign
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
Signed
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
Signed-o
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
Signed
From: "jesse.zh...@amd.com"
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggest
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggested-by: Alex Deucher
---
drivers/gpu/drm/amd/am
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggested-by: Alex De
From: "jesse.zh...@amd.com"
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggested-by: Al
From: "jesse.zh...@amd.com"
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggest
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead
From: "jesse.zh...@amd.com"
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2:
From: "jesse.zh...@amd.com"
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
V2: the
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
v3: add
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
v3: a
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
v3: add
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead
[ 90.441868] ------------[ cut here ]------------
[ 90.441873] kernel BUG at mm/slub.c:553!
[ 90.441885] Oops: invalid opcode: [#1] PREEMPT SMP NOPTI
[ 90.441892] CPU: 0 PID: 1523 Comm: amd_pci_unplug Tainted: GE
6.10.0+ #47
[ 90.441900] Hardware name: AMD Splinter/
From: "jesse.zh...@amd.com"
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
From: "jesse.zh...@amd.com"
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
From: "jesse.zh...@amd.com"
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text stri
sysfs: cannot create duplicate filename
'/devices/pci:00/:00:01.1/:01:00.0/:02:00.0/:03:00.0/vcn_reset_mask'
[ 562.443738] CPU: 13 PID: 4888 Comm: modprobe Tainted: GE
6.10.0+ #51
[ 562.443740] Hardware name: AMD Splinter/Splinter-RPL, BIOS VS2683299N.FD
05
From: "jesse.zh...@amd.com"
sysfs: cannot create duplicate filename
'/devices/pci:00/:00:01.1/:01:00.0/:02:00.0/:03:00.0/vcn_reset_mask'
[ 562.443738] CPU: 13 PID: 4888 Comm: modprobe Tainted: GE
6.10.0+ #51
[ 562.443740] Hardwar
[ 2875.870277] [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP block
failed -22
[ 2875.880494] amdgpu :01:00.0: amdgpu: amdgpu_device_ip_init failed
[ 2875.887689] amdgpu :01:00.0: amdgpu: Fatal error during GPU init
[ 2875.894791] amdgpu :01:00.0: amdgpu: amdgpu: finishing de
For multiple vcn instances, to avoid creating the reset sysfs node multiple times,
add the instance parameter to the reset mask init.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 8
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h | 4 ++--
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
From: "jesse.zh...@amd.com"
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
Fix similar warning when running IGT:
[ 155.585721] kernfs: can not remove 'enforce_isolation', no directory
[ 155.592201] WARNING: CPU: 3 PID: 6960 at fs/kernfs/dir.c:1683
kernfs_remove_by_name_ns+0xb9/0xc0
[ 155.601145] Modules linked in: xt_MASQUERADE xt_comment nft_compat veth
bridge stp
Fix the similar warning:
[ 155.585721] kernfs: can not remove 'enforce_isolation', no directory
[ 155.592201] WARNING: CPU: 3 PID: 6960 at fs/kernfs/dir.c:1683
kernfs_remove_by_name_ns+0xb9/0xc0
[ 155.601145] Modules linked in: xt_MASQUERADE xt_comment nft_compat veth
bridge stp llc overlay n
From: "jesse.zh...@amd.com"
This reverts commit 10aec8943bcc5123288ded8c97e78312bcf17fb1.
The dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit and the device is not actually unplugged,
which causes a new issue.
Signed-off-by: Jesse Zhang
---
dr
Fix the similar warning when hotplugging:
[ 155.585721] kernfs: can not remove 'enforce_isolation', no directory
[ 155.592201] WARNING: CPU: 3 PID: 6960 at fs/kernfs/dir.c:1683
kernfs_remove_by_name_ns+0xb9/0xc0
[ 155.601145] Modules linked in: xt_MASQUERADE xt_comment nft_compat veth
bridge
Replace the drm_dev_enter check with a check of the sysfs directory entry,
because the dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit and the device is not actually unplugged.
Signed-off-by: Jesse Zhang
Reported-by: Andy Dong
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |
Replace the drm_dev_enter check with a check of the sysfs directory entry,
because the dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit and the device is not actually unplugged.
Signed-off-by: Jesse Zhang
Reported-by: Andy Dong
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |
From: "jesse.zh...@amd.com"
This reverts commit 330d97e9b14e0c85cc8b63e0092e4abcb9ce99c8.
The dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit and the device is not actually unplugged,
which causes a new issue.
Signed-off-by: Jesse Zhang
---
driver
When using MES, creating a pdd requires talking to the GPU to
set up the relevant context. The code here forgot to wake up the GPU
in case it was in suspend; this causes KVM to EFAULT for a passthrough
GPU, for example. This issue can be masked if the GPU was woken up by
other things (e.g. opening t
replace MES kgq reset with MMIO.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
index 69941442f00b..ba2ab9296eb4 100644
Enable the kgq and kcq queue reset flag.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 10 +-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
index 3aa34c4d..6994144
From: "jesse.zh...@amd.com"
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:
1. Added `amdgpu_sdma_reset_queue`:
- Resets a specific SDMA queue by instance ID.
- Invokes registered pre-reset and
From: "jesse.zh...@amd.com"
This commit introduces several improvements to the SDMA reset logic:
1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset, ensuring proper state restoration after reset.
2. Introduced `gfx_guilty` and `page_gui
From: "jesse.zh...@amd.com"
This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
function to differentiate
between reset requests originating from the KGD and KFD.
This change ensures proper synchronization between KGD and KFD during SDMA
resets.
If the cal
From: "jesse.zh...@amd.com"
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:
1. **Refactored SDMA Reset Logic**:
- Split the `sdma_v4_4_2_reset_queue` functio
From: "jesse.zh...@amd.com"
This commit introduces several improvements to the SDMA reset logic:
1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset, ensuring proper state restoration after reset.
2. Introduced `gfx_guilty` and `page_gui
From: "jesse.zh...@amd.com"
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:
1. **Refactored SDMA Reset Logic**:
- Split the `sdma_v4_4_2_reset_queue` functio
From: "jesse.zh...@amd.com"
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:
1. Added `amdgpu_sdma_reset_queue`:
- Resets a specific SDMA queue by instance ID.
- Invokes registered pre-reset and
From: "jesse.zh...@amd.com"
This commit introduces several improvements to the SDMA reset logic:
1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset, ensuring proper state restoration after reset.
2. Introduced `gfx_guilty` and `page_gui
From: "jesse.zh...@amd.com"
This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
function to differentiate
between reset requests originating from the KGD and KFD.
This change ensures proper synchronization between KGD and KFD during SDMA
resets.
If the cal
This patch updates the sdma engine to support scheduling for
the page queue. The main changes include:
- Introduce a new variable `page` to handle the page queue if it exists.
- Update the scheduling logic to conditionally set the `sched.ready` flag for
both the sdma gfx queue and the page queue
Extract the per-instance sdma resume sequence from sdma_v7_0_gfx_resume.
This function can be used in start or restart scenarios of specific instances.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 259 ++---
1 file changed, 141 insertions(+), 1
Implement sdma queue reset callback by mes_reset_queue_mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 26 ++
1 file changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
inde
Reset gfx/compute queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h | 2 +
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 88 +-
2 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/a
Reset sdma queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 46 ++
1 file changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
index
sdma v7 queue reset is already supported via mmio; add its sysfs file.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
index 62
Replace kcq queue reset with existing function amdgpu_mes_reset_legacy_queue.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 18 +-
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/drivers/gpu/drm/am
Extract the per-instance sdma resume sequence from sdma_v7_0_gfx_resume.
This function can be used in start or restart scenarios of specific instances.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 259 ++---
1 file changed, 141 insertions(+), 1
Reset gfx/compute queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h | 2 +
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 88 +-
2 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/a
Reset sdma queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 46 ++
1 file changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
index
Replace kcq queue reset with existing function amdgpu_mes_reset_legacy_queue.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 22 +++---
1 file changed, 3 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
b/drivers/gpu/dr
Implement sdma queue reset callback by mes_reset_queue_mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 26 ++
1 file changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
inde
sdma v7 queue reset is already supported via mmio; add its sysfs file.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
index 62
add the PPSMC_MSG_ResetSDMA2 definition for smu 13.0.6
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 1 +
drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 ++-
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 1 +
3 fi
From: "jesse.zh...@amd.com"
Remove apu check in sdma queue reset.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
b/drivers/gpu/drm/amd/amdgpu/sdm
Implement sdma queue reset by SMU_MSG_ResetSDMA2
Suggested-by: Tim Huang
Signed-off-by: Jesse Zhang
---
.../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 30 ++-
1 file changed, 22 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
b/
Initialize the process context address before setting the shader debugger.
[ 260.781212] amdgpu :03:00.0: amdgpu: [gfxhub] page fault (src_id:0
ring:32 vmid:0 pasid:0)
[ 260.781236] amdgpu :03:00.0: amdgpu: in page starting at address
0x from client 10
[ 260.781255]
From: "jesse.zh...@amd.com"
Remove apu check in sdma queue reset.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
b/drivers/gpu/drm/amd/amdgpu/sdm
From: "jesse.zh...@amd.com"
add the definition PPSMC_MSG_ResetSDMA2.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 1 +
drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 ++-
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13
From: "jesse.zh...@amd.com"
Implement sdma queue reset by SMU_MSG_ResetSDMA2.
Signed-off-by: Jesse Zhang
---
.../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 28 ++-
1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/s
implement gfx10 kgq reset via mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 98 ++
1 file changed, 70 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 89409c
implement gfx10 kcq reset via mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 121 ++---
1 file changed, 88 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 88393c
From: "jesse.zh...@amd.com"
Use mmio to do queue reset.
v2: Align this function with gfx9/gfx9.4.3.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 34 ++
1 file changed, 34 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
From: "jesse.zh...@amd.com"
Use mmio to do queue reset.
v2: Align this function with gfx9/gfx9.4.3.
Signed-off-by: Jesse Zhang adev;
unsigned i;
+ uint32_t tmp;
/* enter save mode */
amdgpu_gfx_rlc_enter_safe_mode(adev, xcc_id);
@@ -3813,7 +3814,25
From: "jesse.zh...@amd.com"
pmfw now unifies PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 1 -
drivers/gpu/drm/amd/pm/swsmu/inc/s
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
Signed-off-by: Jesse Zhang
---
.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 14 +---
From: "jesse.zh...@amd.com"
pmfw unified PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.
Signed-off-by: Jesse Zhang
---
.../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 30 +--
1 file changed, 8 insertions(+), 22 deletion
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
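A minimal sketch of the table-driven idea described above, assuming each SMU program maps to one minimum firmware version. The structure, the function name `demo_check_reset_sdma_fw`, and every program ID and version value below are hypothetical placeholders, not the values used by `smu_v13_0_6_reset_sdma`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of an SMU program with its minimum firmware version. */
struct demo_fw_threshold {
	uint32_t program;
	uint32_t min_version;
};

/* Made-up example values, for illustration only. */
static const struct demo_fw_threshold demo_thresholds[] = {
	{ .program = 0, .min_version = 0x100 },
	{ .program = 4, .min_version = 0x200 },
};
#define DEMO_THRESHOLD_COUNT \
	(sizeof(demo_thresholds) / sizeof(demo_thresholds[0]))

/*
 * Return 0 if fw_version meets the threshold for `program`,
 * -EOPNOTSUPP if the firmware is too old or the program is not listed.
 */
static int demo_check_reset_sdma_fw(const struct demo_fw_threshold *tbl,
				    size_t count, uint32_t program,
				    uint32_t fw_version)
{
	assert(tbl != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (tbl[i].program != program)
			continue;
		return fw_version >= tbl[i].min_version ? 0 : -EOPNOTSUPP;
	}
	return -EOPNOTSUPP;
}
```

Keeping the thresholds in one table means adding a program is a one-line change rather than another branch in the version check.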
Suggested-by: Lazar Lijo
Signed-off-by: Jesse Zhang
--
When a GPU job times out, the driver attempts to recover by restarting
the scheduler. Previously, the scheduler was restarted with an error
code of 0, which does not distinguish between a full GPU reset and a
queue reset. This patch changes the error code to -ENODATA for queue
resets, while -ECANCE
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
V3: exclude IP_VERSION(13, 0, 12), which is not supported.
Su
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
V3: exclude IP_VERSION(13, 0, 12), which is not supported.
Su
The scheduler should restart only if the reset operation
succeeds. This ensures that new tasks are submitted
to the queues only after a successful reset.
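The ordering described above can be sketched as follows. This is a simplified stand-alone model, not the driver code: `struct demo_ring`, its `sched_running` flag, and the `reset` callback are hypothetical stand-ins for the real ring and scheduler objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring with a software scheduler. */
struct demo_ring {
	bool sched_running;
	int (*reset)(struct demo_ring *ring);	/* returns 0 on success */
};

/* Example reset callbacks used to exercise both paths. */
static int demo_reset_ok(struct demo_ring *ring)
{
	(void)ring;
	return 0;
}

static int demo_reset_fail(struct demo_ring *ring)
{
	(void)ring;
	return -1;
}

static int demo_reset_and_restart(struct demo_ring *ring)
{
	int r;

	assert(ring != NULL && ring->reset != NULL);

	/* Stop the scheduler so no new jobs reach the queue. */
	ring->sched_running = false;

	r = ring->reset(ring);
	if (r)
		return r;	/* reset failed: leave the scheduler stopped */

	/* Restart only after the reset has succeeded. */
	ring->sched_running = true;
	return 0;
}
```

On failure the caller still sees the error code and can fall back to a heavier recovery path while the scheduler stays stopped.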
Fixes: 9b5d66721b66308a5 ("drm/amdgpu: Introduce conditional user queue
suspension for SDMA resets")
Suggested-by: Alex Deucher
Signed-off-by
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 133 to
149.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings
instead of
allocating a separate engine. This change ensures efficient resource
management and
From: "jesse.zh...@amd.com"
This commit updates the VM flush implementation for the SDMA engine.
- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the
VM_INVALIDATE_ENG0_REQ
register value for the specified VMID and flush type. This function ensures
that al
From: "jesse.zh...@amd.com"
This commit updates the VM flush implementation for the SDMA engine.
- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the
VM_INVALIDATE_ENG0_REQ
register value for the specified VMID and flush type. This function ensures
that al
Since KFD no longer registers its own callbacks for SDMA resets, and only KGD
uses the reset mechanism,
we can simplify the SDMA reset flow by directly calling the ring's `stop_queue`
and `start_queue` functions.
This patch removes the dynamic callback mechanism and prepares for its eventual
dep
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.0 reset logic by splitting the
`sdma_v5_0_reset_queue` function into two separate functions:
`sdma_v5_0_stop_queue` and `sdma_v5_0_restore_queue`.
This change aligns with the new SDMA reset mechanism, where the reset p
Add support for per-queue reset on SDMA v4.4.2 when running with:
1. MEC firmware version 0xb0 or later
2. DPM indicates SDMA reset is supported
Signed-off-by: Jesse.Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/
Add support for per-queue reset on SDMA v4.4.2 when running with:
1. MEC firmware version 17 or later
2. DPM indicates SDMA reset is supported
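The two conditions above combine as a simple conjunction. A minimal sketch, assuming the firmware version and the DPM capability are already available as plain values; the structure, field names, and function name are hypothetical, and only the threshold of 17 comes from the description:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum MEC firmware version quoted in the description above. */
#define DEMO_MIN_MEC_FW 17u

/* Hypothetical snapshot of the two inputs to the decision. */
struct demo_sdma_caps {
	uint32_t mec_fw_version;
	bool dpm_sdma_reset_supported;
};

/* Per-queue reset is advertised only when both conditions hold. */
static bool demo_per_queue_reset_supported(const struct demo_sdma_caps *caps)
{
	assert(caps != NULL);

	return caps->mec_fw_version >= DEMO_MIN_MEC_FW &&
	       caps->dpm_sdma_reset_supported;
}
```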
v2: Fixed supported firmware versions (Lijo)
Signed-off-by: Jesse.Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 4 +++-
1 file changed, 3 insert
Add GC11.0.0 to the list of GPU generations that support TMZ.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 464625282872..1eb92424
Since KFD no longer registers its own callbacks for SDMA resets, and only KGD
uses the reset mechanism,
we can simplify the SDMA reset flow by directly calling the ring's `stop_queue`
and `start_queue` functions.
This patch removes the dynamic callback mechanism and prepares for its eventual
dep
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.0 queue reset and stop logic to improve
code readability, maintainability, and performance. The key changes include:
1. **Generalized `sdma_v5_0_gfx_stop` Function**:
- Added an `inst_mask` parameter to allow stopping spe
From: "jesse.zh...@amd.com"
This patch introduces a new function `amdgpu_sdma_soft_reset` to handle SDMA
soft resets directly,
rather than relying on the DPM interface.
1. **New `amdgpu_sdma_soft_reset` Function**:
- Implements a soft reset for SDMA engines by directly writ
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.0 reset logic by splitting the
`sdma_v5_0_reset_queue` function into two separate functions:
`sdma_v5_0_stop_queue` and `sdma_v5_0_restore_queue`.
This change aligns with the new SDMA reset mechanism, where the reset p
From: "jesse.zh...@amd.com"
This patch removes the deprecated SDMA reset callback mechanism, which was
previously used to register pre-reset and post-reset callbacks for SDMA engine
resets.
The callback mechanism has been replaced with a more direct and efficient
approach using `
This patch refactors the SDMA v5.2 reset logic by splitting the
`sdma_v5_2_reset_queue` function into two separate functions:
`sdma_v5_2_stop_queue` and `sdma_v5_2_restore_queue`.
This change aligns with the new SDMA reset mechanism, where the reset process
is divided into stopping the queue, pe
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.2 queue reset and stop logic to improve
code readability, maintainability, and performance. The key changes include:
1. **Generalized `sdma_v5_2_gfx_stop` Function**:
- Added an `inst_mask` parameter to allow stoppin