Initialize the queue type before resetting the queue using mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index f7d5d4f08a53..10b61
Variable hub_inst is unused.
Related to commit "bde7ae79ca40":
"drm/amdkfd: Drop poison hanlding from gfx v10"
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 13 -
1 file changed, 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_pro
To avoid memory leaks, release q_extra_data when exiting the restore queue.
v2: Correct the proto (Alex)
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.
Not all ASICs support the queue reset feature.
Therefore, userspace can query this feature
via AMDGPU_INFO_QUEUE_RESET before validating a queue reset.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 27 +
include/uapi/drm/amdgpu_drm.h |
compute/gfx may have multiple rings on some hardware.
In some cases, userspace wants to run jobs on a specific ring for validation
purposes.
This debugfs entry helps to disable or enable submitting jobs to a specific
ring.
This entry is populated only if there are at least two or more cores in th
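A minimal sketch of how a per-ring enable mask written to such a debugfs entry might be validated. The snippet above does not show the entry's value format, so this assumes a bitmask with one bit per ring; `validate_sched_mask` and its parameters are hypothetical names, not the driver's code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate a requested ring-enable bitmask.
 * Assumption: bit i set means "allow job submission on ring i".
 * Rejects masks that would leave no valid ring enabled.
 */
static int validate_sched_mask(uint64_t requested, unsigned int num_rings,
                               uint64_t *applied)
{
    uint64_t valid;

    /* The entry only makes sense with at least two rings to choose from. */
    if (num_rings < 2 || num_rings > 64)
        return -EINVAL;

    valid = (num_rings == 64) ? UINT64_MAX
                              : ((UINT64_C(1) << num_rings) - 1);

    /* At least one existing ring must remain enabled. */
    if ((requested & valid) == 0)
        return -EINVAL;

    *applied = requested & valid;
    return 0;
}
```

Refusing a mask that disables every ring keeps the scheduler from being left with nowhere to submit jobs; whether the real entry enforces this is not visible in the snippet.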
Userspace wants to run jobs on a specific sdma ring for verification purposes.
This debugfs entry helps to disable or enable submitting jobs to a specific
ring.
This entry is populated only if there are at least two cores in the
sdma ip.
Signed-off-by: Jesse Zhang
Suggested-by: Alex Deuc
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
Sign
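Since V2 returns a text string rather than flag bits, a userspace tool has to parse the node's contents. The sketch below shows one way to do that. The token names ("soft", "queue", "pipe", "full") and the flag values are assumptions for illustration, since the snippet does not show the exact output format.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits for illustration; not the driver's definitions. */
enum reset_flag {
    RESET_SOFT  = 1u << 0,
    RESET_QUEUE = 1u << 1,
    RESET_PIPE  = 1u << 2,
    RESET_FULL  = 1u << 3,
};

/*
 * Convert the text read from a *_reset_mask node into flag bits.
 * Assumes whitespace-separated tokens; unknown tokens are ignored.
 */
static unsigned int parse_reset_mask(const char *text)
{
    static const struct {
        const char *name;
        unsigned int flag;
    } names[] = {
        { "soft", RESET_SOFT },
        { "queue", RESET_QUEUE },
        { "pipe", RESET_PIPE },
        { "full", RESET_FULL },
    };
    unsigned int mask = 0;

    assert(text != NULL);

    while (*text) {
        size_t len;

        text += strspn(text, " \t\n");   /* skip separators */
        len = strcspn(text, " \t\n");    /* length of the next token */

        for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
            if (len == strlen(names[i].name) &&
                strncmp(text, names[i].name, len) == 0)
                mask |= names[i].flag;
        }
        text += len;
    }
    return mask;
}
```

A caller would read the node (for example `vpe_reset_mask` under the device's sysfs directory) into a buffer and pass it to `parse_reset_mask`, then test the result against `RESET_QUEUE` before attempting a queue-reset test.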
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
Signed
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
Signed-o
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
Signed
From: "jesse.zh...@amd.com"
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggest
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggested-by: Alex Deucher
---
drivers/gpu/drm/amd/am
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggested-by: Alex De
From: "jesse.zh...@amd.com"
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggested-by: Al
From: "jesse.zh...@amd.com"
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
Signed-off-by: Jesse Zhang
Suggest
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead
From: "jesse.zh...@amd.com"
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2:
From: "jesse.zh...@amd.com"
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
V2: the
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
v3: add
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
v3: a
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead of some flags (Christian)
v3: add
Add two sysfs interfaces for gfx and compute:
gfx_reset_mask
compute_reset_mask
These interfaces are read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string instead
[ 90.441868] [ cut here ]
[ 90.441873] kernel BUG at mm/slub.c:553!
[ 90.441885] Oops: invalid opcode: [#1] PREEMPT SMP NOPTI
[ 90.441892] CPU: 0 PID: 1523 Comm: amd_pci_unplug Tainted: GE
6.10.0+ #47
[ 90.441900] Hardware name: AMD Splinter/
From: "jesse.zh...@amd.com"
Add the sysfs interface for sdma:
sdma_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
From: "jesse.zh...@amd.com"
Add the sysfs interface for vcn:
vcn_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
From: "jesse.zh...@amd.com"
Add the sysfs interface for vpe:
vpe_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text stri
sysfs: cannot create duplicate filename
'/devices/pci:00/:00:01.1/:01:00.0/:02:00.0/:03:00.0/vcn_reset_mask'
[ 562.443738] CPU: 13 PID: 4888 Comm: modprobe Tainted: GE
6.10.0+ #51
[ 562.443740] Hardware name: AMD Splinter/Splinter-RPL, BIOS VS2683299N.FD
05
From: "jesse.zh...@amd.com"
sysfs: cannot create duplicate filename
'/devices/pci:00/:00:01.1/:01:00.0/:02:00.0/:03:00.0/vcn_reset_mask'
[ 562.443738] CPU: 13 PID: 4888 Comm: modprobe Tainted: GE
6.10.0+ #51
[ 562.443740] Hardwar
[ 2875.870277] [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP block
failed -22
[ 2875.880494] amdgpu :01:00.0: amdgpu: amdgpu_device_ip_init failed
[ 2875.887689] amdgpu :01:00.0: amdgpu: Fatal error during GPU init
[ 2875.894791] amdgpu :01:00.0: amdgpu: amdgpu: finishing de
For multiple vcn instances, to avoid creating the reset sysfs node multiple times,
add the instance parameter to the reset mask init.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 8
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h | 4 ++--
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c
From: "jesse.zh...@amd.com"
Add the sysfs interface for jpeg:
jpeg_reset_mask
The interface is read-only and shows the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.
V2: the sysfs node returns a text string
Fix similar warning when running IGT:
[ 155.585721] kernfs: can not remove 'enforce_isolation', no directory
[ 155.592201] WARNING: CPU: 3 PID: 6960 at fs/kernfs/dir.c:1683
kernfs_remove_by_name_ns+0xb9/0xc0
[ 155.601145] Modules linked in: xt_MASQUERADE xt_comment nft_compat veth
bridge stp
Fix the similar warning:
[ 155.585721] kernfs: can not remove 'enforce_isolation', no directory
[ 155.592201] WARNING: CPU: 3 PID: 6960 at fs/kernfs/dir.c:1683
kernfs_remove_by_name_ns+0xb9/0xc0
[ 155.601145] Modules linked in: xt_MASQUERADE xt_comment nft_compat veth
bridge stp llc overlay n
From: "jesse.zh...@amd.com"
This reverts commit 10aec8943bcc5123288ded8c97e78312bcf17fb1.
The dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit without the device actually being unplugged,
which causes a new issue.
Signed-off-by: Jesse Zhang
---
dr
Fix the similar warning when hotplugging:
[ 155.585721] kernfs: can not remove 'enforce_isolation', no directory
[ 155.592201] WARNING: CPU: 3 PID: 6960 at fs/kernfs/dir.c:1683
kernfs_remove_by_name_ns+0xb9/0xc0
[ 155.601145] Modules linked in: xt_MASQUERADE xt_comment nft_compat veth
bridge
Replace the drm_dev_enter check with a check of the sysfs directory entry,
because the dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit without the device actually being unplugged.
Signed-off-by: Jesse Zhang
Reported-by: Andy Dong
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |
Replace the drm_dev_enter check with a check of the sysfs directory entry,
because the dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit without the device actually being unplugged.
Signed-off-by: Jesse Zhang
Reported-by: Andy Dong
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c |
From: "jesse.zh...@amd.com"
This reverts commit 330d97e9b14e0c85cc8b63e0092e4abcb9ce99c8.
The dev->unplugged flag is also set to true when the driver is only
uninstalled via amdgpu_exit without the device actually being unplugged,
which causes a new issue.
Signed-off-by: Jesse Zhang
---
driver
When using MES creating a pdd will require talking to the GPU to
setup the relevant context. The code here forgot to wake up the GPU
in case it was in suspend, this causes KVM to EFAULT for passthrough
GPU for example. This issue can be masked if the GPU was woken up by
other things (e.g. opening t
Replace MES kgq reset with MMIO.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
index 69941442f00b..ba2ab9296eb4 100644
Enable the kgq and kcq queue reset flag
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 10 +-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
index 3aa34c4d..6994144
From: "jesse.zh...@amd.com"
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:
1. Added `amdgpu_sdma_reset_queue`:
- Resets a specific SDMA queue by instance ID.
- Invokes registered pre-reset and
From: "jesse.zh...@amd.com"
This commit introduces several improvements to the SDMA reset logic:
1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset, ensuring proper state restoration after reset.
2. Introduced `gfx_guilty` and `page_gui
From: "jesse.zh...@amd.com"
This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
function to differentiate
between reset requests originating from the KGD and KFD.
This change ensures proper synchronization between KGD and KFD during SDMA
resets.
If the cal
From: "jesse.zh...@amd.com"
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:
1. **Refactored SDMA Reset Logic**:
- Split the `sdma_v4_4_2_reset_queue` functio
From: "jesse.zh...@amd.com"
This commit introduces several improvements to the SDMA reset logic:
1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset, ensuring proper state restoration after reset.
2. Introduced `gfx_guilty` and `page_gui
From: "jesse.zh...@amd.com"
This patch refactors the SDMA reset functionality in the `sdma_v4_4_2` driver
to improve modularity and support shared usage between AMDGPU and KFD. The
changes include:
1. **Refactored SDMA Reset Logic**:
- Split the `sdma_v4_4_2_reset_queue` functio
From: "jesse.zh...@amd.com"
This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
The implementation includes the following key changes:
1. Added `amdgpu_sdma_reset_queue`:
- Resets a specific SDMA queue by instance ID.
- Invokes registered pre-reset and
From: "jesse.zh...@amd.com"
This commit introduces several improvements to the SDMA reset logic:
1. Added `cached_rptr` to the `amdgpu_ring` structure to store the read pointer
before a reset, ensuring proper state restoration after reset.
2. Introduced `gfx_guilty` and `page_gui
From: "jesse.zh...@amd.com"
This commit introduces a caller parameter to the amdgpu_sdma_reset_instance
function to differentiate
between reset requests originating from the KGD and KFD.
This change ensures proper synchronization between KGD and KFD during SDMA
resets.
If the cal
This patch updates the sdma engine to support scheduling for
the page queue. The main changes include:
- Introduce a new variable `page` to handle the page queue if it exists.
- Update the scheduling logic to conditionally set the `sched.ready` flag for
both the sdma gfx queue and the page queue
Extracts the resume sequence for per sdma instance from sdma_v7_0_gfx_resume.
This function can be used in start or restart scenarios of specific instances.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 259 ++---
1 file changed, 141 insertions(+), 1
Implement sdma queue reset callback by mes_reset_queue_mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 26 ++
1 file changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
inde
Reset gfx/compute queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h | 2 +
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 88 +-
2 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/a
Reset sdma queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 46 ++
1 file changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
index
SDMA v7 queue reset is already supported via mmio, so add its sysfs file.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
index 62
Replace kcq queue reset with existing function amdgpu_mes_reset_legacy_queue.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 18 +-
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c
b/drivers/gpu/drm/am
Extracts the resume sequence for per sdma instance from sdma_v7_0_gfx_resume.
This function can be used in start or restart scenarios of specific instances.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 259 ++---
1 file changed, 141 insertions(+), 1
Reset gfx/compute queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.h | 2 +
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 88 +-
2 files changed, 89 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/a
Reset sdma queue through mmio based on me_id and queue_id.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 46 ++
1 file changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
index
Replace kcq queue reset with existing function amdgpu_mes_reset_legacy_queue.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 22 +++---
1 file changed, 3 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
b/drivers/gpu/dr
Implement sdma queue reset callback by mes_reset_queue_mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 26 ++
1 file changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
inde
SDMA v7 queue reset is already supported via mmio, so add its sysfs file.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c
index 62
add the PPSMC_MSG_ResetSDMA2 definition for smu 13.0.6
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 1 +
drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 ++-
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 1 +
3 fi
From: "jesse.zh...@amd.com"
Remove apu check in sdma queue reset.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
b/drivers/gpu/drm/amd/amdgpu/sdm
Implement sdma queue reset by SMU_MSG_ResetSDMA2
Suggested-by: Tim Huang
Signed-off-by: Jesse Zhang
---
.../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 30 ++-
1 file changed, 22 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
b/
Initialize the process context address before setting the shader debugger.
[ 260.781212] amdgpu :03:00.0: amdgpu: [gfxhub] page fault (src_id:0
ring:32 vmid:0 pasid:0)
[ 260.781236] amdgpu :03:00.0: amdgpu: in page starting at address
0x from client 10
[ 260.781255]
From: "jesse.zh...@amd.com"
Remove apu check in sdma queue reset.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c
b/drivers/gpu/drm/amd/amdgpu/sdm
From: "jesse.zh...@amd.com"
add the definition PPSMC_MSG_ResetSDMA2.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 1 +
drivers/gpu/drm/amd/pm/swsmu/inc/smu_types.h | 3 ++-
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13
From: "jesse.zh...@amd.com"
Implement sdma queue reset by SMU_MSG_ResetSDMA2.
Signed-off-by: Jesse Zhang
---
.../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 28 ++-
1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/s
implement gfx10 kgq reset via mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 98 ++
1 file changed, 70 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 89409c
implement gfx10 kcq reset via mmio.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 121 ++---
1 file changed, 88 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 88393c
From: "jesse.zh...@amd.com"
Using mmio to do queue reset.
v2: Align this function with gfx9/gfx9.4.3.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 34 ++
1 file changed, 34 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
From: "jesse.zh...@amd.com"
Using mmio to do queue reset
v2: Align the function with gfx9/gfx9.4.3.
Signed-off-by: Jesse Zhang adev;
unsigned i;
+ uint32_t tmp;
/* enter save mode */
amdgpu_gfx_rlc_enter_safe_mode(adev, xcc_id);
@@ -3813,7 +3814,25
From: "jesse.zh...@amd.com"
pmfw now unifies PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu_v13_0_6_ppsmc.h | 1 -
drivers/gpu/drm/amd/pm/swsmu/inc/s
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
Signed-off-by: Jesse Zhang
---
.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 14 +---
From: "jesse.zh...@amd.com"
pmfw unified PPSMC_MSG_ResetSDMA definitions for different devices.
PPSMC_MSG_ResetSDMA2 is not needed.
Signed-off-by: Jesse Zhang
---
.../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 30 +--
1 file changed, 8 insertions(+), 22 deletion
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
Suggested-by: Lazar Lijo
Signed-off-by: Jesse Zhang
--
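The per-program threshold check described above can be sketched as a table lookup. This is an illustration under stated assumptions, not the code in `smu_v13_0_6_reset_sdma`: the program IDs, version numbers, and the names `fw_threshold` and `check_reset_sdma_support` are all hypothetical placeholders.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: minimum firmware version per SMU program. */
struct fw_threshold {
    uint32_t program;
    uint32_t min_version;
};

/* Placeholder values only; the real thresholds are not shown in the snippet. */
static const struct fw_threshold reset_sdma_thresholds[] = {
    { 0, 100 },
    { 4, 200 },
    { 5, 300 },
};

/*
 * Return 0 if the firmware is new enough for the given program,
 * or -EOPNOTSUPP for old firmware and for programs not in the table.
 */
static int check_reset_sdma_support(uint32_t program, uint32_t fw_version)
{
    size_t n = sizeof(reset_sdma_thresholds) / sizeof(reset_sdma_thresholds[0]);

    for (size_t i = 0; i < n; i++) {
        if (reset_sdma_thresholds[i].program != program)
            continue;
        return fw_version >= reset_sdma_thresholds[i].min_version
                   ? 0 : -EOPNOTSUPP;
    }
    return -EOPNOTSUPP;
}
```

Keeping the thresholds in one table means supporting another program is a one-line change, and callers can tell "unsupported firmware" from other failures by the -EOPNOTSUPP return.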
When a GPU job times out, the driver attempts to recover by restarting
the scheduler. Previously, the scheduler was restarted with an error
code of 0, which does not distinguish between a full GPU reset and a
queue reset. This patch changes the error code to -ENODATA for queue
resets, while -ECANCE
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
V3: exclude IP_VERSION(13, 0, 12), which is not supported.
Su
From: "jesse.zh...@amd.com"
This patch refactors the firmware version checks in `smu_v13_0_6_reset_sdma`
to support multiple SMU programs with different firmware version thresholds.
V2: return -EOPNOTSUPP for unsupported pmfw
V3: exclude IP_VERSION(13, 0, 12), which is not supported.
Su
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings
instead of
allocating a separate engine. This change ensures efficient resource
management and
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 132 to
148.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1
From: "jesse.zh...@amd.com"
This commit updates the VM flush implementation for the SDMA engine.
- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the
VM_INVALIDATE_ENG0_REQ
register value for the specified VMID and flush type. This function ensures
that al
From: "jesse.zh...@amd.com"
The is
Reviewed-by: Jesse Zhang
Incrementing the gpu_reset counter needs to be in the is_guilty block. Also
move the fence error before the reset to keep the original ordering.
Fixes: f447ba2bbd48 ("drm/amdgpu: Update amdgpu_job_timedout to check
This commit updates the VM flush implementation for the SDMA engine.
- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the
VM_INVALIDATE_ENG0_REQ
register value for the specified VMID and flush type. This function ensures
that all relevant
page table cache levels (L1 PTEs,
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings
instead of
allocating a separate engine. This change ensures efficient resource
management and
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 132 to
148.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings
instead of
allocating a separate engine. This change ensures efficient resource
management and
From: "jesse.zh...@amd.com"
This patch introduces two new functions, `amdgpu_sdma_stop_queue` and
`amdgpu_sdma_start_queue`, to handle the stopping and starting of SDMA queues
during engine reset operations. The changes include:
1. **New Functions**:
- `amdgpu_sdma_stop_queue`:
From: "jesse.zh...@amd.com"
This patch introduces two new callbacks, `stop_queue` and `start_queue`, to the
`amdgpu_ring_funcs` structure. These callbacks are designed to handle the
stopping
and starting of SDMA queues during engine reset operations. The changes include:
1. **A
From: "jesse.zh...@amd.com"
This commit updates the VM flush implementation for the SDMA engine.
- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the
VM_INVALIDATE_ENG0_REQ
register value for the specified VMID and flush type. This function ensures
that al
From: "jesse.zh...@amd.com"
- Modify the VM invalidation engine allocation logic to handle SDMA page rings.
SDMA page rings now share the VM invalidation engine with SDMA gfx rings
instead of
allocating a separate engine. This change ensures efficient resource
management and
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 133 to
149.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.2 reset logic by splitting the
`sdma_v5_2_reset_queue` function into two separate functions:
`sdma_v5_2_stop_queue` and `sdma_v5_2_restore_queue`.
This change aligns with the new SDMA reset mechanism, where the reset p
From: "jesse.zh...@amd.com"
This patch refactors the SDMA v5.0 queue reset and stop logic to improve
code readability, maintainability, and performance. The key changes include:
1. **Generalized `sdma_v5_0_gfx_stop` Function**:
- Added an `inst_mask` parameter to allow stopping spe
From: "jesse.zh...@amd.com"
This commit updates the VM flush implementation for the SDMA engine.
- Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the
VM_INVALIDATE_ENG0_REQ
register value for the specified VMID and flush type. This function ensures
that al
From: "jesse.zh...@amd.com"
Increase the maximum number of rings supported by the AMDGPU driver from 133 to
149.
This change is necessary to enable support for the SDMA page ring.
Signed-off-by: Jesse Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 +-
1 file changed, 1