Driver mode-2 is only supported by relatively new
smc firmware.
Signed-off-by: Hawking Zhang
---
.../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 40 +++
1 file changed, 32 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
b/drivers/gpu
When SMU IP is disabled by ip_block_mask, driver
should not refer to any dpm/swSMU callback. Instead,
any driver call into swSMU/dpm callback needs to
return error code EOPNOTSUPP.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 3 ++-
1 file changed, 2 insertions
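The rule described above (return EOPNOTSUPP from any swSMU/dpm entry point when the SMU IP is masked out) can be sketched roughly as follows. This is only an illustration: `struct demo_device`, `smu_enabled` and `demo_get_power_limit` are hypothetical names, not the real amdgpu structures or callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not the real amdgpu struct. */
struct demo_device {
	bool smu_enabled;	/* false when the SMU IP is disabled via ip_block_mask */
	int power_limit;	/* value a dpm-style query would report */
};

/* Hypothetical dpm-style entry point guarded by the SMU-enabled check. */
static int demo_get_power_limit(const struct demo_device *dev, int *limit)
{
	assert(dev != NULL && limit != NULL);

	/* Never touch SMU state when the IP block is disabled. */
	if (!dev->smu_enabled)
		return -EOPNOTSUPP;

	*limit = dev->power_limit;
	return 0;
}
```

Callers can then treat -EOPNOTSUPP as "feature not present" instead of as a hardware failure.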
signaled.
v2: drop the unused local variable (Tao)
Signed-off-by: Hawking Zhang
---
.../gpu/drm/amd/amdkfd/kfd_int_process_v9.c| 18 +-
drivers/gpu/drm/amd/amdkfd/soc15_int.h | 1 +
2 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd
Driver switches to the interrupt source id to identify
the utcl2 poison event. The polling interface is not needed.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 16
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 2 --
drivers/gpu/drm/amd/amdgpu
Not supported.
Signed-off-by: Hawking Zhang
---
.../gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 71 ---
1 file changed, 71 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c
b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v10.c
index 8e0d0356e810
signaled.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 3 +--
drivers/gpu/drm/amd/amdkfd/soc15_int.h | 1 +
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
b/drivers/gpu/drm/amd/amdkfd
Add debug option to enable mode2 for poison recovery
for testing purpose only.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 16
TA should not be loaded from guest side.
Signed-off-by: Hawking Zhang
Reviewed-by: Shiwu Zhang
---
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
Data abort exception and unknown errors are supported.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 10 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 ++
2 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers
To align with firmware, hbm id field 0x1 refers to
hbm stack 0, 0x2 refers to hbm stack 1.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd
Driver should write to fault_cntl registers to do
one-shot address/status clear.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c
b/drivers/gpu/drm/amd/amdgpu
adev->gfx.imu.funcs could be NULL
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index b4575765d7a8..5c1740943
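A minimal sketch of the kind of guard this implies: check the funcs pointer (and the individual callback) before calling through it. The types and names below (`demo_gfx`, `demo_imu_funcs`, `demo_imu_start`) are hypothetical and only mirror the shape of `adev->gfx.imu.funcs`; returning 0 when no callback exists is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table, loosely modelled on an IMU funcs struct. */
struct demo_imu_funcs {
	int (*start)(void *ctx);
};

/* Hypothetical owner of the optional callback table. */
struct demo_gfx {
	const struct demo_imu_funcs *imu_funcs;	/* may be NULL */
};

/* Example callback: counts how often it was invoked. */
static int demo_count_call(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return 0;
}

/* Call the optional callback only if both the table and the hook exist. */
static int demo_imu_start(struct demo_gfx *gfx, void *ctx)
{
	assert(gfx != NULL);

	if (gfx->imu_funcs && gfx->imu_funcs->start)
		return gfx->imu_funcs->start(ctx);

	return 0;	/* assumption: a missing hook means nothing to do */
}
```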
adev->gfx.imu.funcs could be NULL.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 8
drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 8
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
b/drivers/gpu/
fault_status is read only register. fault_cntl
is not accessible from guest environment.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 8 +---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 3 ++-
drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 8 +---
3 files
AMDGPU_RAS_GPU_ERR_BOOT_STATUS field is no longer valid.
The polling sequence is also simplified according to
the latest firmware change.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 99 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 +-
2
Add estimate of how much vram we need to reserve for RAS
when calculating the total available vram.
v2: apply the change to MP0 v13_0_2 and v13_0_14
Signed-off-by: Hawking Zhang
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 9 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 20
Add estimate of how much vram we need to reserve for RAS
when calculating the total available vram.
Signed-off-by: Hawking Zhang
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 9 +++--
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c| 18 ++
drivers/gpu/drm/amd/amdgpu
hbm field takes bit 13 and bit 14 in boot status.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index c8980d5f6540
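To make the bit layout concrete, here is a small sketch that extracts a two-bit field at bits 13 and 14 from a boot status word. It also maps the field value to a stack index using the convention stated in another patch in this listing (0x1 is stack 0, 0x2 is stack 1). The macro and function names are hypothetical, and treating every other field value as "unknown" is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the field occupies bits 13 and 14 of boot status. */
#define DEMO_BOOT_STATUS_HBM_SHIFT	13
#define DEMO_BOOT_STATUS_HBM_MASK	(0x3u << DEMO_BOOT_STATUS_HBM_SHIFT)

static_assert(DEMO_BOOT_STATUS_HBM_MASK == 0x6000u,
	      "hbm field must cover exactly bits 13 and 14");

/* Extract the raw hbm id field from a boot status value. */
static uint32_t demo_boot_status_hbm_id(uint32_t boot_status)
{
	return (boot_status & DEMO_BOOT_STATUS_HBM_MASK) >>
	       DEMO_BOOT_STATUS_HBM_SHIFT;
}

/* Map the field to a stack index: 0x1 -> 0, 0x2 -> 1, else -1 (assumed). */
static int demo_hbm_id_to_stack(uint32_t hbm_id)
{
	switch (hbm_id) {
	case 0x1:
		return 0;
	case 0x2:
		return 1;
	default:
		return -1;
	}
}
```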
mode-2 reset is the only reliable method that can get
GC/SDMA back when poison is consumed. mmhub requires
mode-1 reset.
Signed-off-by: Hawking Zhang
---
.../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 27 ++-
1 file changed, 8 insertions(+), 19 deletions(-)
diff --git a
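The policy in the commit message can be sketched as a small lookup from the block that consumed poison to the reset mode to request. The enums and the function are hypothetical, and falling back to mode-1 for blocks the message does not mention is an assumption of this sketch, not something the patch states.

```c
#include <assert.h>

/* Hypothetical block identifiers for the poison consumer. */
enum demo_block {
	DEMO_BLOCK_GC,
	DEMO_BLOCK_SDMA,
	DEMO_BLOCK_MMHUB,
};

/* Hypothetical reset modes; values match the mode numbers for readability. */
enum demo_reset {
	DEMO_RESET_MODE1 = 1,
	DEMO_RESET_MODE2 = 2,
};

static_assert(DEMO_RESET_MODE1 == 1 && DEMO_RESET_MODE2 == 2,
	      "enum values are expected to equal the mode numbers");

/* GC/SDMA recover with mode-2; mmhub (and, by assumption, the rest) needs mode-1. */
static enum demo_reset demo_poison_reset_mode(enum demo_block block)
{
	switch (block) {
	case DEMO_BLOCK_GC:
	case DEMO_BLOCK_SDMA:
		return DEMO_RESET_MODE2;
	case DEMO_BLOCK_MMHUB:
	default:
		return DEMO_RESET_MODE1;
	}
}
```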
mode-2 reset is the only reliable method that can get
GC/SDMA back when poison is consumed. mmhub requires
mode-1 reset.
Signed-off-by: Hawking Zhang
---
.../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 22 +++
1 file changed, 3 insertions(+), 19 deletions(-)
diff --git a
mode-2 reset is the only reliable method that can get
GC/SDMA back when poison is consumed. mmhub requires
mode-1 reset.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 8 ++--
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm
When BACO exit is triggered by doorbell transaction,
firmware will config bif to issue msi interrupt to
indicate doorbell transaction. If bif ring is not
enabled in such case, driver needs to ack the interrupt
by clearing the interrupt status.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm
ASD is not needed by headless GPU.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 94b310fdb719d..83bf86352267d 100644
--- a
ASD is not needed by headless GPU.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 94b310fdb719d..063203865bbe2 100644
--- a
Do not load/invoke display TA if display hardware is not
available
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 18 ++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
Display TA doesn't need to be loaded/invoked if it
is harvested.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 18 ++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_
Only do this from host side.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/soc15.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 15033efec2ba..2c8702560090 100644
--- a/drivers
Update boot time errors polling sequence to align with
the latest firmware change.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 5 +
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a
amdgpu_reg_state_sysfs_fini could be invoked at a
time when asic_func is not even initialized, e.g., when
amdgpu_discovery_init fails for some reason.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/include/amdgpu_reg_state.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a
Check and report boot status if discovery failed.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
b/drivers/gpu/drm/amd/amdgpu
Add ras helper function to query boot time gpu
errors.
v2: use aqua_vanjaram smn addressing pattern
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 95 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
To allow using this helper for indirect access when
nbio funcs is not available. For instance, in ip
discovery phase.
v2: define macro for pcie_index/data/index_hi fallback.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 23 +-
1 file changed
Check and report firmware boot status if it doesn't
reach steady status.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
b/drivers/gpu/drm/amd/a
Check and report boot status if discovery failed.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
b/drivers/gpu/drm/amd/amdgpu
Will replace it with new implementation to cover
boot fails in ip discovery phase.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 -
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 15 -
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h| 4 --
drivers/gpu/drm/amd
Instead of traditional atomfirmware interfaces for RAS
capability, host driver can query ras capability from
psp starting from psp v13_0_6.
v2: drop redundant local variable from get_ras_capability.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 10
Move ras capability check to amdgpu_ras_check_supported.
Driver will query ras capability through psp interface, or
vbios interface, or specific ip callbacks.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 170 +---
1 file changed, 93 insertions
Driver and firmware share the same ras block enum.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 5785b705c692..8b053602c5ca
Add ras helper function to query boot time gpu
errors.
v2: use aqua_vanjaram smn addressing pattern
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 95 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
Check and report firmware boot status if it doesn't
reach steady status.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
b/drivers/gpu/drm/amd/a
Check and report boot status if discovery failed.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
b/drivers/gpu/drm/amd/amdgpu
To allow using this helper for indirect access when
nbio funcs is not available. For instance, in ip
discovery phase.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +++-
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu
Will replace it with new implementation to cover
boot fails in ip discovery phase.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 -
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 15 -
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h| 4 --
drivers/gpu/drm/amd
So kernel messages carry the device pcie bdf information,
which helps debugging, especially in multi-GPU
systems.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 144
1 file changed, 75 insertions(+), 69 deletions(-)
diff --git a/drivers
Not needed any more with firmware fixes
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 842405bb8995
Initialize RAS feature mask bit[31:29] with socket_id.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 72b6e41329b0
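As an illustration of placing socket_id in bit[31:29] of a 32-bit feature mask, consider the sketch below. The macro and function names are hypothetical, and clearing the field before setting it (as well as truncating socket ids wider than three bits) is an assumption rather than something the patch description states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the socket id occupies bits 31:29 of the mask. */
#define DEMO_RAS_SOCKET_ID_SHIFT	29
#define DEMO_RAS_SOCKET_ID_MASK		(0x7u << DEMO_RAS_SOCKET_ID_SHIFT)

static_assert(DEMO_RAS_SOCKET_ID_MASK == 0xE0000000u,
	      "socket id field must cover exactly bits 31:29");

/* Return the feature mask with bits 31:29 replaced by socket_id. */
static uint32_t demo_ras_mask_set_socket_id(uint32_t features,
					    uint32_t socket_id)
{
	features &= ~DEMO_RAS_SOCKET_ID_MASK;
	features |= (socket_id << DEMO_RAS_SOCKET_ID_SHIFT) &
		    DEMO_RAS_SOCKET_ID_MASK;
	return features;
}
```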
Move ras capability check to amdgpu_ras_check_supported.
Driver will query ras capability through psp interface, or
vbios interface, or specific ip callbacks.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 170 +---
1 file changed, 93 insertions
Instead of traditional atomfirmware interfaces for RAS
capability, host driver can query ras capability from
psp starting from psp v13_0_6.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 13 +
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 2 ++
drivers/gpu
Driver and firmware share the same ras block enum.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index 5785b705c692..8b053602c5ca
Driver can query RAS capability through psp or bios.
Hawking Zhang (3):
drm/amdgpu: Align ras block enum with firmware
drm/amdgpu: Query ras capablity from psp
drm/amdgpu: Centralize ras cap query to amdgpu_ras_check_supported
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 13 ++
drivers/gpu
Check and report firmware boot status if it doesn't
reach steady status.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
b/drivers/gpu/drm/amd/a
Check and report boot status if discovery failed.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
b/drivers/gpu/drm/amd/amdgpu
Add ras helper function to query boot time gpu
errors.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 3 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 95 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 15 +++-
3 files changed, 112 insertions
To allow using this helper for indirect access when
nbio funcs is not available. For instance, in ip
discovery phase.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +++-
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu
Will replace it with new implementation to cover
boot fails in ip discovery phase.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 -
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 15 -
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h| 4 --
drivers/gpu/drm/amd
For ASICs that support boot time error reporting, poll all
the boot time errors cached in registers and make it available
in kernel log.
Hawking Zhang (5):
drm/amdgpu: drop psp v13 query_boot_status implementation
drm/amdgpu: Init pcie_index/data address as fallback
drm/amdgpu: Add ras
Instead of software managed counters.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h | 2 ++
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 6 --
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
Boot time error query is not available till a10109
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/psp_v13_0.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
b/drivers/gpu/drm/amd/amdgpu/psp_v13_0.c
index 3cf4684d0d3f
In nbio v7_9, host driver should not issue gpu reset
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 5 -
1 file changed, 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
index 23f26f8caad4..25a3da83e0fb
Not needed anymore.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 114
1 file changed, 114 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c
index 40d06d32bb74..5df727be88c4 100644
Query boot status and report boot errors. A follow
up change is needed to stop GPU initialization if boot
fails.
v2: only invoke the call for dGPU (Le/Lijo)
Signed-off-by: Hawking Zhang
Reviewed-by: Tao Zhou
Reviewed-by: Yang Wang
Reviewed-by: Le Ma
---
drivers/gpu/drm/amd/amdgpu
Add psp v13 function to query boot status.
v2: limit the use case to dGPU only (Lijo)
Signed-off-by: Hawking Zhang
Reviewed-by: Tao Zhou
Reviewed-by: Yang Wang
Reviewed-by: Le Ma
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 15 +
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 3 +
drivers
Add MP0_C2PMSG_109/126 register field shift/masks
that are used to identify boot status by driver.
Signed-off-by: Hawking Zhang
Reviewed-by: Tao Zhou
Reviewed-by: Yang Wang
Reviewed-by: Le Ma
---
.../include/asic_reg/mp/mp_13_0_2_sh_mask.h | 28 +++
1 file changed, 28
Add RAS specific programming to dpg sram.
Signed-off-by: Hawking Zhang
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index f85d18cd74ec
Set VCN/JPEG RAS masks to enable software RAS for
VCN and JPEG.
Signed-off-by: Hawking Zhang
Reviewed-by: Tao Zhou
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd
So driver doesn't generate incorrect message until
the new format is settled down for aqua_vanjaram
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras
So driver doesn't generate incorrect message until
the new format is settled down for aqua_vanjaram
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/dr
gfx_v9_4_3_ue|ce_reg_list is an array per gfx core instance
correct the settings of se_num and reg_inst for some of
gfx ras counters so all the available register instances
can be polled for ras status.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 40
Do not access the pointer for ras input cmd buffer
if it is not even allocated.
Signed-off-by: Hawking Zhang
Reviewed-by: Stanley Yang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
Do not access the pointer for ras input cmd buffer
if it is not even allocated.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd
Driver queries umc_info v4_0 to identify ecc cap
for aqua_vanjaram
Signed-off-by: Hawking Zhang
Reviewed-by: Candice Li
---
.../gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 18 --
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
To be used by aqua_vanjaram
Signed-off-by: Hawking Zhang
Reviewed-by: Candice Li
---
drivers/gpu/drm/amd/include/atomfirmware.h | 18 ++
1 file changed, 18 insertions(+)
diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h
b/drivers/gpu/drm/amd/include/atomfirmware.h
index
Disable gfx ras command is needed in some use cases
like live migration.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu
amdgpu_device_mode1_reset returns success (ret = 0) as long as
the wait_for_bootloader call succeeds, regardless of the status
reported by smu or psp firmware. As a result, the driver continues
recovery even when smu or psp fails to perform mode1 reset.
Signed-off-by: Hawking
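The problem described above is a later success overwriting an earlier failure. A minimal sketch of the intended control flow, with hypothetical function names and plain function pointers standing in for the smu/psp reset call and the bootloader wait:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical steps used for illustration. */
static int demo_step_ok(void)
{
	return 0;
}

static int demo_step_fail(void)
{
	return -EIO;
}

/*
 * Run the firmware reset step, then the bootloader wait. An error from
 * the reset step is returned immediately so that the result of the wait
 * cannot mask it.
 */
static int demo_mode1_reset(int (*do_reset)(void),
			    int (*wait_for_bootloader)(void))
{
	int ret;

	assert(do_reset != NULL && wait_for_bootloader != NULL);

	ret = do_reset();
	if (ret)
		return ret;

	return wait_for_bootloader();
}
```

With this shape, recovery code that checks the return value stops as soon as the reset request itself fails.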
not needed any more
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4
1 file changed, 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 00658c2816dc..c58b31121fd7 100644
--- a/drivers/gpu/drm/
For non-GFX IP blocks, set up ras obj if ras feature
is allowed. For GFX IP blocks, force issue ras
enable_feature command to firmware and only set up ras
obj if ras feature is allowed
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 30 +
1
amdgpu_ras_late_init will invoke ras_late_init call
per IP block
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 26 --
1 file changed, 26 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0
e IP block.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 041112c7fbbd..8524365761b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_r
like amdgpu_ras_is_supported and
amdgpu_ras_is_feature_allowed ensure only GFX RAS
is enabled when poison mode is supported.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 49 -
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 26 -
2 files changed, 16 inser
Some IP blocks only support partial ras feature and don't
have ras counter and/or ras error status register at all.
Driver should not create err_count sysfs node for those
IP blocks.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 31 ++---
1
From: Lijo Lazar
Use the right data structure for allocation.
Signed-off-by: Lijo Lazar
Reviewed-by: Hawking Zhang
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c
b
Was introduced as a workaround. Not needed anymore.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c | 22 --
1 file changed, 22 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v3_0.c
index
fix backward compatibility issue to stay with
the old name of xgmi_wafl node.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
b/drivers/gpu/drm/amd/amdgpu
GPU will stop working once fatal error is detected.
It will inform driver to do reset to recover from
the fatal error.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 11
drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c | 79 +
drivers/gpu/drm/amd
Fix a coding error which results in a null interrupt
handler for umc ras.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
amdgpu_ras_register_ras_block should always be invoked
by ras_sw_init, where driver needs to check ras caps
at ip level, instead of asic level.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/gpu/drm/amd
To align with other IP blocks.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 9
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 28 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 1 +
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 7
To align with other IP blocks
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 13 +
drivers/gpu/drm/amd/amdgpu/amdgpu_mca.c | 72 +
drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h | 9 ++--
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 15
The pcie_bif ras block needs to be initialized as early
as possible to handle fatal errors detected in the hw_init
phase. Also align the pcie_bif ras sw_init with other
ras blocks
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c | 23 +++
drivers/gpu/drm/amd
Initialize hdp ras block only when mmhub ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/Makefile | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 5
Initialize mmhub ras block only when mmhub ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/Makefile | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
Initialize umc ras block only when umc ip block
supports ras. Driver queries ras capabilities after
early_init, ras block init needs to be moved to
sw_init.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 9 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2
Initialize vcn ras block only when vcn ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 29 -
drivers/gpu/drm/amd
Use default gfx ras_late_init callback for gfx
ras block.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index
Initialize jpeg ras block only when jpeg ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 29
drivers/gpu/drm/amd
moved to ras sw_init and follows ip based ras cap check
from amdgpu_ras_init, instead of the check in soc level.
v2: simplify the ras check (Stanley/Tao)
Hawking Zhang (10):
drm/amdgpu: Move jpeg ras block init to ras sw_init
drm/amdgpu: Move vcn ras block init to ras sw_init
drm/amdgpu: Move
Not needed since vi. Drop the function so
we don't duplicate code when introducing new asics.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/nv.c| 17 -
drivers/gpu/drm/amd/amdgpu/soc15.c | 20
drivers/gpu/drm/amd/amdgpu/soc21.c
Replace soc15, nv, soc21 get_rev_id callback with common
helper so we don't need to duplicate code when introducing
new asics.
Signed-off-by: Hawking Zhang
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12
drivers/gpu/dr