To avoid reading a wrong WPTR from the doorbell in an SRIOV VF, set
CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to read the WPTR from the MQD.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 3 +++
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 3 +++
2 files changed, 6
The VF can't access the FB while the host is doing a mode1 reset. Use sizeof to get
the vf2pf info size instead of reading it from the vf2pf header stored in the FB.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/driver
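The idea in the commit message above (take a size from the type instead of from a header that lives in memory the reader may not be able to access) can be sketched in plain C. This is only an illustration under stated assumptions: `struct msg_header`, `struct vf2pf_msg`, and both helper functions are hypothetical names, not the real amdgpu structures or code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, for illustration only; not the real amdgpu structures. */
struct msg_header {
    uint32_t size;      /* size recorded by the writer, stored in shared memory */
    uint32_t version;
};

struct vf2pf_msg {
    struct msg_header header;
    uint32_t payload[6];
};

/* Fragile: trusts a size field read back from shared memory, which may be
 * unreadable or stale while the other side is resetting. */
size_t msg_size_from_header(const struct vf2pf_msg *shared)
{
    assert(shared != NULL);
    return shared->header.size;
}

/* Robust: the size is a compile-time constant and needs no memory access. */
size_t msg_size_from_type(void)
{
    return sizeof(struct vf2pf_msg);
}
```

If the shared copy is filled with garbage, `msg_size_from_header()` returns whatever happens to be there, while `msg_size_from_type()` is unaffected.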
Increase the retry count so the host has enough time to complete the reset.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index
Host will initiate an FLR for all poison consumption.
Guest should wait for FLR message to re-init data exchange.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
b/drivers/gpu/drm
1. change AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT from 30 to 5.
2. set fatal error detected flag.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 2 +-
3 files changed, 3
recover, it will be restored, then caused page
fault.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 17 ++---
1 file changed, 6 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index
1. change AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT from 30 to 5.
2. set fatal error detected flag.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 2 +-
3 files changed, 3
recover, it will be restored, then caused page
fault.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 15 +--
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index
1. change AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT from 30 to 5.
2. set fatal error detected flag.
Change-Id: If1e0357deffa4549d4e83e925c8d764f7f8c9f42
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 +
drivers/gpu/drm
recover, it will be restored, then caused page
fault.
Signed-off-by: Zhigang Luo
Change-Id: Ib1eddb56b69ecd41fe703abd169944154f48b0cd
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
b/drivers/gpu
1. change AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT from 30 to 5.
2. set fatal error detected flag.
Change-Id: If1e0357deffa4549d4e83e925c8d764f7f8c9f42
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 +
drivers/gpu/drm
It will cause a page fault after the device is recovered if there is a process running.
Signed-off-by: Zhigang Luo
Change-Id: Ib1eddb56b69ecd41fe703abd169944154f48b0cd
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
Signed-off-by: Zhigang Luo
Change-Id: I2a98d513c26107ac76ecf20e951c188afbc7ede6
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 20
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 10 +-
drivers/gpu/drm/amd/amdkfd/kfd_device.c| 11 +++
3 files changed, 40
1. change AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT from 30 to 5.
2. set fatal error detected flag.
Change-Id: If1e0357deffa4549d4e83e925c8d764f7f8c9f42
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 1 +
drivers/gpu/drm
It will cause a page fault after the device is recovered if there is a process running.
Signed-off-by: Zhigang Luo
Change-Id: Ib1eddb56b69ecd41fe703abd169944154f48b0cd
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
Signed-off-by: Zhigang Luo
Change-Id: I2a98d513c26107ac76ecf20e951c188afbc7ede6
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 20
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 5 -
drivers/gpu/drm/amd/amdkfd/kfd_device.c| 11 +++
3 files changed, 35
If reading pf2vf data fails 30 times in a row, something is
wrong, so trigger flr_work to recover.
Also use dev_err to print the error message so the log shows which device has the
issue, and add a warning message if waiting for IDH_FLR_NOTIFICATION_CMPL
times out.
Signed-off-by: Zhigang
If reading pf2vf data fails 5 times in a row, something is
wrong, so trigger flr_work to recover.
Also use dev_err to print the error message so the log shows which device has the
issue, and add a warning message if waiting for IDH_FLR_NOTIFICATION_CMPL
times out.
Signed-off-by: Zhigang
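The consecutive-failure rule described in these commit messages can be sketched as a small counter. This is a minimal illustration, not the driver's implementation: `struct exchange_state`, `exchange_note_read()`, and `MAX_RETRY_LIMIT` are hypothetical names, and the `recovery_queued` flag merely stands in for scheduling `flr_work`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit; mirrors the "5 times" wording of the later revision. */
#define MAX_RETRY_LIMIT 5

struct exchange_state {
    int  fail_count;       /* consecutive read failures seen so far */
    bool recovery_queued;  /* stand-in for "recovery work was scheduled" */
};

/* Record the outcome of one read attempt.
 * Returns true only on the call that reaches the limit and requests recovery. */
bool exchange_note_read(struct exchange_state *st, bool read_ok)
{
    assert(st != NULL);

    if (read_ok) {
        st->fail_count = 0;        /* any success breaks the streak */
        return false;
    }

    if (++st->fail_count < MAX_RETRY_LIMIT)
        return false;              /* still under the limit: keep retrying */

    st->fail_count = 0;            /* start a fresh streak after requesting recovery */
    st->recovery_queued = true;
    return true;
}
```

Resetting the counter on every successful read is what makes the limit apply to failures in a row rather than to the total number of failures.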
From: Victor Skvortsov
In a non-FLR page avoidance scenario, the host driver will
provide the bad pages in the pf2vf exchange region.
Adding a new host response message to indicate when the
pf2vf exchange region has been updated.
Signed-off-by: Victor Skvortsov
Change-Id: I58d5d11d959d91ad5723
From: Victor Skvortsov
In runtime, use vram manager for virtualization page retirement.
Signed-off-by: Victor Skvortsov
Change-Id: Ia8fe6c7d4e4acae9d3a953b3ba4567e8fc6de0fa
---
drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 30
1 file changed, 20 insertions(+), 10 deletion
From: YiPeng Chai
Support passing poison consumption ras blocks
to SRIOV.
Signed-off-by: YiPeng Chai
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c| 5 ++--
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 2 +-
drivers/gpu/drm/amd/am
Signed-off-by: Zhigang Luo
Change-Id: I71524c69c7137c6db4968b95e480c910aba24703
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 21438ff61c6e..de9a2a7f5459
For SRIOV VF, FB location is programmed by host driver, no need to
program it in guest driver.
Signed-off-by: Zhigang Luo
Change-Id: I2a4838f6703e94bb0bcf3a8e923c69466e37803f
---
drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 15 +--
drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 12
Port the SRIOV VF changes that gfx_v9_4_3 missed from gfx_v9_0.
Signed-off-by: Zhigang Luo
Change-Id: Id580820376c8d653e9ec5ebf5a8b950cd0a67e1a
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 15 ++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
For SRIOV VF, no TMR needed.
Signed-off-by: Zhigang Luo
Change-Id: If9556cf60dfcbd95e102b1387cf233e902d9490e
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index
An ASIC with a big FB needs more time to clear the FB during reset.
This change extends the SRIOV VF timeout for waiting on reset completion from 5s
to 10s.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers
For SRIOV VF, the XGMI topology was not recovered after reset. This
change added code to SRIOV VF reset function to update XGMI topology
for SRIOV VF after reset.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 17 ++---
1 file changed, 14 insertions
For SRIOV VF, XGMI was not initialized during recovery. This change adds
XGMI initialization for SRIOV VF during recovery.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 12
1 file changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
notification before the
real hive reset is executed. The VF device can handle the reset request
individually in its reset work handler.
This change updates the GPU recovery sequence to skip resetting other devices in
the same hive for SRIOV VF.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/a
For sriov vf hang, vf flr will be triggered. Hive reset is not needed.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu
MMSCH 1.0 doesn't have major/minor version, only version.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/mmsch_v1_0.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mmsch_v1_0.h
b/drivers/gpu/drm/amd/amdgpu/mmsch_v1_0.h
MMSCH doesn't have major/minor version, only version.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/mmsch_v1_0.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/mmsch_v1_0.h
b/drivers/gpu/drm/amd/amdgpu/mmsch_v1_0.h
need to load xgmi ta for aldebaran sriov vf.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 47ceb783e2a5..29c365160043
host driver programmed mmhub system aperture and fb location for vf, no
need to program in guest side.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 17 +++--
1 file changed, 3 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
host driver programmed fb location registers for vf, no need to
check anymore.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 5 +
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
b/drivers/gpu/drm/amd/amdgpu
PSP added a new feature that checks the fw buffer address for SRIOV VF. The
address range must be within the VF FB.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 19 ++-
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
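The constraint stated in the commit message above (a buffer's address range must lie inside the VF's FB) amounts to an interval containment check. A minimal, overflow-safe sketch follows; `range_in_fb()` and its parameters are hypothetical and only illustrate the check, not how PSP or the driver performs it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when [addr, addr + size) lies entirely inside
 * [fb_base, fb_base + fb_size). Offsets are compared instead of end
 * addresses so that addr + size can never wrap around. */
bool range_in_fb(uint64_t addr, uint64_t size,
                 uint64_t fb_base, uint64_t fb_size)
{
    if (size == 0 || size > fb_size)
        return false;               /* empty, or larger than the whole FB */
    if (addr < fb_base)
        return false;               /* starts below the FB */
    return (addr - fb_base) <= (fb_size - size);
}

/* Small usage example with made-up addresses. */
void range_in_fb_examples(void)
{
    assert(range_in_fb(0x1000, 0x100, 0x1000, 0x1000));   /* at the start */
    assert(range_in_fb(0x1F00, 0x100, 0x1000, 0x1000));   /* ends exactly at the top */
    assert(!range_in_fb(0x1F01, 0x100, 0x1000, 0x1000));  /* one byte past the top */
}
```

Comparing `addr - fb_base` against `fb_size - size` avoids computing `addr + size`, which could overflow for addresses near the top of the 64-bit range.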
need to load xgmi ta for arcturus and aldebaran sriov vf.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index
host driver programmed the gfxhub fb location for vf, no need to
program in guest side.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 12
1 file changed, 12 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
b/drivers/gpu/drm/amd/amdgpu
1. add Aldebaran in virtualization detection list.
2. disable Aldebaran virtual display support as there is no GFX
engine in Aldebaran.
3. skip TMR loading if Aldebaran is in virtualization mode as it
shares the one the host loaded.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu
update kfd_supported_devices to enable Aldebaran virtualization support
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
It is Aldebaran VF device ID, for virtualization support.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 0369d3532bf0
To make sure the CAP feature is supported by the SOS, add SOS FW version
checking before loading the CAP FW.
Change-Id: I7aa1c09f9c117f67ede0db6cd5911d56c8568495
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 5 +
1 file changed, 5 insertions(+)
diff --git a
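The gating described in the commit message above (check the SOS FW version before loading an optional firmware) can be sketched as follows. Everything here is an assumption made for illustration: `struct fw_loader`, `load_cap_if_supported()`, and the `EXAMPLE_MIN_SOS_VERSION` threshold are hypothetical, and the real minimum version is not given in the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold; the real minimum SOS version is not stated in the source. */
#define EXAMPLE_MIN_SOS_VERSION 0x00010000u

struct fw_loader {
    uint32_t sos_fw_version;  /* version reported by the already-loaded SOS */
    bool     cap_loaded;      /* stand-in for "the optional firmware was loaded" */
};

/* Load the optional firmware only when the SOS is new enough.
 * Returns true if the load happened, false if it was skipped. */
bool load_cap_if_supported(struct fw_loader *ld)
{
    assert(ld != NULL);

    if (ld->sos_fw_version < EXAMPLE_MIN_SOS_VERSION)
        return false;         /* older SOS: skip instead of failing */

    ld->cap_loaded = true;    /* placeholder for the real load step */
    return true;
}
```

Skipping the load on an older SOS, rather than treating it as an error, keeps the rest of initialization unaffected when the feature is unavailable.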
The CAP fw is for enabling driver compatibility. Currently, it is only
enabled for vega10 VF.
Signed-off-by: Zhigang Luo
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 9 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 3 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 3 ++-
drivers/gpu