[PATCH] drm/amdgpu/sriov: Add MB_REQ_MSG_READY_TO_RESET response

2021-04-08 Thread jianzh
From: Jiange Zhao Add MB_REQ_MSG_READY_TO_RESET response when the VF gets an FLR notification. When the guest receives an FLR notification from the host, it locks the adapter into reset state. There will be no more job submission or hardware access after that. It should then send a response to the host that it has p
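
The ordering described in this entry (lock the adapter first, then acknowledge to the host) can be sketched roughly as below. This is a standalone illustration, not the driver code: `struct vf_state`, `vf_submit_job` and `vf_handle_flr_notification` are hypothetical names, and the numeric value given to MB_REQ_MSG_READY_TO_RESET is a placeholder. Only the message name comes from the patch subject.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value; the real request code is defined by the driver. */
enum mb_req { MB_REQ_MSG_READY_TO_RESET = 1 };

/* Hypothetical per-VF state for this sketch. */
struct vf_state {
    bool in_reset;   /* set once the adapter is locked for reset */
    int  last_sent;  /* last mailbox request sent, or -1 if none */
};

/* Job submission is refused once the adapter is locked. */
static bool vf_submit_job(const struct vf_state *vf)
{
    return !vf->in_reset;
}

/* On FLR notification: lock first, then tell the host we are ready. */
static void vf_handle_flr_notification(struct vf_state *vf)
{
    assert(vf != NULL);
    vf->in_reset = true;                        /* step 1: block new work */
    vf->last_sent = MB_REQ_MSG_READY_TO_RESET;  /* step 2: acknowledge    */
}
```

The point of the ordering is that the host only sees the acknowledgement after the guest has stopped touching the hardware.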

[PATCH] drm/amdgpu/SRIOV: Extend VF reset request wait period

2020-11-25 Thread jianzh
From: Jiange Zhao In the virtualization case, when one VF sends too many FLR requests, the hypervisor stops responding to this VF's requests for a long period of time. This is called event guard. During this cooling period, the guest driver should wait instead of doing other things. After th

[PATCH] drm/amdgpu/SRIOV: Extend VF reset request wait period

2020-12-07 Thread jianzh
From: Jiange Zhao In the virtualization case, when one VF sends too many FLR requests, the hypervisor stops responding to this VF's requests for a long period of time. This is called event guard. During this cooling period, the guest driver should wait instead of doing other things. After th

[PATCH] drm/amdgpu/SRIOV: SRIOV VF doesn't support BACO

2019-10-28 Thread jianzh
From: Jiange Zhao SRIOV VF doesn't support BACO. Only PF with BACO capability can do it. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/nv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c index 2

[PATCH] drm/amdgpu/SRIOV: Reorganize hw.status for SRIOV re-init

2019-10-28 Thread jianzh
From: Jiange Zhao In amdgpu_device_ip_reinit_early_sriov, after IH hw_init, only IH's hw.status is true. The other three IPs' hw.status is reset to false, even though they have already done hw_init. The new way is to do hw_init for each IP in the list, regardless of hw.status, and set hw.status on

[PATCH] drm/amdgpu/SRIOV: Only reset hw.status for target IP

2019-10-29 Thread jianzh
From: Jiange Zhao In the old way, when doing IH hw_init, the hw.status of PSP, nv_common and GMC would be reset to false, even though their hw_init had already been done. In the next step, fw_loading, PSP would do hw_init again. In the new way, hw.status is reset to false only for the target IP in the list. In t
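
A minimal sketch of the "new way" described in this entry, under stated assumptions: `struct ip_block`, `enum ip_type` and `reinit_ip` are simplified, hypothetical stand-ins rather than the amdgpu structures, `hw` stands in for hw.status, and the counter increment is a placeholder for the real hw_init call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an IP block. */
enum ip_type { IP_COMMON, IP_GMC, IP_PSP, IP_IH };

struct ip_block {
    enum ip_type type;
    bool hw;         /* stands in for hw.status        */
    int  init_count; /* how many times "hw_init" ran   */
};

/* Re-init only the requested IP type; leave other blocks' status alone. */
static void reinit_ip(struct ip_block *blocks, size_t n, enum ip_type target)
{
    assert(blocks != NULL);
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].type != target)
            continue;
        blocks[i].hw = false;    /* reset status for the target only */
        blocks[i].init_count++;  /* placeholder for the real hw_init */
        blocks[i].hw = true;
    }
}
```

Because non-target blocks keep their status, a later pass that skips already-initialized blocks would not run their hw_init a second time.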

[PATCH] drm/amdgpu/sriov: Use VF-accessible register for gpu_clock_count

2020-03-03 Thread jianzh
Following Alexander's advice, switch to mmGOLDEN_TSC_COUNT_LOWER/UPPER for both bare metal and SRIOV. Signed-off-by: jianzh --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/a

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset

2020-04-23 Thread jianzh
From: Jiange Zhao When a GPU times out, amdgpu notifies an interested party of an opportunity to dump info before the actual GPU reset. A usermode app opens the 'autodump' node under debugfs and poll()s for readable/writable. When a GPU reset is due, amdgpu notifies the usermode app through wa
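
The usermode side described in this entry might look roughly like the sketch below. It assumes the node signals by becoming readable (POLLIN); the preview only says "readable/writable", so that choice is an assumption. `wait_readable` and `watch_node` are hypothetical helper names, and the node's path is left to the caller because the preview does not give it.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until fd becomes readable or timeout_ms elapses
 * (a negative timeout blocks indefinitely).
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    assert(fd >= 0);
    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/*
 * Open the node at path, wait for it to signal, then close it.
 * Returns wait_readable()'s result, or -1 if the open fails.
 */
static int watch_node(const char *path, int timeout_ms)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = wait_readable(fd, timeout_ms);
    /* A real tool would collect its dump here, before releasing the node. */
    close(fd);
    return rc;
}
```

A dump tool would call `watch_node()` with the debugfs path of the node and a negative timeout, and collect its data once the call returns 1.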

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset (v2)

2020-04-24 Thread jianzh
From: Jiange Zhao When a GPU times out, amdgpu notifies an interested party of an opportunity to dump info before the actual GPU reset. A usermode app opens the 'autodump' node under debugfs and poll()s for readable/writable. When a GPU reset is due, amdgpu notifies the usermode app through wa

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-04-26 Thread jianzh
From: Jiange Zhao When a GPU times out, amdgpu notifies an interested party of an opportunity to dump info before the actual GPU reset. A usermode app opens the 'autodump' node under debugfs and poll()s for readable/writable. When a GPU reset is due, amdgpu notifies the usermode app through wa

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-04-28 Thread jianzh
From: Jiange Zhao When a GPU times out, amdgpu notifies an interested party of an opportunity to dump info before the actual GPU reset. A usermode app opens the 'autodump' node under debugfs and poll()s for readable/writable. When a GPU reset is due, amdgpu notifies the usermode app through wa

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-09 Thread jianzh
From: Jiange Zhao When a GPU times out, amdgpu notifies an interested party of an opportunity to dump info before the actual GPU reset. A usermode app opens the 'autodump' node under debugfs and poll()s for readable/writable. When a GPU reset is due, amdgpu notifies the usermode app through wa

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-13 Thread jianzh
From: Jiange Zhao When a GPU times out, amdgpu notifies an interested party of an opportunity to dump info before the actual GPU reset. A usermode app opens the 'autodump' node under debugfs and poll()s for readable/writable. When a GPU reset is due, amdgpu notifies the usermode app through wa

[PATCH] drm/amdgpu: Add autodump debugfs node for gpu reset v4

2020-05-14 Thread jianzh
From: Jiange Zhao When a GPU times out, amdgpu notifies an interested party of an opportunity to dump info before the actual GPU reset. A usermode app opens the 'autodump' node under debugfs and poll()s for readable/writable. When a GPU reset is due, amdgpu notifies the usermode app through wa

[PATCH] drm/amdgpu/sriov: Use VF-accessible register for gpu_clock_count

2020-02-27 Thread jianzh
Navi12 VK CTS subtest timestamp.calibrated.dev_domain_test fails because the mmRLC_CAPTURE_GPU_CLOCK_COUNT register cannot be written from a VF due to security policy. Solution: use the VF-accessible timestamp register pair mmGOLDEN_TSC_COUNT_LOWER/UPPER in the SRIOV case. Signed-off-by: jianzh
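
Reading a 64-bit counter from a LOWER/UPPER pair of 32-bit registers, as this entry proposes, can be illustrated with the sketch below. It is not taken from the patch: the upper/lower/upper re-read loop is a general technique for avoiding a torn value when the lower half wraps between reads, and whether the driver does this is not stated in the preview. `read_reg_fn`, `REG_TSC_LOWER`, `REG_TSC_UPPER`, `read_tsc64` and the fake register backing are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register identifiers and read callback for this sketch. */
enum { REG_TSC_LOWER, REG_TSC_UPPER };
typedef uint32_t (*read_reg_fn)(void *ctx, int reg);

/* Combine the two halves, retrying if the upper half changed mid-read. */
static uint64_t read_tsc64(read_reg_fn rd, void *ctx)
{
    uint32_t hi, lo, hi2;

    assert(rd != NULL);
    do {
        hi  = rd(ctx, REG_TSC_UPPER);
        lo  = rd(ctx, REG_TSC_LOWER);
        hi2 = rd(ctx, REG_TSC_UPPER);
    } while (hi != hi2);  /* lower half wrapped between reads: try again */

    return ((uint64_t)hi << 32) | lo;
}

/* Fake register pair backed by a plain 64-bit value, for demonstration. */
struct fake_tsc { uint64_t value; };

static uint32_t fake_read(void *ctx, int reg)
{
    const struct fake_tsc *t = ctx;

    return reg == REG_TSC_UPPER ? (uint32_t)(t->value >> 32)
                                : (uint32_t)t->value;
}
```

With the fake backing, `read_tsc64(fake_read, &t)` returns `t.value` unchanged. On real hardware the callback would wrap the driver's register accessor.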

[PATCH] drm/amdgpu: Add SRIOV mailbox backend for Navi1x

2019-09-09 Thread jianzh
From: Jiange Zhao Mimicking the ones for Vega10, add a mailbox backend for Navi1x. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 380 ++ drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h | 41 +++ 3 files changed, 4

[PATCH] drm/amdgpu: For Navi12 SRIOV VF, register mailbox functions

2019-09-11 Thread jianzh
From: Jiange Zhao Mailbox functions and interrupts are only for Navi12 VF. Register functions and irqs during initialization. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/nv.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c

[PATCH] drm/amdgpu: Navi10/12 VF doesn't support SMU

2019-09-11 Thread jianzh
From: Jiange Zhao In SRIOV case, SMU and powerplay are handled in HV. VF shouldn't have control over SMU and powerplay. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/nv.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/

[PATCH] drm/amdgpu: Navi12 SRIOV VF doesn't load TOC

2019-09-11 Thread jianzh
From: Jiange Zhao In SRIOV case, the autoload sequence is the same as bare metal, except VF won't load TOC. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c

[PATCH] drm/amdgpu/SRIOV: Navi12 SRIOV VF gets GTT base

2019-09-16 Thread jianzh
From: Jiange Zhao With changes in PSP and HV, SRIOV VF will handle vram gtt location just like bare metal. There is no need to differentiate it anymore. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a

[PATCH] drm/amdgpu/SRIOV: add navi12 pci id for SRIOV

2019-09-17 Thread jianzh
From: Jiange Zhao Add Navi12 PCI id support. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 420888e941df..b52c7255e5e4 100644 ---

[PATCH] drm/amdgpu/SRIOV: add navi12 pci id for SRIOV

2019-09-18 Thread jianzh
From: Jiange Zhao Add Navi12 PCI id support. Signed-off-by: Jiange Zhao --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 420888e941df..b52c7255e5e4 100644 ---