Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Andrey Grodzovsky
e- From: Grodzovsky, Andrey Sent: Monday, March 8, 2021 1:28 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe But the hive->tb object is used regardless, inside amdgpu_device_xgmi_reset_func currently, it m

RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Liu, Shaoyun
; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe I see, thanks for explaining. Andrey On 2021-03-08 10:27 a.m., Liu, Shaoyun wrote: > [AMD Official Use Only - Internal Distribution Only] > > Check the

Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Andrey Grodzovsky
021 1:28 AM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe But the hive->tb object is used regardless, inside amdgpu_device_xgmi_reset_func currently, it means that even when you explicitly schedule xgmi

RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-08 Thread Liu, Shaoyun
ell when all GPUs are removed. > > Thanks > shaopyunliu > > -Original Message- > From: amd-gfx On Behalf Of > Liu, Shaoyun > Sent: Saturday, March 6, 2021 3:41 PM > To: Grodzovsky, Andrey ; > amd-gfx@lists.freedesktop.org > Subject: RE: [PATCH 5/5]

Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-07 Thread Andrey Grodzovsky
f Liu, Shaoyun Sent: Saturday, March 6, 2021 3:41 PM To: Grodzovsky, Andrey ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe [AMD Official Use Only - Internal Distribution Only] I call the amdgpu_do_asic_reset with the parameter s

[PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-06 Thread shaoyunl
In passthrough configuration, the hypervisor will trigger an SBR (secondary bus reset) on the devices without synchronizing them with each other. This could cause a device hang, since in an XGMI configuration all the devices within the hive need to be reset within a limited time slot. This series of patches tries to solve thi

RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-06 Thread Liu, Shaoyun
essage- From: amd-gfx On Behalf Of Liu, Shaoyun Sent: Saturday, March 6, 2021 3:41 PM To: Grodzovsky, Andrey ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe [AMD Official Use Only - Internal Distribution O

RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-06 Thread Liu, Shaoyun
aoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe Thanks for explaining this, one thing I still don't understand is why you schedule the reset work explicitly at the beginning of amdgpu_drv_delayed_reset_work_handle

Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-05 Thread Andrey Grodzovsky
org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe On 2021-03-05 12:52 p.m., shaoyunl wrote: In passthrough configuration, the hypervisor will trigger an SBR (secondary bus reset) on the devices without synchronizing them with each other. This could cause a device hang

RE: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-05 Thread Liu, Shaoyun
. let me verify it. Regards Shaoyun.liu -Original Message- From: Grodzovsky, Andrey Sent: Friday, March 5, 2021 2:27 PM To: Liu, Shaoyun ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe On 2021-03-05 12:52 p.m.

Re: [PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-05 Thread Andrey Grodzovsky
On 2021-03-05 12:52 p.m., shaoyunl wrote: In passthrough configuration, the hypervisor will trigger an SBR (secondary bus reset) on the devices without synchronizing them with each other. This could cause a device hang, since in an XGMI configuration all the devices within the hive need to be reset within a limited time

[PATCH 5/5] drm/amdgpu: Reset the devices in the XGMI hive duirng probe

2021-03-05 Thread shaoyunl
In passthrough configuration, the hypervisor will trigger an SBR (secondary bus reset) on the devices without synchronizing them with each other. This could cause a device hang, since in an XGMI configuration all the devices within the hive need to be reset within a limited time slot. This series of patches tries to solve thi