On 08.03.21 at 05:06, Liu, Monk wrote:
[AMD Official Use Only - Internal Distribution Only]
Well, first of all, please completely drop the affinity group stuff from this
patch. We should concentrate on one feature at a time.
We need it to expedite the process, we can introduce this change
But the hive->tb object is currently used regardless inside
amdgpu_device_xgmi_reset_func, which means that even when you
explicitly schedule xgmi_reset_work as you do now, the code will try to
sync using a tb object that is not properly initialized. Maybe you can define a
global static tb object, fill it in
Hi, Christian and Monk,
Thanks so much for these valuable ideas. I will follow the advice.
>> Then the implementation is way too complicated. All you need to do is insert a
>> dma_fence_wait after re-scheduling each job after a reset.
Sure, if I re-implement amd_sched_resubmit_jobs2(), and add a
>> Well, first of all, please completely drop the affinity group stuff from this
>> patch. We should concentrate on one feature at a time.
We need it to expedite the process; we can introduce this change in another
patch.
>> Then the implement
From: Alex Deucher
[ Upstream commit 25951362db7b3791488ec45bf56c0043f107b94b ]
It works fine and was only disabled because primary GPUs
don't enter runpm if there is a console bound to the fbdev due
to the kmap. This will at least allow runpm on secondary cards.
Reviewed-by: Evan Quan
Review