On 01.06.2018 at 11:29, Huang Rui wrote:
On Fri, Jun 01, 2018 at 05:13:49PM +0800, Christian König wrote:
On 01.06.2018 at 08:41, Huang Rui wrote:
The execution of the gfx/compute IB tests is now deferred. However, by the time
they run, gfx has already gone into a "mid state" of gfxoff.
PWR_MISC_CNTL_STATUS: PWR_GFXOFF_STATUS field (bits 2:1)
0 = GFXOFF.
1 = Transition out of GFXOFF state.
2 = Not in GFXOFF.
3 = Transition into GFXOFF.
If we hit a mid state (1 or 3), the doorbell write interrupt cannot wake gfx
back up successfully. The field value is 1 when we issue the IB test at that
point, so we get the hang. This is the root cause of the issue we encountered.
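To make the field encoding above concrete, here is a minimal C sketch that decodes PWR_GFXOFF_STATUS from a raw PWR_MISC_CNTL_STATUS value and flags the two transition states. The bit position (2:1) and the four values come from the description above; the macro, enum, and function names are hypothetical and are not taken from the amdgpu driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout from the description above: bits 2:1 of PWR_MISC_CNTL_STATUS.
 * The macro names are illustrative assumptions, not driver definitions. */
#define PWR_GFXOFF_STATUS_SHIFT 1u
#define PWR_GFXOFF_STATUS_MASK  (0x3u << PWR_GFXOFF_STATUS_SHIFT)

/* Hypothetical enum names; the numeric values are the ones listed above. */
enum gfxoff_status {
    GFXOFF_STATUS_OFF      = 0, /* GFXOFF */
    GFXOFF_STATUS_LEAVING  = 1, /* transition out of GFXOFF */
    GFXOFF_STATUS_ON       = 2, /* not in GFXOFF */
    GFXOFF_STATUS_ENTERING = 3, /* transition into GFXOFF */
};

/* Extract the 2-bit status field from a raw register value. */
static enum gfxoff_status gfxoff_status_decode(uint32_t reg)
{
    return (enum gfxoff_status)((reg & PWR_GFXOFF_STATUS_MASK) >>
                                PWR_GFXOFF_STATUS_SHIFT);
}

/* True for the "mid states" (1 or 3) in which a doorbell write is
 * reported above to fail to wake gfx. */
static bool gfxoff_status_is_transition(enum gfxoff_status s)
{
    return s == GFXOFF_STATUS_LEAVING || s == GFXOFF_STATUS_ENTERING;
}
```

For example, a raw register value of 0x2 has only bit 1 set, so it decodes to status 1 (transition out of GFXOFF), which is the value observed when the IB test was issued.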
Meanwhile, we cannot set GFX clockgating once gfx is already in the "off" state.
So we should move gfx powergating and gfxoff enabling to the end of
initialization, after the IB test and clockgating.
Mhm, that still looks like only a half-baked solution:
1. What prevents this bug from happening during "normal" IB submission
from userspace?
2. Shouldn't we poll the PWR_MISC_CNTL_STATUS register to make sure we
are not in any transition phase instead?
Yes, right. How about also adding polling of PWR_MISC_CNTL_STATUS in
amdgpu_ring_commit(), after set_wptr, to confirm that the status is "0" or "2"?
You could add an end_use() callback for that, but I think we rather need
to do this in gfx_v9_0_ring_set_wptr_gfx() before we write the doorbell.
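A rough, standalone C sketch of the idea discussed here: poll the status field until it reads 0 or 2, and only then write the doorbell. This is not driver code. The callback types, the fake_gpu structure, the bounded poll count, and the helper names are all assumptions made so the example can run outside the kernel; a real implementation inside gfx_v9_0_ring_set_wptr_gfx() would use the driver's own register accessors and a proper delay and timeout policy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 2:1 of PWR_MISC_CNTL_STATUS, as described earlier in the thread. */
#define PWR_GFXOFF_STATUS_SHIFT 1u
#define PWR_GFXOFF_STATUS_MASK  (0x3u << PWR_GFXOFF_STATUS_SHIFT)

/* Hypothetical callbacks standing in for the MMIO read and doorbell write. */
typedef uint32_t (*read_status_fn)(void *ctx);
typedef void (*ring_doorbell_fn)(void *ctx, uint64_t wptr);

/* Poll until the field is 0 (GFXOFF) or 2 (not in GFXOFF).
 * Returns false if it is still in a transition state after max_polls reads. */
static bool gfxoff_wait_stable(read_status_fn read_status, void *ctx,
                               unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        uint32_t status = (read_status(ctx) & PWR_GFXOFF_STATUS_MASK) >>
                          PWR_GFXOFF_STATUS_SHIFT;
        if (status == 0 || status == 2)
            return true;
        /* A real driver would insert a short delay between reads here. */
    }
    return false;
}

/* Write the doorbell only once the status is stable. */
static bool set_wptr_checked(read_status_fn read_status,
                             ring_doorbell_fn ring_doorbell, void *ctx,
                             uint64_t wptr, unsigned int max_polls)
{
    if (!gfxoff_wait_stable(read_status, ctx, max_polls))
        return false;
    ring_doorbell(ctx, wptr);
    return true;
}

/* Simulated device for illustration: returns scripted raw register values,
 * repeating the last one once the script is exhausted. */
struct fake_gpu {
    const uint32_t *status_script;
    size_t script_len;
    size_t reads;
    bool doorbell_written;
    uint64_t last_wptr;
};

static uint32_t fake_read_status(void *ctx)
{
    struct fake_gpu *gpu = ctx;
    size_t idx = gpu->reads < gpu->script_len ? gpu->reads
                                              : gpu->script_len - 1;
    gpu->reads++;
    return gpu->status_script[idx];
}

static void fake_ring_doorbell(void *ctx, uint64_t wptr)
{
    struct fake_gpu *gpu = ctx;
    gpu->doorbell_written = true;
    gpu->last_wptr = wptr;
}
```

With a script of raw values 0x2, 0x2, 0x4 (status 1, 1, 2), set_wptr_checked() reads the register three times and then writes the doorbell. With a device stuck at 0x6 (status 3), it gives up after max_polls reads and never writes the doorbell. What the caller should do on such a timeout is left open here.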
Christian.
Thanks,
Ray
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx