On 2016-08-03 21:43, Christian König wrote:
Well, that is a clear NAK to this whole approach.
Submitting the recovery jobs to the scheduler is reentrant, because the
scheduler is the one who originally signaled the timeout to us.
We have reset all the recovery jobs, right? Couldn't we treat those jobs
the same as the others?
Why not submit the recovery jobs to the hardware ring directly?
Yeah, that is also what I did at the beginning.
The main reasons are:
0. A recovery job needs to wait at least for its own page table recovery
to complete.
1. Direct submission uses run_job, which is used by the scheduler as
well, so it could introduce conflicts.
2. If all VM clients use one SDMA engine, restoring is slow. If each VM
can use its own PTE ring, then all SDMA engines are used for them.
3. If just one entity recovers all VM page tables, their recovery jobs
will have potential dependencies, the later ones waiting for the earlier
ones. If each VM has its own entity, there is no dependency between
them.
4. If the recovery entity were based on the kernel run queue, the
recovery jobs could be executed at the same time as PT jobs.
That is why I introduced the recovery entity and the recovery run queue;
a rough sketch of the idea follows below.
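As a minimal sketch of points 2-4, assuming a new recover_rq in struct
amd_gpu_scheduler and a per-VM recover_entity (both names are my
assumptions, not the actual patches), the per-VM setup could look
roughly like this with the amd_sched_entity_init() of this era:

/* Hedged sketch only -- recover_rq, recover_entity and the helper name
 * are assumptions based on the patch titles. */
struct amd_gpu_scheduler {
	/* ... existing members ... */
	struct amd_sched_rq	sched_rq;
	struct amd_sched_rq	kernel_rq;
	struct amd_sched_rq	recover_rq;	/* assumed new run queue */
};

/* Give every VM its own entity on its own PTE ring, so recovery jobs of
 * different VMs never serialize behind each other (point 3) and all
 * SDMA engines get used (point 2). The dedicated run queue keeps them
 * apart from the kernel run queue that serves PT jobs (point 4). */
static int amdgpu_vm_init_recover_entity(struct amdgpu_vm *vm,
					 struct amdgpu_ring *ring)
{
	/* queue depth of 16 is an arbitrary value for this sketch */
	return amd_sched_entity_init(&ring->sched, &vm->recover_entity,
				     &ring->sched.recover_rq, 16);
}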
Regards,
David Zhou
Regards,
Christian.
On 2016-07-28 12:13, Chunming Zhou wrote:
Every VM has its own recovery entity, which is used to recover the page
table from its shadow.
A VM doesn't need to wait for the VMs ahead of it to complete, and using
all the PTE rings also speeds up recovery.
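Purely as an illustration of the shadow copy on a per-VM entity (the
extra entity parameter is suggested by the patch title "drm/amdgpu:
specify entity to amdgpu_copy_buffer", but this exact signature and the
vm->recover_entity member are my assumptions):

/* Hedged sketch, not the actual patch: restore one page table BO from
 * its shadow on the VM's own entity, so different VMs recover in
 * parallel across all SDMA engines. */
static int amdgpu_vm_recover_pt_from_shadow(struct amdgpu_ring *ring,
					    struct amdgpu_vm *vm,
					    struct amdgpu_bo *shadow,
					    struct amdgpu_bo *pt,
					    struct fence **fence)
{
	/* later recovery jobs for this VM can wait on *fence before
	 * they touch the restored page tables */
	return amdgpu_copy_buffer(ring, &vm->recover_entity,
				  amdgpu_bo_gpu_offset(shadow),
				  amdgpu_bo_gpu_offset(pt),
				  amdgpu_bo_size(pt),
				  pt->tbo.resv, fence);
}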
Every scheduler has its own recovery entity, which is used to save the
hardware jobs and resubmit them; this solves the conflict between the
reset thread and the scheduler thread when running a job.
The series also contains some fixes found while doing this improvement.
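A hedged sketch of that resubmission idea (ring_mirror_list, node and
s_entity follow the scheduler code of this era, recover_entity is an
assumed new member; none of this is the actual patch):

/* Requeue the saved hardware jobs through the scheduler's own recovery
 * entity instead of calling run_job() from the reset thread, so only
 * the scheduler thread ever touches the hardware ring. */
static void amd_sched_recover_hw_jobs(struct amd_gpu_scheduler *sched)
{
	struct amd_sched_job *s_job, *next;
	LIST_HEAD(resubmit);

	/* detach the saved hardware jobs under the lock ... */
	spin_lock(&sched->job_list_lock);
	list_splice_init(&sched->ring_mirror_list, &resubmit);
	spin_unlock(&sched->job_list_lock);

	/* ... then push them back through the recovery entity; the
	 * scheduler thread calls run_job() for them as usual */
	list_for_each_entry_safe(s_job, next, &resubmit, node) {
		list_del_init(&s_job->node);
		s_job->s_entity = &sched->recover_entity;	/* assumed */
		amd_sched_entity_push_job(s_job);
	}
}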
Chunming Zhou (11):
drm/amdgpu: hw ring should be empty when gpu reset
drm/amdgpu: specify entity to amdgpu_copy_buffer
drm/amd: add recover run queue for scheduler
drm/amdgpu: fix vm init error path
drm/amdgpu: add vm recover entity
drm/amdgpu: use all pte rings to recover page table
drm/amd: add recover entity for every scheduler
drm/amd: use scheduler to recover hw jobs
drm/amd: hw job list should be exact
drm/amd: reset jobs to recover entity
drm/amdgpu: no need fence wait every time
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 5 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c | 3 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 35 +++++--
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 11 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_test.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 26 ++++--
drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 129 +++++++++++++-------------
drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 4 +-
9 files changed, 134 insertions(+), 92 deletions(-)