Here is the call trace: Call Trace: dump_stack+0x4d/0x63 amdgpu_ib_schedule+0x86/0x4b7 ? __mod_timer+0x21e/0x244 amdgpu_job_run+0x108/0x178 drm_sched_main+0x253/0x2fa ? remove_wait_queue+0x51/0x51 ? drm_sched_cleanup_jobs.part.12+0xda/0xda kthread+0x14f/0x157 ? kthread_park+0x86/0x86 ret_from_fork+0x22/0x40 amdgpu 0000:03:00.0: couldn't schedule ib on ring <sdma0> [drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
printed via below change: @@ -151,6 +152,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs, } if (!ring->sched.ready) { + dump_stack(); dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", ring->name); return -EINVAL; On 10/24/2019 10:00 PM, Christian König wrote: Am 24.10.19 um 17:06 schrieb Grodzovsky, Andrey: On 10/24/19 7:01 AM, Christian König wrote: Am 24.10.19 um 12:58 schrieb S, Shirish: [Why] Upon GPU reset, kernel cleans up already submitted jobs via drm_sched_cleanup_jobs. This schedules ib's via drm_sched_main()->run_job, leading to race condition of rings being ready or not, since during reset rings may be suspended. NAK, exactly that's what should not happen. The scheduler should be suspend while a GPU reset is in progress. So you are running into a completely different race here. Below is the series of events when the issue occurs. (Note that as you & Andrey mentioned the scheduler has been suspended but the job is scheduled nonetheless.) amdgpu 0000:03:00.0: GPU reset begin! ... amdgpu_device_gpu_recover stopping ring sdma0 via drm_sched_stop ... amdgpu 0000:03:00.0: GPU reset succeeded, trying to resume amdgpu_do_asic_reset starting to resume blocks ... amdgpu 0000:03:00.0: couldn't schedule ib on ring <sdma0> [drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22) ... amdgpu_device_ip_resume_phase2 resumed gfx_v9_0 amdgpu_device_ip_resume_phase2 resumed sdma_v4_0 amdgpu_device_ip_resume_phase2 resumed powerplay ... FWIW, since the job is always NULL when "drm_sched_stop(&ring->sched, job ? &job->base : NULL);" when called during reset, all drm_sched_stop() does is cancel delayed work and park the sched->thread. There is no job list to be iterated to de-activate or remove or update fences. Based on all this analysis, adding a mutex is more failsafe and less intrusive in the current code flow and lastly seems to be logical as well, hence I devised this approach Please sync up with Andrey how this was able to happen. Regards, Christian. Shirish - Christian makes a good point - note that in amdgpu_device_gpu_recover drm_sched_stop which stop all the scheduler threads is called way before we suspend the HW in amdgpu_device_pre_asic_reset->amdgpu_device_ip_suspend where SDMA suspension is happening and where the HW ring marked as not ready - please provide call stack for when you hit [drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22) to identify the code path which tried to submit the SDMA IB Well the most likely cause of this is that the hardware failed to resume after the reset. Infact hardware resume has not yet started, when the job is scheduled, which is the race am trying to address with this patch. Regards, Shirish S Christian. Andrey [How] make GPU reset's amdgpu_device_ip_resume_phase2() & amdgpu_ib_schedule() in amdgpu_job_run() mutually exclusive. Signed-off-by: Shirish S <shiris...@amd.com><mailto:shiris...@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 ++ 3 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index f4d9041..7b07a47b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -973,6 +973,7 @@ struct amdgpu_device { bool in_gpu_reset; enum pp_mp1_state mp1_state; struct mutex lock_reset; + struct mutex lock_ib_sched; struct amdgpu_doorbell_index doorbell_index; int asic_reset_res; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 676cad1..63cad74 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2759,6 +2759,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, mutex_init(&adev->virt.vf_errors.lock); hash_init(adev->mn_hash); mutex_init(&adev->lock_reset); + mutex_init(&adev->lock_ib_sched); mutex_init(&adev->virt.dpm_mutex); mutex_init(&adev->psp.mutex); @@ -3795,7 +3796,9 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info *hive, if (r) return r; + mutex_lock(&tmp_adev->lock_ib_sched); r = amdgpu_device_ip_resume_phase2(tmp_adev); + mutex_unlock(&tmp_adev->lock_ib_sched); if (r) goto out; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index e1bad99..cd6082d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -233,8 +233,10 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) if (finished->error < 0) { DRM_INFO("Skip scheduling IBs!\n"); } else { + mutex_lock(&ring->adev->lock_ib_sched); r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job, &fence); + mutex_unlock(&ring->adev->lock_ib_sched); if (r) DRM_ERROR("Error scheduling IBs (%d)\n", r); } _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx