amdgpu_ras_reserve_bad_pages is only used by the UMC block, so another approach would be to move the call into amdgpu_umc_process_ras_data_cb. Either way is OK, and the patch is:
Reviewed-by: Tao Zhou <tao.zh...@amd.com>

> -----Original Message-----
> From: Andrey Grodzovsky <andrey.grodzov...@amd.com>
> Sent: September 11, 2019 3:41
> To: amd-gfx@lists.freedesktop.org
> Cc: Chen, Guchun <guchun.c...@amd.com>; Zhou1, Tao
> <tao.zh...@amd.com>; Deucher, Alexander
> <alexander.deuc...@amd.com>; Grodzovsky, Andrey
> <andrey.grodzov...@amd.com>
> Subject: [PATCH] drm/amdgpu: Fix mutex lock from atomic context.
>
> Problem:
> amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
> because writing to EEPROM during ASIC reset was unstable.
> But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
> directly from ISR context and so locking is not allowed. Also it's
> irrelevant for this particular interrupt as this is generic RAS
> interrupt and not memory errors specific.
>
> Fix:
> Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzov...@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> index 012034d..dd5da3c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
> @@ -504,7 +504,9 @@ static inline int amdgpu_ras_reset_gpu(struct amdgpu_device *adev,
>  	/* save bad page to eeprom before gpu reset,
>  	 * i2c may be unstable in gpu reset
>  	 */
> -	amdgpu_ras_reserve_bad_pages(adev);
> +	if (in_task())
> +		amdgpu_ras_reserve_bad_pages(adev);
> +
>  	if (atomic_cmpxchg(&ras->in_recovery, 0, 1) == 0)
>  		schedule_work(&ras->recovery_work);
>  	return 0;
> --
> 2.7.4
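
For anyone reading along, the guard in the patch can be illustrated with a small userspace sketch. This is not driver code: every name prefixed with sim_ is hypothetical, sim_in_task is only a stand-in for the kernel's in_task(), and the assert plays the role of the "must not sleep in atomic context" rule that the real mutex would enforce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for in_task(): true in process context,
 * false while we pretend to be inside an interrupt handler. */
static bool sim_in_task = true;

static int sim_reserve_calls;  /* times the sleeping path ran */
static int sim_reset_requests; /* times a reset was requested */

/* Stand-in for amdgpu_ras_reserve_bad_pages(). The real function
 * takes a mutex, so it must only run in task context. */
static void sim_reserve_bad_pages(void)
{
	assert(sim_in_task); /* would be a sleep-in-atomic bug otherwise */
	sim_reserve_calls++;
}

/* Stand-in for amdgpu_ras_reset_gpu() with the guard from the patch. */
static int sim_ras_reset_gpu(void)
{
	if (sim_in_task)
		sim_reserve_bad_pages();

	sim_reset_requests++;
	return 0;
}

/* Simulate the ERREVENT_ATHUB_INTERRUPT path: the reset helper is
 * called directly from (simulated) ISR context. */
static int sim_call_from_isr(void)
{
	bool saved = sim_in_task;
	int ret;

	sim_in_task = false;
	ret = sim_ras_reset_gpu();
	sim_in_task = saved;
	return ret;
}
```

Calling sim_ras_reset_gpu() directly runs the reserve step, while sim_call_from_isr() skips it and still requests the reset, which is the behavior the patch is after. Without the guard, the assert in sim_reserve_bad_pages() would fire on the ISR path.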