On Thursday, December 18, 2025 9:41:41 Central Standard Time Alex Deucher wrote:
> On Thu, Dec 18, 2025 at 12:21 AM Timur Kristóf <[email protected]> wrote:
> > On Monday, December 15, 2025 10:07:09 Central Standard Time Alex Deucher wrote:
> > > Only set an error on the fence if the fence is not
> > > signalled. We can end up with a warning if the
> > > per queue reset path signals the fence and sets an error
> > > as part of the reset, but fails to recover.
> >
> > Can you please elaborate why this is necessary?
> > I don't entirely see the point of this patch. Why don't we want to set an
> > error on the fence when it was signalled by the per queue reset? I would
> > have thought that the next patch does that, and also fixes the warning
> > mentioned in the commit message here.
>
> If you call dma_fence_set_error() on a fence that has already signaled
> it triggers a warning. What could happen is that the queue reset sets
> the error on the fence and then signals the fence as part of the reset
> sequence. However if the queue reset ultimately fails, the fence is
> already signaled and then we try and set an error again here as we
> fall back to adapter reset, triggering the warning.
>
> Alex

I would have thought that the next patch in the series would take care of this
problem by itself. Thanks for the explanation.

The patch is:
Reviewed-by: Timur Kristóf <[email protected]>

> > > Signed-off-by: Alex Deucher <[email protected]>
> > > ---
> > >
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > index 67fde99724bad..7f5d01164897f 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > > @@ -147,7 +147,8 @@ static enum drm_gpu_sched_stat
> > > amdgpu_job_timedout(struct drm_sched_job *s_job)
> > >                  dev_err(adev->dev, "Ring %s reset failed\n", ring->sched.name);
> > >          }
> > >
> > > -        dma_fence_set_error(&s_job->s_fence->finished, -ETIME);
> > > +        if (dma_fence_get_status(&s_job->s_fence->finished) == 0)
> > > +                dma_fence_set_error(&s_job->s_fence->finished, -ETIME);
> > >
> > >          amdgpu_vm_put_task_info(ti);
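
For reference, the sequence Alex describes can be modelled in plain userspace C. This is only a rough sketch, not kernel code: struct model_fence and every model_* function are hypothetical stand-ins for struct dma_fence, dma_fence_set_error(), dma_fence_signal() and dma_fence_get_status(), and the error value the queue reset sets on the fence is assumed to be -ETIME purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct dma_fence (not the kernel type). */
struct model_fence {
    bool signaled;  /* models the "signaled" flag of the fence */
    int error;      /* 0 or a negative errno */
    int warnings;   /* how many times the WARN_ON-like check fired */
};

/* Models dma_fence_set_error(): warns if the fence has already signaled. */
static void model_fence_set_error(struct model_fence *f, int error)
{
    assert(error < 0);      /* only negative errno values make sense here */
    if (f->signaled)
        f->warnings++;      /* stands in for the kernel warning */
    f->error = error;
}

/* Models signaling the fence. */
static void model_fence_signal(struct model_fence *f)
{
    f->signaled = true;
}

/*
 * Models dma_fence_get_status(): 0 while unsignaled, a negative error if
 * signaled with an error, 1 if signaled without one.
 */
static int model_fence_get_status(const struct model_fence *f)
{
    if (!f->signaled)
        return 0;
    return f->error ? f->error : 1;
}

/*
 * Models the timeout handler after a failed per-queue reset.
 * Returns the number of warnings that were triggered.
 */
static int model_timeout(bool guarded)
{
    struct model_fence f = {0};

    /* Per-queue reset: sets an error (assumed -ETIME), then signals. */
    model_fence_set_error(&f, -ETIME);
    model_fence_signal(&f);

    /* The reset failed, so the handler falls back to adapter reset. */
    if (!guarded || model_fence_get_status(&f) == 0)
        model_fence_set_error(&f, -ETIME);

    return f.warnings;
}
```

In this model, model_timeout(false) corresponds to the handler before the patch and counts one warning, while model_timeout(true) corresponds to the guarded version and counts none, because the status check sees the already signaled fence and skips the second set_error call.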
