On Thu, Apr 10, 2025 at 4:04 PM Christian König <christian.koe...@amd.com> wrote:
>
> On 10.04.25 at 15:56, Qiang Yu wrote:
> >>>> This prevents applications with multiple contexts from running into a
> >>>> race condition between running tasks and context destroy when
> >>>> terminating.
> >>>>
> > If implementing flush callback fix the problem, it must not be when SIGKILL.
>
> SIGKILL also calls flush (again eventually), but we can detect that in the
> scheduler code.
>
> See the code and comment in drm_sched_entity_flush(). We could potentially
> improve the comment cause it's not 100% correct, but it should work under all
> cases.
>
> > Could you describe the problem and how this fix solves it? As I don't
> > understand
> > how the above difference could fix a race bug.
>
> That was the point I wasn't sure about either. It should *not* fix any race,
> it's just gives a nicer SIGKILL experience.
Thanks for this feedback. As mentioned in the initial letter, I'm also trying
to understand whether this is the correct fix. The problem is unfortunately
not trivial to reproduce: so far only one application can trigger it, and it
is a complicated one with multiple contexts and relatively lengthy jobs.

What I have observed so far is that, as things stand, a context can be
destroyed while a job is still running (e.g. by killing the application at
the right time), and the context destroy appears to release buffers that are
still in use by the running job, which is what causes the read faults.

With the added flush, the jobs always finished before the context destroy was
called, which prevented the issue for this application in my attempts. But I
suppose it might also just be that the jobs had some more time to finish.

If this is not the correct fix, then perhaps there is still some
synchronization missing between a running job and context destroy in lima?

Thanks
Erico