On Thu, Sep 12, 2013 at 05:11:25PM +0200, Maarten Lankhorst wrote: > Op 12-09-13 17:06, Peter Zijlstra schreef: > > Hi Dave, > > > > So I'm poking around the preemption code and stumbled upon: > > > > drivers/gpu/drm/i915/i915_gem.c: set_need_resched(); > > drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched(); > > drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched(); > > drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); > > > > All these sites basically do: > > > > while (!trylock()) > > yield(); > > > > which is a horrible and broken locking pattern. > > > > Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 > > task that preempted the lock holder at FIFOn. > > > > Secondly the implementation is worse than usual by abusing > > VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault > > doesn't retry, but you're using it as a get out of fault path. And > > you're using set_need_resched() which is not something a driver should > > _ever_ touch. > > > > Now I'm going to take away set_need_resched() -- and while you can > > 'reimplement' it using set_thread_flag() you're not going to do that > > because it will be broken due to changes to the preempt code. > > > > So please as to fix ASAP and don't allow anybody to trick you into > > merging silly things like that again ;-) > > > Agreed, but this is a case of locking inversion. How do you propose to get > around that?
me? No idea, I've never looked at the actual locking in DRM. Someone who's familiar with that code would have to tackle that. I just spotted the fail because I was going to remove set_need_resched() and had a WTF moment when I tripped over this stuff. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel