On Fri, Sep 13, 2024 at 01:23:01PM -0700, Rob Clark wrote:
> From: Rob Clark <robdcl...@chromium.org>
>
> Fixes a race condition reported here:
> https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609
>
> The whole premise of lockless access to a single-producer-single-
> consumer queue is that there is just a single producer and single
> consumer. That means we can't call drm_sched_can_queue() (which is
> about queueing more work to the hw, not to the spsc queue) from
> anywhere other than the consumer (wq).
>
> This call in the producer is just an optimization to avoid scheduling
> the consuming worker if it cannot yet queue more work to the hw. It
> is safe to drop this optimization to avoid the race condition.
>
> Suggested-by: Asahi Lina <l...@asahilina.net>
> Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")
> Closes: https://github.com/AsahiLinux/linux/issues/309
> Cc: sta...@vger.kernel.org
> Signed-off-by: Rob Clark <robdcl...@chromium.org>
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 4 ++--
>  drivers/gpu/drm/scheduler/sched_main.c   | 7 ++-----
>  include/drm/gpu_scheduler.h              | 2 +-
>  3 files changed, 5 insertions(+), 8 deletions(-)

Tested for several hours with CONFIG_PREEMPT=y and KASAN, using a
workload similar to the one in the GitHub issue, without any reports or
oopses. Feel free to add

Tested-by: Janne Grunau <j...@jannau.net>

thanks,
Janne