Hey,

Den 2026-03-10 kl. 18:04, skrev Sebastian Andrzej Siewior:
> On 2026-03-10 12:57:08 [+0100], Maarten Lankhorst wrote:
>> From: Sebastian Andrzej Siewior <[email protected]>
>>
>> IRQ-Work (FIFO-1) will be preempted by the threaded-interrupt (FIFO-50)
>> and the interrupt will poll on signaler_active while the irq-work can't
>> make progress.
> 
> The threaded-interrupt is the interrupt.
> 
> | On PREEMPT_RT the irq_work can be preempted by threaded-interrupt which
> | will be poll for completion but the irq_work routine can't make
> | progress.
> 
>> Solve this by adding a spinlock to prevent starvation and force
>> completion.
> 
> | Solve this by adding a spinlock_t to prevent starvation by forcing a
> | context switch if lock is held based on `signaler_active'. On
> | !PREEMPT_RT `signaler_active' can only be non-zero if multiple CPUs are
> | involved and spinning on the lock leds to the same result.
> 
>> Signed-off-by: Maarten Lankhorst <[email protected]>
> 
> If I am the From: then I should have the Signed-off-by, too. Let me do
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> 
> so it can be picked up.
> 
> You did suggest the following:
> 
> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
> @@ -209,7 +209,7 @@ static void signal_irq_work(struct irq_work *work)
>               intel_breadcrumbs_disarm_irq(b);
>  
>       rcu_read_lock();
> -     atomic_inc(&b->signaler_active);
> +     spin_lock(&b->signaler_active_sync);
>       list_for_each_entry_rcu(ce, &b->signalers, signal_link) {
>               struct i915_request *rq;
>  
> @@ -245,7 +245,7 @@ static void signal_irq_work(struct irq_work *work)
>                               i915_request_put(rq);
>               }
>       }
> -     atomic_dec(&b->signaler_active);
> +     spin_unlock(&b->signaler_active_sync);
>       rcu_read_unlock();
>  
>       llist_for_each_safe(signal, sn, signal) {
> @@ -290,6 +290,7 @@ intel_breadcrumbs_create(struct intel_engine_cs 
> *irq_engine)
>       init_llist_head(&b->signaled_requests);
>  
>       spin_lock_init(&b->irq_lock);
> +     spin_lock_init(&b->signaler_active_sync);
>       init_irq_work(&b->irq_work, signal_irq_work);
>  
>       b->irq_engine = irq_engine;
> @@ -487,8 +488,10 @@ void intel_context_remove_breadcrumbs(struct 
> intel_context *ce,
>       if (release)
>               intel_context_put(ce);
>  
> -     while (atomic_read(&b->signaler_active))
> -             cpu_relax();
> +     while (spin_is_locked(&b->signaler_active_sync)) {
> +             spin_lock_irqsave(&b->signaler_active_sync, flags);
> +             spin_unlock_irqrestore(&b->signaler_active_sync, flags);
> 
> And this does not work because spin_is_locked() returns true and spins
> forever. This fails because there is a "corner case" where
> spin_is_locked() returns but the lock has no lock owner as in locked.
> This happens if there is a waiter which did not yet acquire the lock. 
> 
> So if you happy with this, we could keep it ;)

It seems CI is a lot happier too now.

Xe:
https://patchwork.freedesktop.org/series/159034/#rev14

BAT passes, the full run has some minor issues but only kms_vblank systematic.
Likely due to the fundamental changes of the PREEMPT_RT kernel itself, nothing 
driver specific.

i915:
https://patchwork.freedesktop.org/series/159035/#rev14

A few new warnings, and some noise. The most worrying part is the i915 
execlists selftest
failing on nearly all platforms:

<3>[  504.279024] i915/intel_execlists_live_selftests: live_preempt_user failed 
with error -62
...
<7>[  506.799685] [IGT] i915_selftest: finished subtest execlists, FAIL

I don't know what's going on there yet, likely needs more debugging. Could be 
the test itself
being written incorrectly or something else entirely.

Otherwise things are looking good! Have you uncovered anything else?

Kind regards,
~Maarten Lankhorst

Reply via email to