Hey,
On 2026-03-05 at 11:50, Sebastian Andrzej Siewior wrote:
> On 2026-03-05 11:42:19 [+0100], Maarten Lankhorst wrote:
>> Hey,
> Hi,
>
>> On 2026-02-26 at 15:38, Sebastian Andrzej Siewior wrote:
>>> On 2026-02-26 15:19:42 [+0100], To Maarten Lankhorst wrote:
>>>> On 2026-02-26 13:07:18 [+0100], To Maarten Lankhorst wrote:
>>>>> series somewhere I could pull and check. In meantime I would look what
>>>>> causes the lockup on i915.
>>>>
>>>> I think I got it.
>>>
>>> With the atomic sync as-is, the irq-work (FIFO-1) will be preempted by
>>> the threaded interrupt (FIFO-50), and the interrupt will poll on
>>> signaler_active while the irq-work can't make progress.
>>>
>>> This will provide the needed sync:
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>>> index a2b413982ce64..337f6e88faf05 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
>>> @@ -209,6 +209,7 @@ static void signal_irq_work(struct irq_work *work)
>>> intel_breadcrumbs_disarm_irq(b);
>>>
>>> rcu_read_lock();
>>> + spin_lock(&b->signaler_active_sync);
>>> atomic_inc(&b->signaler_active);
>>> list_for_each_entry_rcu(ce, &b->signalers, signal_link) {
>>> struct i915_request *rq;
>>> @@ -246,6 +247,7 @@ static void signal_irq_work(struct irq_work *work)
>>> }
>>> }
>>> atomic_dec(&b->signaler_active);
>>> + spin_unlock(&b->signaler_active_sync);
>>> rcu_read_unlock();
>>>
>>> llist_for_each_safe(signal, sn, signal) {
>>> @@ -290,6 +292,7 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine)
>>> init_llist_head(&b->signaled_requests);
>>>
>>> spin_lock_init(&b->irq_lock);
>>> + spin_lock_init(&b->signaler_active_sync);
>>> init_irq_work(&b->irq_work, signal_irq_work);
>>>
>>> b->irq_engine = irq_engine;
>>> @@ -487,8 +490,11 @@ void intel_context_remove_breadcrumbs(struct intel_context *ce,
>>> if (release)
>>> intel_context_put(ce);
>>>
>>> - while (atomic_read(&b->signaler_active))
>>> + while (atomic_read(&b->signaler_active)) {
>>> + spin_lock(&b->signaler_active_sync);
>>> + spin_unlock(&b->signaler_active_sync);
>>> cpu_relax();
>>> + }
>>> }
>>>
>>> static void print_signals(struct intel_breadcrumbs *b, struct drm_printer *p)
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
>>> index bdf09fd67b6e7..28dae32628aab 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
>>> @@ -40,6 +40,7 @@ struct intel_breadcrumbs {
>>> struct list_head signalers;
>>> struct llist_head signaled_requests;
>>> atomic_t signaler_active;
>>> + spinlock_t signaler_active_sync;
>>>
>>> spinlock_t irq_lock; /* protects the interrupt from hardirq context */
>>> struct irq_work irq_work; /* for use from inside irq_lock */
>>>
>>
>> Thinking about it some more, replacing signaler_active with
>> signaler_active_sync might be the best fix. I'm not sure there's much
>> use for parallel completion of the same breadcrumb, and using a
>> completion might be too heavy-handed.
>
> We have something similar in the timer, tasklet and other "similar" code,
> where on RT you have preemption and therefore the possibility of another
> user on the same CPU, while on !RT it is only possible on a remote CPU.
> Using the spinlock_t for synchronisation would restrict it to one-on-one.
> The closest API that comes to mind would be a sequence lock: one writer,
> multiple readers. So that would be an option that you might like ;) if
> the pure spinlock_t is off the table.
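For readers following along: the one-writer/many-readers pattern Sebastian is referring to can be sketched in userspace like this. This is a hypothetical minimal seqcount built on C11 atomics, not the kernel's seqlock_t API; all names here are made up for illustration. The writer bumps a sequence counter before and after its critical section (so it is odd while a write is in flight), and readers retry until they see a stable even value, which lets any number of readers run concurrently with each other:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical minimal seqcount; NOT the kernel seqlock_t API. */
static atomic_uint seq;          /* odd while the writer is active      */
static int data_a, data_b;       /* pair that must be read consistently */

static void write_pair(int v)
{
	/* Make seq odd: readers started after this will retry. */
	atomic_fetch_add_explicit(&seq, 1, memory_order_release);
	data_a = v;
	data_b = -v;
	/* Make seq even again: the data is stable. */
	atomic_fetch_add_explicit(&seq, 1, memory_order_release);
}

static void read_pair(int *a, int *b)
{
	unsigned int s;

	do {
		/* Wait until no write is in flight (seq is even). */
		while ((s = atomic_load_explicit(&seq, memory_order_acquire)) & 1)
			;
		*a = data_a;
		*b = data_b;
		/* Retry if a writer ran while we were reading. */
	} while (atomic_load_explicit(&seq, memory_order_acquire) != s);
}
```

A real seqlock also needs memory barriers around the data accesses (the kernel's seqcount API handles this); the sketch only shows the retry protocol.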
I think it should be possible then to remove signaler_active, and change
the reader side to if (spin_is_locked()) { spin_lock(); spin_unlock(); } ?