Re,

[email protected], le mar. 03 mars 2026 04:08:15 +0000, a ecrit:
> 3. rwlock timeout/wakeup races (H-3, H-4): When a timed rwlock
> operation times out simultaneously with being woken by an unlock,
> reader counts or writer ownership leak, permanently locking the
> rwlock.

Indeed!

> This could explain rumpnet hangs.

Errr... rumpnet doesn't use pthread_rwlock_*


> 3. **rwlock timeout/wakeup race** (H-3, H-4): When a timed rwlock operation
>    times out simultaneously with being woken by an unlock, reader counts leak
>    or writer ownership leaks, permanently locking the rwlock.

> #### H-3: rwlock timedrdlock: reader count leak on timeout/wakeup race 
> (CRITICAL)
> 
> **Files**: `sysdeps/htl/pt-rwlock-timedrdlock.c:86-109`,
> `sysdeps/htl/pt-rwlock-unlock.c:66-87`
> 
> When a reader times out but was already dequeued and counted by the unlocking
> thread (which does `rwlock->__readers += n` for all queued readers), the
> timedrdlock returns `ETIMEDOUT` without decrementing `__readers`. The reader
> count is permanently inflated: the rwlock never appears fully unlocked again.
> 
> **Scenario**: Thread A holds write lock. Thread B calls 
> `pthread_rwlock_timedrdlock`
> and queues. B's timeout fires. Concurrently, A unlocks, dequeues B, increments
> `__readers`. B wakes, sees `prevp == NULL` and `err != 0`, drains the Mach
> message, returns ETIMEDOUT. But `__readers` is now 1 with no actual reader.
> 
> **Impact**: Permanent rwlock deadlock for writers. Could explain ext2fs hangs.
> 
> #### H-3 Proposed Fix: Treat dequeue-by-unlocker as successful acquisition
> 
> The bug is that `prevp == NULL && err != 0` returns ETIMEDOUT even though
> the unlocker already counted this thread as a reader. When `prevp == NULL`,
> the unlocker dequeued us and incremented `__readers` — we own the read lock
> regardless of whether we also timed out. Replace lines 86-116 with:
> 
> ```c
> __pthread_spin_wait (&rwlock->__lock);
> if (self->prevp == NULL)
>   {
>     /* Dequeued by unlocker — we got the lock.  Drain any stale wakeup.  */
>     __pthread_spin_unlock (&rwlock->__lock);
>     if (err)
>       __pthread_block (self);  /* drain */
>     assert (rwlock->__readers > 0);
>     return 0;
>   }
> else
>   {
>     /* Still queued — we timed out.  Remove ourselves.  */
>     __pthread_dequeue (self);
>     __pthread_spin_unlock (&rwlock->__lock);
>     assert (err == ETIMEDOUT);
>     return err;
>   }
> ```
> 
> #### H-4: rwlock timedwrlock: `__held` leak on timeout/wakeup race (CRITICAL)
> 
> **File**: `sysdeps/htl/pt-rwlock-timedwrlock.c:72-99`
> 
> Same pattern as H-3 but for writers. The unlocker transfers `__held` ownership
> to a queued writer by leaving `__held` locked. If the writer times out but was
> already dequeued, it returns ETIMEDOUT without releasing `__held`. The rwlock
> is permanently locked.
> 
> **Impact**: Permanent deadlock. Any thread attempting to acquire the rwlock 
> will
> block forever.
> 
> #### H-4 Proposed Fix: Same pattern as H-3
> 
> When `prevp == NULL`, the unlocker transferred `__held` ownership to us.
> Replace lines 72-99 with the same structure:
> 
> ```c
> __pthread_spin_wait (&rwlock->__lock);
> if (self->prevp == NULL)
>   {
>     /* Dequeued by unlocker — we own __held.  Drain any stale wakeup.  */
>     __pthread_spin_unlock (&rwlock->__lock);
>     if (err)
>       __pthread_block (self);  /* drain */
>     assert (rwlock->__readers == 0);
>     return 0;
>   }
> else
>   {
>     /* Still queued — we timed out.  Remove ourselves.  */
>     __pthread_dequeue (self);
>     __pthread_spin_unlock (&rwlock->__lock);
>     assert (err == ETIMEDOUT);
>     return err;
>   }
> ```


Yep, fixed essentially that way (but keeping the existing comments...)

> ### rumpnet hangs
> 
> Most likely: **H-3/H-4** (rwlock timeout race causing permanent deadlock).
> Network stacks use rwlocks extensively. A single timeout race permanently
> locks the rwlock, hanging all subsequent operations.

ditto.

> 3. **Fix H-3/H-4**: In the rwlock timedrdlock/timedwrlock timeout path, when
>    `prevp == NULL`, treat as successful acquisition regardless of timeout.

Samuel

Reply via email to