Jeremy Katz experienced a posix-timer related bug on 2.6.14. This is caused by a subtle race, which is there since the original posix timer commit and persists until today.
timer_delete does: lock_timer(); timer->it_process = NULL; unlock_timer(); release_posix_timer(); timer->it_process is checked in lock_timer() to prevent access to a timer, which is on the way to be deleted, but the check happens after idr_lock is dropped. This allows release_posix_timer() to delete the timer before the lock code can check the timer: CPU 0 CPU 1 lock_timer(); timer->it_process = NULL; unlock_timer(); lock_timer() spin_lock(idr_lock); timer = idr_find(); spin_lock(timer->lock); spin_unlock(idr_lock); release_posix_timer(); spin_lock(idr_lock); idr_remove(timer); spin_unlock(idr_lock); free_timer(timer); if (timer->......) Change the locking to prevent this. Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]> diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c index 329ce01..c2ac6fd 100644 --- a/kernel/posix-timers.c +++ b/kernel/posix-timers.c @@ -605,13 +605,14 @@ static struct k_itimer * lock_timer(timer_t timer_id, unsigned long *flags) timr = (struct k_itimer *) idr_find(&posix_timers_id, (int) timer_id); if (timr) { spin_lock(&timr->it_lock); - spin_unlock(&idr_lock); if ((timr->it_id != timer_id) || !(timr->it_process) || timr->it_process->tgid != current->tgid) { - unlock_timer(timr, *flags); + spin_unlock(&timr->it_lock); + spin_unlock_irqrestore(&idr_lock, *flags); timr = NULL; - } + } else + spin_unlock(&idr_lock); } else spin_unlock_irqrestore(&idr_lock, *flags); - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/