On 07/17, Jeremy Katz wrote: > > This is with the patch (and 2.6.22.1 and hrt6): > > ------------[ cut here ]------------ > Kernel BUG at c0125adb [verbose debug info unavailable] > invalid opcode: 0000 [#1] > SMP > Modules linked in: > CPU: 3 > EIP: 0060:[<c0125adb>] Not tainted VLI > EFLAGS: 00010246 (2.6.22.1-WR1.4aq_cgl #2) > EIP is at sigqueue_free+0x23/0x72 > eax: 00000000 ebx: f6eaf1a8 ecx: f6d4b888 edx: 00000202 > esi: f73e2ba8 edi: 00000000 ebp: f7c7bf8c esp: f7c7bf84 > ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 > Process hrtm_test (pid: 15340, ti=f7c7b000 task=f6c0f030 task.ti=f7c7b000) > Stack: 00000202 f73e2ba8 f7c7bf9c c012c941 f73e2ba8 000002fb f7c7bfb0 > c012d1c0 > 00000286 000002fb 00000000 f7c7b000 c01027ca 000002fb 466afff4 > 08064190 > 00000000 00000000 ae8d43c8 00000107 ffff007b c010007b 00000000 > 00000107 > Call Trace: > [<c0103504>] show_trace_log_lvl+0x1a/0x30 > [<c01035bb>] show_stack_log_lvl+0x8d/0xaa > [<c01037f5>] show_registers+0x1cd/0x2cb > [<c0103a4a>] die+0x113/0x207 > [<c039aab5>] do_trap+0x8f/0xc6 > [<c0103d35>] do_invalid_op+0x88/0x92 > [<c039a882>] error_code+0x72/0x78 > [<c012c941>] release_posix_timer+0x1b/0x7a > [<c012d1c0>] sys_timer_delete+0xd7/0x10c > [<c01027ca>] syscall_call+0x7/0xb
This looks really impossible. I believe I see another bug in sys_timer_create(), but it is not related to this problem. sys_timer_create() inserts a partly initialized new_timer into posix_timers_id and drops idr_lock. Suppose that another thread does sys_timer_delete(). If it sees ->it_process != NULL, lock_timer() succeeds. If the timer was not fully initialized at this time (or another CPU sees the result of STOREs out of order), we can have multiple probles. list_del() may oops, we can do put_task_struct() before sys_timer_create() does get_task_struct(), we may leak the task_struct if sys_timer_delete() doesn't see SIGEV_THREAD_ID yet. Jeremy, I agree with Thomas that your patch should not be right, but it does make a difference. Perhaps this is just the timing, but who knows. Could you add some printk's to be sure that lock_timer() actually fails while it never should? Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/