Ingo Molnar wrote:
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
>
>> On Mon, 15 Oct 2007 16:17:23 +0200
>> Ingo Molnar <[EMAIL PROTECTED]> wrote:
>>
>>> Linus, please pull the latest scheduler git tree from:
>>>
>>>    git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched.git
>> Did Paul Jackson's crash get fixed?
>
> yes - that crash was a showstopper that was holding up the pull request
> for 2 days. Paul bisected it down to the culprit and the fix was to do
> this in wake_up_new_task():
>
> -	if (!p->sched_class->task_new || !current->se.on_rq) {
> +	if (!p->sched_class->task_new || !current->se.on_rq || !rq->cfs.curr) {
>
> (during early bootup the cfs_rq has no curr pointer yet.) It's not clear
> why this race did not trigger earlier. (and the two checks can probably
> be consolidated into a single "!rq->cfs.curr" condition.)
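For reference, the patched condition quoted above can be modelled in user space roughly as follows. This is only a sketch: the structures are stripped-down stand-ins that carry just the fields named in the quoted diff, and the helper name skips_task_new_hook() and the stub demo_task_new() are hypothetical, not kernel functions. What each branch of the real if statement does is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel structures. Only the fields that appear in
 * the quoted diff are modelled; the layouts are assumptions. */
struct task_struct;
struct rq;

struct sched_entity { int on_rq; };
struct sched_class  { void (*task_new)(struct rq *rq, struct task_struct *p); };
struct task_struct  { const struct sched_class *sched_class; struct sched_entity se; };
struct cfs_rq       { struct sched_entity *curr; };
struct rq           { struct cfs_rq cfs; };

/* Hypothetical helper: evaluates the patched condition from the diff.
 * 'cur' plays the role of 'current'. Returns true when the class's
 * task_new hook would NOT be used for the new task 'p'. */
static bool skips_task_new_hook(const struct task_struct *p,
                                const struct task_struct *cur,
                                const struct rq *rq)
{
    assert(p && p->sched_class && cur && rq);
    return !p->sched_class->task_new   /* class has no task_new hook  */
        || !cur->se.on_rq              /* parent is not on the queue  */
        || !rq->cfs.curr;              /* added check: no curr yet    */
}

/* Hypothetical no-op hook, used only to give task_new a non-NULL value. */
static void demo_task_new(struct rq *rq, struct task_struct *p)
{
    (void)rq;
    (void)p;
}
```

With task_new set and the parent on the runqueue, the helper returns true only while rq->cfs.curr is still NULL, which is the early-bootup case described in the quoted mail.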

Maybe not related to that, but my box now dies after this merge.
When the box is mostly idle I get maybe 6h of uptime; when doing some work
(compiling etc.) it freezes at random.

I was finally able to capture the Oops:

...
[15692.917111] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000044
[15692.917159] printing eip:
[15692.917174] c0111f90
[15692.917185] *pde = 00000000
[15692.917200] Oops: 0000 [#1]
[15692.917208] PREEMPT SMP
[15692.917240] Modules linked in: fuse netconsole configfs pc87360 hwmon_vid eeprom adm1021 uhci_hcd sr_mod shpchp pci_hotplug ohci_hcd iTCO_wdt iTCO_vendor_support intel_agp i82860_edac i2c_i801 ehci_hcd usbcore edac_core cdrom agpgart 3c59x mii ext4dev jbd2 capability commoncap loop lp parport_pc parport evdev
[15692.917623] CPU:    0
[15692.917625] EIP:    0060:[<c0111f90>]    Not tainted VLI
[15692.917629] EFLAGS: 00010046   (2.6.23-g65a6ec0d #330)
[15692.917661] EIP is at pick_next_task_fair+0x1f/0x2d
[15692.917672] eax: c150a7b8   ebx: 00000000   ecx: 00000000   edx: 00000000
[15692.917689] esi: c1507a48   edi: 00000000   ebp: 00eaaf7a   esp: cb1fdf14
[15692.917701] ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
[15692.917715] Process sed (pid: 28999, ti=cb1fc000 task=cfdc3500 task.ti=cb1fc000)
[15692.917725] Stack: c02f8268 c02ef7b5 00000002 cb1fdf58 cb1fdf50 00000000 c0400f38 c0403780
[15692.917833]        cfdc3500 cfdc3634 c150a780 00000000 c011a8e7 00000000 c1077aa0 000000ff
[15692.917942]        00000000 00000000 00000000 cb1fdf8c 00000010 cfdc3500 cb1fdf8c c011ace5
[15692.918048] Call Trace:
[15692.918072]  [<c02ef7b5>] schedule+0x321/0x58f
[15692.918109]  [<c011a8e7>] do_exit+0x293/0x6c6
[15692.918143]  [<c011ace5>] do_exit+0x691/0x6c6
[15692.918169]  [<c011ad87>] sys_exit_group+0x0/0xd
[15692.918195]  [<c01026e6>] sysenter_past_esp+0x5f/0x85
[15692.918232]  =======================
[15692.918244] Code: 8b 53 28 89 43 34 89 53 38 5b 5e c3 53 31 d2 83 78 40 00 74 20 83 c0 38 8b 50 20 31 db 85 d2 74 0a 8d 5a f8 89 da e8 a9 ff ff ff <8b> 43 44 85 c0 75 e6 8d 53 d0 89 d0 5b c3 57 56 53 89 c6 89 d7
[15692.918981] EIP: [<c0111f90>] pick_next_task_fair+0x1f/0x2d SS:ESP 0068:cb1fdf14
...

After that the box is dead and needs a hard reset.
Interestingly, when I compile the kernel with debugging enabled I don't
get that (or maybe it just needs longer to trigger?).

Config, lspci, dmesg, hardware specs, the Oops message, and the top output
from when it Oopsed are here:

http://194.231.229.228/lara/

>
> 	Ingo

Regards,

Gabriel
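
A note on reading the Oops: the marked bytes in the Code: line, <8b> 43 44, decode as "mov 0x44(%ebx),%eax", and the register dump shows ebx: 00000000. So the faulting address 00000044 is a load of a member at offset 0x44 through a NULL pointer. The Oops alone does not say which structure that is. The sketch below only illustrates the arithmetic; struct demo_node, its field names and member_address() are hypothetical and chosen so that a member lands at offset 0x44.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the padding exists only to place 'field' at
 * offset 0x44. The real structure is not identified by the Oops. */
struct demo_node {
    unsigned char pad[0x44];
    uint32_t field;
};

/* Compile-time check that the illustrative layout matches the offset
 * seen in the faulting instruction. */
_Static_assert(offsetof(struct demo_node, field) == 0x44,
               "field must sit at offset 0x44");

/* Address the CPU would access for a member at 'offset' when the
 * structure pointer holds 'base'. With base == 0 (a NULL pointer)
 * the result is just the member offset. */
static inline uintptr_t member_address(uintptr_t base, size_t offset)
{
    return base + offset;
}
```

Calling member_address(0, offsetof(struct demo_node, field)) gives the same small address that the BUG line reports, which is why a low fault address with a zeroed base register usually points at a NULL structure pointer rather than a wild one.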