On Monday 26 March 2007 08:49, Con Kolivas wrote:
> On Monday 26 March 2007 04:28, Torsten Kaiser wrote:
> > On 3/24/07, Con Kolivas <[EMAIL PROTECTED]> wrote:
> > >  kernel/sched.c |   51 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 51 insertions(+)
> >
> > 2.6.21-rc4-mm1 also fails for me.
> >
> > I tried pure 2.6.21-rc4-mm1, +hotfixes, +hotfixes+rsdl33 and at last
> > also added above debug patch.
>
> Thank you very much for the effort!
>
> > The oops with the debug patch added:
> > [   65.426126] Freeing unused kernel memory: 312k freed
> > (on the console the system is starting up, getting until "Letting udev
> > process events ...")
> > [   66.665611] Unable to handle kernel NULL pointer dereference at
> > 0000000000000020 RIP:
> > [   66.682030]  [<ffffffff8026167c>] __sched_text_start+0x4dc/0xa0e
>
> The debug patch didn't do anything. This means it is not an unset bitmap
> problem at all, otherwise it should have corrected itself.
>
> > The system is x86_64, two 2218 on a MCP55 nvidia chipset.
> >
> > 2.6.21-rc3-mm1 works fine.
> >
> > (gdb) list *0xffffffff8026167c
> > 0xffffffff8026167c is in schedule (kernel/sched.c:3619).
>
> 	next = list_entry(queue->next, struct task_struct, run_list);
> 	rq->prio_level = idx;
>
> > 3614		/*
> > 3615		 * When the task is chosen it is checked to see if its quota has been
> > 3616		 * added to this runqueue level which is only performed once per
> > 3617		 * level per major rotation for each running task.
> > 3618		 */
> > 3619		if (next->rotation != rq->prio_rotation) {
>
> Urgh. Dereferencing there? That can only be next that's dereferencing, meaning
> the run_list entry is bogus. That should only ever be done under runqueue
> lock so I have a race somewhere where it's not. Time for more looking.

This is about the only place I can see the run_list is looked at unlocked.
Can you see if this simple patch helps? The debug patch is unnecessary now.

Thanks!

--
Ensure checking task_queued() is only done under runqueue lock.

Signed-off-by: Con Kolivas <[EMAIL PROTECTED]>

---
 kernel/sched.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6.21-rc4-mm1/kernel/sched.c
===================================================================
--- linux-2.6.21-rc4-mm1.orig/kernel/sched.c	2007-03-26 08:54:15.000000000 +1000
+++ linux-2.6.21-rc4-mm1/kernel/sched.c	2007-03-26 08:55:21.000000000 +1000
@@ -3421,16 +3421,16 @@ static inline void rotate_runqueue_prior
 
 static void task_running_tick(struct rq *rq, struct task_struct *p, int tick)
 {
-	if (unlikely(!task_queued(p))) {
-		/* Task has expired but was not scheduled yet */
-		set_tsk_need_resched(p);
-		return;
-	}
 	/* SCHED_FIFO tasks never run out of timeslice. */
 	if (unlikely(p->policy == SCHED_FIFO))
 		return;
 
 	spin_lock(&rq->lock);
+	if (unlikely(!task_queued(p))) {
+		/* Task has expired but was not scheduled off yet */
+		set_tsk_need_resched(p);
+		goto out_unlock;
+	}
 	/*
 	 * Accounting is performed by both the task and the runqueue. This
 	 * allows frequently sleeping tasks to get their proper quota of

-- 
-ck
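
For readers who want to try the locking rule outside the kernel, here is a
minimal userspace sketch of the same idea: test the "is it queued?" state only
after taking the lock, and leave through a single unlock label. It is an
illustration under assumptions, not the scheduler code. struct toy_rq,
struct toy_task and toy_running_tick() are hypothetical names, and a pthread
mutex stands in for the runqueue spinlock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rq: just the lock. */
struct toy_rq {
	pthread_mutex_t lock;
};

/* Hypothetical stand-in for struct task_struct. */
struct toy_task {
	bool queued;		/* plays the role of task_queued(p) */
	bool need_resched;	/* plays the role of set_tsk_need_resched(p) */
	int time_slice;		/* remaining ticks, decremented per tick */
};

/*
 * Per-tick accounting. The queued check happens only while rq->lock is
 * held, so another thread that dequeues the task under the same lock
 * cannot change the answer between the check and the accounting.
 * Returns true if the tick was charged to a queued task.
 */
static bool toy_running_tick(struct toy_rq *rq, struct toy_task *p)
{
	bool accounted = false;

	pthread_mutex_lock(&rq->lock);
	if (!p->queued) {
		/* Not on the queue any more: only ask for a reschedule. */
		p->need_resched = true;
		goto out_unlock;
	}
	if (p->time_slice > 0)
		p->time_slice--;
	accounted = true;
out_unlock:
	pthread_mutex_unlock(&rq->lock);
	return accounted;
}
```

The early exit uses goto out_unlock rather than a bare return so that every
path that took the lock also releases it, which is the same shape as the
hunk in the patch.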