> Is this the first OOPS it prints out? I don't think so. I am 
> very sure it printed out messages from die_if_kernel first and 
> we need that initial OOPS to diagnose this bug and fix it. 
> 
> All the rest of the OOPS messages are useless and won't tell 
> us what the real problem is. 

> Later, 
> David S. Miller 

Bad news about recursive Oops is that too often the system
cannot continue and oopsen never reach /var/log/messages.

This problem was so common on sparc(32) that I run all my
kernels with the attached patch. I think an application
of a similar change should be mandatory if you are insterested
in any sort of debugging.

The alternative is to use a serial console, captured at all times.

--Pete

diff -u -r1.63 traps.c
--- arch/sparc/kernel/traps.c   2000/06/04 06:23:52     1.63
+++ arch/sparc/kernel/traps.c   2000/06/26 18:19:10
@@ -114,18 +116,23 @@
                 * bound in case our stack is trashed and we loop.
                 */
                while(rw                                        &&
-                     count++ < 30                              &&
+                     count++ < 10                              && /* P3 30 */
                       (((unsigned long) rw) >= PAGE_OFFSET)    &&
                      !(((unsigned long) rw) & 0x7)) {
                        printk("Caller[%08lx]\n", rw->ins[7]);
                        rw = (struct reg_window *)rw->ins[6];
                }
        }
+#if 0
        printk("Instruction DUMP:");
        instruction_dump ((unsigned long *) regs->pc);
        if(regs->psr & PSR_PS)
                do_exit(SIGKILL);
        do_exit(SIGSEGV);
+#else
+       printk("Looping...");
+       for (;;) { }
+#endif
 }
 
 void do_hw_interrupt(unsigned long type, unsigned long psr, unsigned long pc)
Index: arch/sparc/mm/fault.c
===================================================================
RCS file: /vger-cvs/linux/arch/sparc/mm/fault.c,v
retrieving revision 1.116
diff -u -r1.116 fault.c
--- arch/sparc/mm/fault.c       2000/05/03 06:37:03     1.116
+++ arch/sparc/mm/fault.c       2000/06/26 18:19:11
@@ -146,11 +146,15 @@
                printk(KERN_ALERT "Unable to handle kernel paging request "
                        "at virtual address %08lx\n", address);
        }
+       if (tsk->active_mm == NULL) {
+               printk(KERN_ALERT "tsk->active_mm = NULL\n");
+       } else {
        printk(KERN_ALERT "tsk->{mm,active_mm}->context = %08lx\n",
                (tsk->mm ? tsk->mm->context : tsk->active_mm->context));
        printk(KERN_ALERT "tsk->{mm,active_mm}->pgd = %08lx\n",
                (tsk->mm ? (unsigned long) tsk->mm->pgd :
                        (unsigned long) tsk->active_mm->pgd));
+       }
        die_if_kernel("Oops", regs);
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/

Reply via email to