On Thu, Nov 05, 2020 at 09:19:22PM +1100, Michael Ellerman wrote:
> Carl Jacobsen <cjacob...@storix.com> writes:
> This doesn't make a lot of sense.
>
> > Bad kernel stack pointer 7fffffffeac0 at 700
>
> "at 700" is the regs->nip value, and suggests we're trying to handle a
> program check, which is either a trap or BUG or WARN, or illegal
> instruction or several other things.
>
> > REGS: c00000001ec2fd40 TRAP: 0300 Tainted: G (4.12.14-197.18-default)
>
> But then here it says TRAP = 0x300, which is != 0x700.
>
> The trap number is hardcoded in the bad stack handling code, and I don't
> see how we can end up with nip == 0x700 but the trap value == 0x300.
>
> > MSR: 8000000000001000 <SF,ME> CR: 44000844 XER: 20000000
>
> And here the MSR says you were in big endian mode, but you said before
> your machine was ppc64le.

It looks like you got a DSI (the 300), but for some reason that
interrupt was not taken in LE mode, so the instruction at 300 was
read as a lot of gobbledygook, not a valid insn, and the processor
took a program interrupt (the 700).

(MSR[RI]=0, but there can be other causes for that of course.)


Segher
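
For anyone following along, here is a rough sketch of how the MSR value
in the oops decodes. It is only an illustration: the bit positions are a
subset, written down as I understand them from
arch/powerpc/include/asm/reg.h, and the helper name decode_msr is made
up for this example. It is not kernel code.

```python
# Subset of MSR bit positions (assumed from arch/powerpc/include/asm/reg.h).
MSR_BITS = {
    "SF": 63, "HV": 60, "EE": 15, "PR": 14, "FP": 13, "ME": 12,
    "IR": 5, "DR": 4, "RI": 1, "LE": 0,
}

# Interrupt vector offsets discussed in this thread.
VECTORS = {0x300: "data storage interrupt (DSI)", 0x700: "program check"}


def decode_msr(msr):
    """Return the names of the bits set in msr, highest bit first."""
    names = sorted(MSR_BITS, key=MSR_BITS.get, reverse=True)
    return [name for name in names if (msr >> MSR_BITS[name]) & 1]


if __name__ == "__main__":
    msr = 0x8000000000001000  # value from the oops quoted above
    flags = decode_msr(msr)
    print("MSR flags:", flags)
    print("endian:", "little" if "LE" in flags else "big")
    print("recoverable (RI):", "RI" in flags)
    for vec in (0x300, 0x700):
        print(hex(vec), "->", VECTORS[vec])
```

With the value from the report only SF and ME come out set, so LE and RI
are both clear, which matches the <SF,ME> string the kernel printed.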