On Mon, Apr 18, 2011 at 7:47 AM, Christian Ehrhardt <[email protected]> wrote: ... > Note that the size of the stack frame between Xintr_ioapic1 and pf_pull_hdr > is huge (~6k). This area is filled with a 12 byte pattern that looks like > a code address, the kernel code segment and a pushed eflags register. The > code address points to the interrupt return path in this piece of code from > i386/locore.s: > > #define INTRFASTEXIT \ > popl %fs ; \ > popl %gs ; \ > popl %es ; \ > popl %ds ; \ > popl %edi ; \ > popl %esi ; \ > popl %ebp ; \ > popl %ebx ; \ > popl %edx ; \ > popl %ecx ; \ > popl %eax ; \ > sti ; \ <===== (1) > addl $8,%esp ; \ > iret <===== (2) > > The return address points to the iret marked with (2). I.e. we get hit by > an interrupt immediately before the iret and this happens repeatedly until > the kernel stack overflows.
Nice analysis! > This is only possible due to the sti instruction > at the point marked (1). As iret will restore the pushed eflags value anyway, > it should be safe to remove the sti altogether. Discussed with bluhm@ and > hshoexer@. Patch follows: > > Index: locore.s > =================================================================== > RCS file: /cvs/src/sys/arch/i386/i386/locore.s,v > retrieving revision 1.130 > diff -u -r1.130 locore.s > --- locore.s 3 Jul 2010 04:54:32 -0000 1.130 > +++ locore.s 18 Apr 2011 13:52:16 -0000 > @@ -128,7 +128,6 @@ > popl %edx ; \ > popl %ecx ; \ > popl %eax ; \ > - sti ; \ > addl $8,%esp ; \ > iret > > > The sti was introduced in revision 1.97 of locore.s in March 2006 by > mickey@. Commit message: > > | prevent the faults on iret to run w/ disabled intrs and cause > | deadlocks; niklas toby tom ok > > Maybe mickey or one of the people giving oks back then want to comment? That predates my involvement with OpenBSD, but I think get the gist. iret can fault if, for example, the user corrupts certain members of the sigcontext passed to a signal handler. If that happens the the CPU generates an exception on the iret and you end up in trap() where it peeks at the instruction that triggered the trap and, seeing it was iret, arranges to 'return' to the resume_iret label to generate what looks like a normal fault against the process. The goal of that sti is to keep that 'normal' fault from being processed with interrupts blocked. So, possibility one: sti's effect is delayed to the end of next instruction. What happens if you put it immediately before an iret that faults? If the flags pushed on the stack by the fault have interrupts unblocked, then simply moving the sti down a line to between the addl and the iret would be enough. If you can reproduce this situation easily than I would ask that you try that. Possibility two would be to try to handle this from, uh, inside resume_iret I guess. I'm not 100% sure whether it can be unilateral there though. Philip Guenther
