On Mon, 2015-11-16 at 20:23 +1100, Michael Neuling wrote:
> On Mon, 2015-11-16 at 12:51 +0530, Anshuman Khandual wrote:
> > On 11/13/2015 10:27 AM, Michael Neuling wrote:
> > > Currently we can hit a scenario where we'll tm_reclaim() twice. This
> > > results in a TM bad thing exception because the second reclaim occurs
> > > when not in suspend mode.
> > >
> > > The scenario in which this can happen is the following. We attempt to
> > > deliver a signal to userspace. To do this we need to obtain the stack
> > > pointer to write the signal context. To get this stack pointer we
> > > must tm_reclaim() in case we need to use the checkpointed stack
> > > pointer (see get_tm_stackpointer()). Normally we'd then return
> > > directly to userspace to deliver the signal without going through
> > > __switch_to().
> > >
> > > Unfortunately, if at this point we get an error (such as a bad
> > > userspace stack pointer), we need to exit the process. The exit will
> > > result in a __switch_to(). __switch_to() will attempt to save the
> > > process state which results in another tm_reclaim(). This
> > > tm_reclaim() now causes a TM Bad Thing exception as this state has
> > > already been saved and the processor is no longer in TM suspend mode.
> > > Whee!
> > >
> > > This patch checks the state of the MSR to ensure we are TM suspended
> > > before we attempt the tm_reclaim(). If we've already saved the state
> > > away, we should no longer be in TM suspend mode. This has the
> > > additional advantage of checking for a potential TM Bad Thing
> > > exception.
> >
> > Can this situation be created using a test and verified that with
> > this new change, the kernel can handle it successfully. I guess
> > the self test in the series does not cover this scenario.
>
> No it doesn't. The syscall fuzzer I have does hit it but I don't have
> permission to post that.

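For anyone following along, the guard described in the quoted changelog can be modelled in plain userspace C roughly as below. This is only an illustrative sketch, not the patch itself: struct model_thread and the model_* functions are made-up names, and the MSR bit positions are assumed to match the TS field in the powerpc headers (suspended = bit 33, transactional = bit 34), so check them against asm/reg.h before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the MSR[TS] field; verify against asm/reg.h. */
#define MODEL_MSR_TS_S    (1ULL << 33)  /* TM suspended */
#define MODEL_MSR_TS_T    (1ULL << 34)  /* TM transactional */
#define MODEL_MSR_TS_MASK (MODEL_MSR_TS_S | MODEL_MSR_TS_T)

/* Hypothetical stand-in for the per-thread state the kernel tracks. */
struct model_thread {
	uint64_t msr;            /* saved MSR of the thread */
	unsigned reclaim_count;  /* how many reclaims actually ran */
};

/* True only when the TS field says "suspended". */
static bool model_tm_suspended(uint64_t msr)
{
	return (msr & MODEL_MSR_TS_MASK) == MODEL_MSR_TS_S;
}

/*
 * Reclaim only if the thread is still TM suspended. A second call after
 * the state has been saved away finds TS cleared and does nothing, which
 * is the case that would otherwise raise a TM Bad Thing exception.
 */
bool model_tm_reclaim_thread(struct model_thread *t)
{
	if (!model_tm_suspended(t->msr))
		return false;

	t->reclaim_count++;               /* stands in for the real reclaim */
	t->msr &= ~MODEL_MSR_TS_MASK;     /* no longer in a transaction */
	return true;
}
```

Calling model_tm_reclaim_thread() twice on a thread that starts out suspended performs the reclaim once; the second call is a no-op, mirroring the signal-delivery path followed by __switch_to().
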
And we don't really want a fuzzer as a selftest, because it might call
unlink or something else bad.

But having found the bug with the fuzzer, can't you write a test that
triggers the bad case? From your description it sounds like if you had a
child spinning with a bad r1, and then a parent sent it a signal that
would trip it?

cheers
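
A rough skeleton of the parent/child shape suggested above, in portable C. Big caveat: this does not enter a transaction or touch r1, so it does not trigger the TM bug. It only approximates "bad stack pointer at signal delivery" by giving the child an inaccessible alternate signal stack, which on Linux is expected to make frame setup fail and kill the child with SIGSEGV. A real selftest would need powerpc TM code in the child's spin loop. run_bad_stack_child() is a made-up name.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BAD_STACK_SIZE (64 * 1024)

static void handler(int sig) { (void)sig; }

/*
 * Fork a child that waits with an unusable signal stack, signal it from
 * the parent, and report how it died. Returns the terminating signal,
 * 0 if the child exited normally, or -1 on a setup error.
 */
int run_bad_stack_child(void)
{
	int ready[2];
	char byte = 0;
	int status;
	pid_t pid;

	if (pipe(ready) != 0)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* PROT_NONE mapping: the kernel cannot write a signal frame here. */
		void *bad = mmap(NULL, BAD_STACK_SIZE, PROT_NONE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		stack_t ss = { .ss_sp = bad, .ss_size = BAD_STACK_SIZE, .ss_flags = 0 };
		struct sigaction sa = { .sa_handler = handler, .sa_flags = SA_ONSTACK };

		sigemptyset(&sa.sa_mask);
		if (bad == MAP_FAILED || sigaltstack(&ss, NULL) != 0 ||
		    sigaction(SIGUSR1, &sa, NULL) != 0)
			_exit(2);

		/* Tell the parent the handler is installed, then wait. */
		if (write(ready[1], &byte, 1) != 1)
			_exit(2);
		for (;;)
			pause();
	}

	/* Parent: wait until the child is ready, then send the signal. */
	if (read(ready[0], &byte, 1) != 1)
		return -1;
	if (kill(pid, SIGUSR1) != 0)
		return -1;
	if (waitpid(pid, &status, 0) != pid)
		return -1;

	close(ready[0]);
	close(ready[1]);
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The pipe is there so the parent cannot send SIGUSR1 before the handler is installed; without it the default action could kill the child and hide the case being tested.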
