On Wed, 19 Apr 2023, Michael Schmitz wrote:

> > I wonder what we'd see if we patched the kernel to log every user data
> > write fault caused by a MOVEM instruction. I'll try to code that up.
>
> If these instructions did always cause stack corruption on 030, I think
> we would have noticed long ago?
>

I think it probably was noticed long ago, in the form of rare userland
crashes on 68030. But it was probably never reported because the actual
culprit is too distant from the symptoms.

But I take your point -- signal delivery seems to be crucial. Would it be
difficult to skip signal delivery following a bus error? Perhaps there's
no need to try that experiment, as we know what would happen.

I will take a look at your modified test program and try to use the
output to figure out the stack gymnastics. IIUC, there are two RTEs
following the page fault. The first one runs the signal handler, the
second one resumes the MOVEM that faulted. Maybe we'll have to intercept
the latter (at do_sigreturn() perhaps?) and examine that exception frame.