On Tue, May 28, 2019 at 04:32:24PM +0100, Will Deacon wrote:
> On Tue, May 28, 2019 at 04:01:03PM +0200, Peter Zijlstra wrote:
> > On Tue, May 28, 2019 at 08:31:29PM +0800, Young Xiao wrote:
> > > When a kthread calls call_usermodehelper() the steps are:
> > >   1. allocate current->mm
> > >   2. load_elf_binary()
> > >   3. populate current->thread.regs
> > >
> > > While doing this, interrupts are not disabled. If there is a perf
> > > interrupt in the middle of this process (i.e. step 1 has completed
> > > but not yet reached to step 3) and if perf tries to read userspace
> > > regs, kernel oops.
>
> This seems to be because pt_regs(current) gives NULL for kthreads on Power.

I think you mean task_pt_regs(current) here.

> > > Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace
> > > pt_regs are not set.
> > >
> > > See commit bf05fc25f268 ("powerpc/perf: Fix oops when kthread execs
> > > user process") for details.
> >
> > Why the hell do we set current->mm before it is complete? Note that
> > normally exec() builds the new mm before attaching it, see exec_mmap()
> > in flush_old_exec().
>
> From the initial report [1], it doesn't look like the mm isn't initialised,
> but rather than we're dereferencing a NULL pt_regs pointer somehow for the
> current task (see previous comment). I don't see how that can happen on
> arm64, given that we put the pt_regs on the kernel stack which is allocated
> during fork.
>
> Will
>
> [1] https://git.kernel.org/linus/bf05fc25f268

One caveat is that for the idle threads, the initial SP overlaps the
task_pt_regs() area:

* __primary_switched starts SP at init_thread_union + THREAD_SIZE.
* __cpu_up() starts SP at task_stack_page(idle) + THREAD_SIZE.

... and in either case, sampling that would be bad.

For arm, I believe the same holds true.

AFAICT x86 seems to reserve the regs area in its head_{64,32}.S, but I
can't see what it does for other threads.

Regardless, for arm, arm64, and x86, task_pt_regs(current) cannot be
NULL.

Thanks,
Mark.
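To make the layout argument concrete, here is a rough userspace sketch (not
kernel code) of how the regs pointer falls out of the stack base, loosely
modelled on how arm64 derives task_pt_regs(). The THREAD_SIZE value, the
struct pt_regs layout, struct task, and initial_sp_overlaps_regs() are all
made up for illustration; only the pointer arithmetic matters.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value; the real THREAD_SIZE depends on arch and kernel config. */
#define THREAD_SIZE 16384

/* Stand-in for the arch's struct pt_regs; this layout is illustrative only. */
struct pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

/* Hypothetical, trimmed-down task: only the stack base matters here. */
struct task {
	void *stack;
};

static inline void *task_stack_page(const struct task *t)
{
	return t->stack;
}

/*
 * The regs area is the last sizeof(struct pt_regs) bytes of the stack.
 * This is plain pointer arithmetic on the stack base, so the result is
 * non-NULL for any task whose stack has been allocated.
 */
static inline struct pt_regs *task_pt_regs(const struct task *t)
{
	return (struct pt_regs *)((char *)task_stack_page(t) + THREAD_SIZE) - 1;
}

/*
 * Returns 1 if an initial SP of stack + THREAD_SIZE sits at the top of the
 * regs area, so the first pushes from that SP would land inside it.
 */
static inline int initial_sp_overlaps_regs(const struct task *t)
{
	char *initial_sp = (char *)task_stack_page(t) + THREAD_SIZE;
	char *regs_start = (char *)task_pt_regs(t);
	char *regs_end = regs_start + sizeof(struct pt_regs);

	return initial_sp > regs_start && initial_sp <= regs_end;
}
```

In this model, task_pt_regs() is just the stack base plus a constant, so it
is a valid pointer whenever the stack exists. An initial SP of
stack + THREAD_SIZE lands exactly at the end of that area, which is the
idle-thread overlap described above.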