On Wed, Sep 30, 2020 at 10:30 AM Peter Zijlstra <pet...@infradead.org> wrote:
>
> On Wed, Sep 30, 2020 at 07:48:48AM -0700, Dave Hansen wrote:
> > On 9/30/20 7:42 AM, Liang, Kan wrote:
> > >> When I tested on my kernel, it panicked because I suspect
> > >> current->active_mm could be NULL. Adding a check for NULL avoided the
> > >> problem. But I suspect this is not the correct solution.
> > >
> > > I guess the NULL active_mm should be a rare case. If so, I think it's
> > > not bad to add a check and return 0 page size.
> >
> > I think it would be best to understand why ->active_mm is NULL instead
> > of just papering over the problem. If it is papered over, and this is
> > common, you might end up effectively turning off your shiny new feature
> > inadvertently.
>
> context_switch() can set prev->active_mm to NULL when it transfers it to
> @next. It does this before @current is updated. So an NMI that comes in
> between this active_mm swizzling and updating @current will see
> !active_mm.
>
I think Peter is right. This code is called in the context of NMI, so if
active_mm is set to NULL inside a locked section, this is not enough to
protect the perf_events code from seeing it.

> In general though; I think using ->active_mm is a mistake though. That
> code should be doing something like:
>
>
>	mm = current->mm;
>	if (!mm)
>		mm = &init_mm;
>
>
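
To make the suggestion above concrete, here is a minimal userspace sketch of
that fallback. It is only an illustration under stated assumptions: the two
structs are stand-ins, not the kernel's real definitions, and pick_mm() and
demo() are hypothetical names that do not exist in the kernel. The point is
that the lookup never reads ->active_mm, so it does not matter whether an NMI
lands in the window where context_switch() has already cleared it.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only; NOT the kernel's real definitions. */
struct mm_struct {
	const char *name;
};

struct task_struct {
	struct mm_struct *mm;        /* NULL for kernel threads */
	struct mm_struct *active_mm; /* can be NULL transiently during a switch */
};

/* Stand-in for the kernel's init_mm. */
struct mm_struct init_mm = { "init_mm" };

/*
 * Hypothetical helper: return the task's own mm, or init_mm if it has none.
 * It deliberately never looks at ->active_mm, so the result is never NULL.
 */
struct mm_struct *pick_mm(const struct task_struct *task)
{
	struct mm_struct *mm = task->mm;

	if (!mm)
		mm = &init_mm;
	return mm;
}

/* Hypothetical usage: a user task, and a kernel thread caught mid-switch. */
void demo(void)
{
	struct mm_struct user_mm = { "user_mm" };
	struct task_struct user_task = { &user_mm, &user_mm };
	struct task_struct kthread = { NULL, NULL }; /* active_mm already handed off */

	assert(pick_mm(&user_task) == &user_mm);
	assert(pick_mm(&kthread) == &init_mm);
}
```

With this shape the kernel-thread case resolves to init_mm rather than
dereferencing a NULL pointer, so no separate NULL check on ->active_mm is
needed.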