----- On Apr 30, 2020, at 10:50 AM, Joerg Roedel jroe...@suse.de wrote:

> On Thu, Apr 30, 2020 at 04:11:20PM +0200, Joerg Roedel wrote:
>> The page-fault handler calls a tracing function which again ends up in
>> trace_event_ignore_this_pid(), where it faults again. From here on the CPU
>> is in a page-fault loop, which continues until the stack overflows (with
>> CONFIG_VMAP_STACK).
>
> Did some more testing to find out what this issue has to do with
>
>     763802b53a42 x86/mm: split vmalloc_sync_all()
>
> Above commit removes a call to vmalloc_sync_all() from the vmalloc
> unmapping path, because that call caused severe performance regressions
> on some workloads and was not needed on x86-64 anyway.
>
> But that call caused vmalloc_sync_all() to be called regularly on
> x86-64 machines, so that all page-tables were more likely to be in sync.
>
> The call was introduced by commit
>
>     3f8fd02b1bf1 mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
>
> to fix a correctness issue on x86-32 PAE systems, which also need
> unmappings of large pages in the vmalloc area to be synchronized.
>
> This additional call to vmalloc_sync_all() did hide the problem. I
> verified it by reverting both of the above commits on v5.7-rc3 and
> testing on that kernel. The problem is reproducible there too, the box
> hangs hard.
>
> So the underlying problem is that a vmalloc()'ed tracing buffer is used
> to trace the page-fault handler, so that it has no chance of faulting in
> the buffer address to poking_mm and maybe other PGDs.
>
> The right fix is to call vmalloc_sync_mappings() right after allocating
> tracing or perf buffers via v[zm]alloc().

Either right after allocation, or right before making the vmalloc'd data
structure visible to the instrumentation. In the case of the pid filter,
that would be the rcu_assign_pointer() which publishes the new pid filter
table.

As long as vmalloc_sync_mappings() is performed somewhere *between*
allocation and publishing the pointer for instrumentation, it's fine.

I'll let Steven decide on which approach works best for him.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
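
For illustration only, the ordering constraint above (sync somewhere between allocation and publication) can be sketched as a small userspace C model. This is not kernel code and not a proposed patch: every model_* name, struct pid_table, and install_filter() are hypothetical stand-ins, and "synced" is modelled with a simple generation counter rather than real page tables. The only kernel names it mirrors are vzalloc(), vmalloc_sync_mappings() and rcu_assign_pointer().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: every model_* name is hypothetical, not a kernel API. */

struct pid_table {
	unsigned long gen;	/* allocation generation of this table */
	int pids[4];		/* placeholder payload */
};

static unsigned long alloc_gen;		/* bumped on every model allocation */
static unsigned long synced_gen;	/* newest generation present in all "page tables" */

/* Models the RCU-protected pointer that the instrumentation dereferences. */
static _Atomic(struct pid_table *) published_table;

/* Stand-in for vzalloc(): zeroed memory whose mapping is not yet synced. */
static struct pid_table *model_vzalloc(void)
{
	struct pid_table *t = calloc(1, sizeof(*t));

	if (t)
		t->gen = ++alloc_gen;
	return t;
}

/* Stand-in for vmalloc_sync_mappings(): everything allocated so far is synced. */
static void model_vmalloc_sync_mappings(void)
{
	synced_gen = alloc_gen;
}

/* Stand-in for rcu_assign_pointer(): publish the table, return the old one. */
static struct pid_table *model_publish(struct pid_table *t)
{
	return atomic_exchange_explicit(&published_table, t,
					memory_order_acq_rel);
}

/* Would the instrumentation touch an unsynced mapping (a fault in the kernel)? */
static bool model_tracer_would_fault(void)
{
	struct pid_table *t = atomic_load_explicit(&published_table,
						   memory_order_acquire);

	return t != NULL && t->gen > synced_gen;
}

/* Allocate a new filter table and publish it, optionally syncing in between. */
static bool install_filter(bool sync_before_publish)
{
	struct pid_table *t = model_vzalloc();

	if (!t)
		return false;
	if (sync_before_publish)
		model_vmalloc_sync_mappings();	/* anywhere between alloc and publish */
	/* The kernel would defer freeing the old table past an RCU grace period. */
	free(model_publish(t));
	return true;
}
```

In this model, install_filter(true) leaves model_tracer_would_fault() returning false, while install_filter(false) publishes a table the "tracer" cannot safely touch until model_vmalloc_sync_mappings() runs. Whether the sync sits directly after the allocation or directly before the publish makes no difference to the outcome, which is the point of the "somewhere between" wording.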