On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner <t...@linutronix.de> wrote:
> We can use PCID to retain the TLBs across CR3 switches; including
> those now part of the user/kernel switch. This increases performance
> of kernel entry/exit at the cost of more expensive/complicated TLB
> flushing.
>
> Now that we have two address spaces, one for kernel and one for user
> space, we need two PCIDs per mm. We use the top PCID bit to indicate a
> user PCID (just like we use the PFN LSB for the PGD). Since we do TLB
> invalidation from kernel space, the existing code will only invalidate
> the kernel PCID, we augment that by marking the corresponding user
> PCID invalid, and upon switching back to userspace, use a flushing CR3
> write for the switch.
>
> In order to access the user_pcid_flush_mask we use PER_CPU storage,
> which means the previously established SWAPGS vs CR3 ordering is now
> mandatory and required.
>
> Having to do this memory access does require additional registers,
> most sites have a functioning stack and we can spill one (RAX), sites
> without functional stack need to otherwise provide the second scratch
> register.
>
> Note: PCID is generally available on Intel Sandybridge and later CPUs.
> Note: Up until this point TLB flushing was broken in this series.
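Before I get to my question, here's how I read the scheme above, as a
standalone toy model (plain userspace C, not kernel code). Only
user_pcid_flush_mask and the top-PCID-bit-means-user trick come from the
changelog; the toy_ names, the bit position, and the kernel-PCID mapping
are my assumptions for illustration:

	/* toy_user_pcid.c -- standalone model, not the actual patch */
	#include <stdint.h>
	#include <stdbool.h>
	#include <stdio.h>

	#define TOY_USER_PCID_BIT	11	/* "top PCID bit" marking the user PCID (assumed) */

	/* per-CPU in the real code; a single copy here */
	static uint16_t user_pcid_flush_mask;

	static uint16_t toy_kern_pcid(uint16_t asid) { return asid + 1; }

	static uint16_t toy_user_pcid(uint16_t asid)
	{
		return toy_kern_pcid(asid) | (1u << TOY_USER_PCID_BIT);
	}

	/*
	 * Kernel-side TLB invalidation only touches the kernel PCID, so all
	 * we can do for the user PCID is remember that it is stale...
	 */
	static void toy_invalidate_user_asid(uint16_t asid)
	{
		user_pcid_flush_mask |= 1u << asid;
	}

	/*
	 * ...and on the next return to usermode, do a flushing CR3 write for
	 * that ASID instead of the cheap non-flushing one.
	 */
	static bool toy_user_cr3_needs_flush(uint16_t asid)
	{
		bool flush = user_pcid_flush_mask & (1u << asid);

		user_pcid_flush_mask &= ~(1u << asid);
		return flush;
	}

	int main(void)
	{
		toy_invalidate_user_asid(2);
		printf("user PCID %#x flush? %d\n",
		       (unsigned)toy_user_pcid(2), toy_user_cr3_needs_flush(2));
		printf("second exit flush? %d\n", toy_user_cr3_needs_flush(2));
		return 0;
	}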
I haven't checked that hard which patch introduces this bug, but it seems that, with this applied, nothing propagates non-mm-switch-related flushes to usermode. Shouldn't flush_tlb_func_common() contain a call to invalidate_user_asid() near the bottom (rough sketch below)? Alternatively, it could be in local_flush_tlb() and __flush_tlb_single() (or whatever the hell the flush-one-usermode-TLB function ends up being called).

Also, on a somewhat related note, __flush_tlb_single() is called from both flush_tlb_func_common() and do_kernel_range_flush(). That sounds wrong.
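To make the first point concrete, this is roughly the shape I have in mind,
continuing the toy model above. loaded_mm_asid here just stands for whatever
tracks the currently loaded ASID; I haven't checked the actual field name in
this series, and the real flush_tlb_func_common() obviously does a lot more
than this:

	static uint16_t loaded_mm_asid;		/* placeholder, see above */

	static void toy_flush_tlb_func_common(void)
	{
		/*
		 * ... the existing logic: full flush or a __flush_tlb_single()
		 * loop, all of which only hits the *kernel* PCID ...
		 */

		/*
		 * Without something like this near the bottom, a flush that
		 * isn't part of an mm switch never reaches the user PCID:
		 */
		toy_invalidate_user_asid(loaded_mm_asid);
	}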