On Fri, Sep 26, 2014 at 04:31:17PM -0700, Andi Kleen wrote: > From: Andi Kleen <a...@linux.intel.com> > > When copy_from_user_nmi faults the copy_user_tail code ends > up "replaying" the page faults to compute the exact tail bytes, > (added with 112958).
That is a wrong way to quote commits in two ways; 1) Linus 'requires' you use 12 character abreviated hashes (because we've already seen collisions with the default 8), yet you use 6. 2) the recommended quoting style is: 1129585a08ba ("x86: introduce copy_user_handle_tail() routine") You _should_ know this. > So we do an expensive page fault. And then we do it *again*. > > This ends up being very expensive in the PMI handler for any > page fault on a stack access, and is one the more common > causes for the NMI handler exceeding its runtime limit. > > 1) 0.109 us | copy_from_user_nmi(); > 1) | copy_from_user_nmi() { > 1) | __do_page_fault() { > 1) | bad_area_nosemaphore() { > 1) | __bad_area_nosemaphore() { > 1) | no_context() { > 1) | fixup_exception() { > 1) | search_exception_tables() { > 1) 0.079 us | search_extable(); > 1) 0.409 us | } > 1) 0.757 us | } > 1) 1.106 us | } > 1) 1.466 us | } > 1) 1.793 us | } > 1) 2.233 us | } > 1) | copy_user_handle_tail() { > 1) | __do_page_fault() { > 1) | bad_area_nosemaphore() { > 1) | __bad_area_nosemaphore() { > 1) | no_context() { > 1) | fixup_exception() { > 1) | search_exception_tables() { > 1) 0.060 us | search_extable(); > 1) 0.412 us | } > 1) 0.764 us | } > 1) 1.074 us | } > 1) 1.389 us | } > 1) 1.665 us | } > 1) 2.002 us | } > 1) 2.784 us | } > 1) 6.230 us | } > > The NMI code actually doesn't care about the exact tail value. It only > needs to know if a fault happened (!= 0) For now, changing the semantics of the function seems like a sure way to fail in the future though. > So check for in_nmi() in copy_user_tail and don't bother with the exact > tail check. This way we save the extra ~2.7us. > > In theory we could also duplicate the whole copy_*_ path for cases > where the caller doesn't care about the exact bytes. But that > seems overkill for just this issue, and I'm not sure anyone > else cares about how fast this is. The simpler check works > as well for now. So I don't get that code, but why not fix it in general? Taking two faults seems silly. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/