On Mon, Aug 31, 2015 at 6:19 PM, Andy Lutomirski <l...@amacapital.net> wrote:
>
> On Sun, Aug 30, 2015 at 7:52 PM, Andy Lutomirski <l...@amacapital.net> wrote:
>>
>> On Sun, Aug 30, 2015 at 2:18 PM, Brian Gerst <brge...@gmail.com> wrote:
>> > On Sat, Aug 29, 2015 at 12:10 PM, Andy Lutomirski <l...@amacapital.net>
>> > wrote:
>> >> On Sat, Aug 29, 2015 at 8:20 AM, Brian Gerst <brge...@gmail.com> wrote:
>> >>> This patch set contains several cleanups to the 32-bit VDSO.  The
>> >>> main change is to only build one VDSO image, and select the syscall
>> >>> entry point at runtime.
>> >>
>> >> Oh no, we have dueling patches!
>> >>
>> >> I have a 2/3 finished series that cleans up the AT_SYSINFO mess
>> >> differently, as I outlined earlier.  I've only done the compat and
>> >> common bits (no 32-bit native support quite yet), and it enters
>> >> successfully on Intel using SYSENTER and on (fake) AMD using SYSCALL.
>> >> The SYSRET bit isn't there yet.
>> >>
>> >> Other than some ifdeffery, the final system_call.S looks like this:
>> >>
>> >> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/tree/arch/x86/entry/vdso/vdso32/system_call.S?h=x86/entry_compat
>> >>
>> >> The meat is (sorry for whitespace damage):
>> >>
>> >>     .text
>> >>     .globl __kernel_vsyscall
>> >>     .type __kernel_vsyscall,@function
>> >>     ALIGN
>> >> __kernel_vsyscall:
>> >>     CFI_STARTPROC
>> >>     /*
>> >>      * Reshuffle regs so that all of any of the entry instructions
>> >>      * will preserve enough state.
>> >>      */
>> >>     pushl %edx
>> >>     CFI_ADJUST_CFA_OFFSET 4
>> >>     CFI_REL_OFFSET edx, 0
>> >>     pushl %ecx
>> >>     CFI_ADJUST_CFA_OFFSET 4
>> >>     CFI_REL_OFFSET ecx, 0
>> >>     movl %esp, %ecx
>> >>
>> >> #ifdef CONFIG_X86_64
>> >>     /* If SYSENTER is available, use it. */
>> >>     ALTERNATIVE_2 "", "sysenter", X86_FEATURE_SYSENTER32, \
>> >>                   "syscall", X86_FEATURE_SYSCALL32
>> >> #endif
>> >>
>> >>     /* Enter using int $0x80 */
>> >>     movl (%esp), %ecx
>> >>     int $0x80
>> >> GLOBAL(int80_landing_pad)
>> >>
>> >>     /* Restore ECX and EDX in case they were clobbered. */
>> >>     popl %ecx
>> >>     CFI_RESTORE ecx
>> >>     CFI_ADJUST_CFA_OFFSET -4
>> >>     popl %edx
>> >>     CFI_RESTORE edx
>> >>     CFI_ADJUST_CFA_OFFSET -4
>> >>     ret
>> >>     CFI_ENDPROC
>> >>
>> >>     .size __kernel_vsyscall,.-__kernel_vsyscall
>> >>     .previous
>> >>
>> >> And that's it.
>> >>
>> >> What do you think?  This comes with massively cleaned up kernel-side
>> >> asm as well as a test case that actually validates the CFI directives.
>> >>
>> >> Certainly, a bunch of your patches make sense regardless, and I'll
>> >> review them and add them to my queue soon.
>> >>
>> >> --Andy
>> >
>> > How does the performance compare to the original?  Looking at the
>> > disassembly, there are two added function calls, and it reloads the
>> > args from the stack instead of just shuffling registers.
>>
>> The replacement is dramatically faster, which means I probably
>> benchmarked it wrong.  I'll try again in a day or two.
>
>
> It's enough slower to be problematic.  I need to figure out how to trace it
> properly.  (Hmm?  Maybe it's time to learn how to get perf on the host to
> trace a KVM guest.)
>
> Everything is and was hilariously slow with context tracking on.  That needs
> to get fixed, and hopefully once this entry stuff is done someone will do the
> other end of it.
>

I got random errors from perf kvm, but I think I found at least part
of the issue.  The two irqs_disabled() calls in common.c are kind of
expensive.  I should disable them on non-lockdep kernels.

The context tracking hooks are also too expensive, even when disabled.
I should do something to optimize those.  Hello, static keys?  This
doesn't affect syscalls, though.

With context tracking off and the irqs_disabled checks commented out,
we're probably doing well enough.  We can always tweak the C code and
aggressively force inlining if we want a few cycles back.

--Andy