On Thu, Sep 10, 2015 at 12:04 PM, Denys Vlasenko <dvlas...@redhat.com> wrote: > On 09/10/2015 12:01 AM, Andy Lutomirski wrote: >> On Mon, Sep 7, 2015 at 8:56 AM, Denys Vlasenko <dvlas...@redhat.com> wrote: >>> This new test checks that all x86 registers are preserved across >>> 32-bit syscalls. It tests syscalls through VDSO (if available) >>> and through INT 0x80, normally and under ptrace. >>> >>> If kernel is a 64-bit one, high registers (r8..r15) are poisoned >>> before the syscall is called and are checked afterwards. >>> >>> They must be either preserved, or cleared to zero (but r11 is special); >>> r12..15 must be preserved for INT 0x80. >>> >>> EFLAGS is checked for changes too, but change there is not >>> considered to be a bug (paravirt kernels do not preserve >>> arithmetic flags). >>> >>> Run-tested on 64-bit kernel: >>> >>> $ ./test_syscall_vdso_32 >>> [RUN] Executing 6-argument 32-bit syscall via VDSO >>> [OK] Arguments are preserved across syscall >>> [NOTE] R11 has changed:0000000000200ed7 - assuming clobbered by SYSRET insn >>> [OK] R8..R15 did not leak kernel data >>> [RUN] Executing 6-argument 32-bit syscall via INT 80 >>> [OK] Arguments are preserved across syscall >>> [OK] R8..R15 did not leak kernel data >>> [RUN] Running tests under ptrace >>> [RUN] Executing 6-argument 32-bit syscall via VDSO >>> [OK] Arguments are preserved across syscall >>> [OK] R8..R15 did not leak kernel data >>> [RUN] Executing 6-argument 32-bit syscall via INT 80 >>> [OK] Arguments are preserved across syscall >>> [OK] R8..R15 did not leak kernel data >>> >>> On 32-bit paravirt kernel: >>> >>> $ ./test_syscall_vdso_32 >>> [NOTE] Not a 64-bit kernel, won't test R8..R15 leaks >>> [RUN] Executing 6-argument 32-bit syscall via VDSO >>> [WARN] Flags before=0000000000200ed7 id 0 00 o d i s z 0 a 0 p 1 c >>> [WARN] Flags after=0000000000200246 id 0 00 i z 0 0 p 1 >>> [WARN] Flags change=0000000000000c91 0 00 o d s 0 a 0 0 c >>> [OK] Arguments are preserved across syscall >>> [RUN] Executing 6-argument 32-bit syscall via INT 80 >>> [OK] Arguments are preserved across syscall >>> [RUN] Running tests under ptrace >>> [RUN] Executing 6-argument 32-bit syscall via VDSO >>> [OK] Arguments are preserved across syscall >>> [RUN] Executing 6-argument 32-bit syscall via INT 80 >>> [OK] Arguments are preserved across syscall >>> >>> Signed-off-by: Denys Vlasenko <dvlas...@redhat.com> >>> CC: Linus Torvalds <torva...@linux-foundation.org> >>> CC: Steven Rostedt <rost...@goodmis.org> >>> CC: Ingo Molnar <mi...@kernel.org> >>> CC: Borislav Petkov <b...@alien8.de> >>> CC: "H. Peter Anvin" <h...@zytor.com> >>> CC: Andy Lutomirski <l...@amacapital.net> >>> CC: Oleg Nesterov <o...@redhat.com> >>> CC: Frederic Weisbecker <fweis...@gmail.com> >>> CC: Alexei Starovoitov <a...@plumgrid.com> >>> CC: Will Drewry <w...@chromium.org> >>> CC: Kees Cook <keesc...@chromium.org> >>> CC: x...@kernel.org >>> CC: linux-kernel@vger.kernel.org >> >> Acked-by: Andy Lutomirski <l...@kernel.org> >> >> with minor caveats below, none of which are show-stoppers... >> >>> + /* INT80 syscall entrypoint can be used by >>> + * 64-bit programs too, unlike SYSCALL/SYSENTER. >>> + * Therefore it must preserve R12+ >>> + * (they are callee-saved registers in 64-bit C >>> ABI). >>> + * >>> + * This was probably historically not intended, >>> + * but R8..11 are clobbered (cleared to 0). >>> + * IOW: they are the only registers which aren't >>> + * preserved across INT80 syscall. >>> + */ >>> + if (*r64 == 0 && num <= 11) >>> + continue; >> >> Ugh. I'll change my big entry patchset to preserve these and maybe to >> preserve all of the 64-bit regs. > > If you do that, this won't change the ABI: we don't _promise_ > to save them. If we accidentally do, that means nothing. > > If you do that, the test won't fail. The code above does > not require regs to be 0 - there is further code which > also allow them to be unchanged. > > (I'm not very comfortable about additional six push/pops > which are necessary for this to happen. I'm surprised > maintainers tentatively agreed to that - > I was grilled and asked to prove with measurements > that *one* additional push+pop wasn't adding significant overhead).
I suspect that I need to make the series faster. Also, int $0x80 isn't a fast path for any legitimate use case except Debian, and I would argue that Debian is just buggy. --Andy -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/