On Thu, Feb 13, 2020 at 5:10 AM Uros Bizjak <ubiz...@gmail.com> wrote: > > On Thu, Feb 13, 2020 at 1:42 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote: > > > On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > > > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote: > > > > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > > > > > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubiz...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.to...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak > > > > > > > > <ubiz...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.to...@gmail.com> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > Since nested function isn't only called directly, there is > > > > > > > > > > ENDBR32 at > > > > > > > > > > function entry and we need to skip it for direct jump in > > > > > > > > > > trampoline. > > > > > > > > > > > > > > > > > > Hm, I'm afraid I don't understand this comment. Can you > > > > > > > > > perhaps rephrase it? > > > > > > > > > > > > > > > > > > > > > > > > > ix86_trampoline_init has > > > > > > > > > > > > > > > > /* Compute offset from the end of the jmp to the target > > > > > > > > function. > > > > > > > > In the case in which the trampoline stores the static > > > > > > > > chain on > > > > > > > > the stack, we need to skip the first insn which pushes > > > > > > > > the > > > > > > > > (call-saved) register static chain; this push is 1 > > > > > > > > byte. */ > > > > > > > > offset += 5; > > > > > > > > disp = expand_binop (SImode, sub_optab, fnaddr, > > > > > > > > plus_constant (Pmode, XEXP (m_tramp, > > > > > > > > 0), > > > > > > > > offset - (MEM_P > > > > > > > > (chain) ? 1 : 0)), > > > > > > > > NULL_RTX, 1, OPTAB_DIRECT); > > > > > > > > emit_move_insn (mem, disp); > > > > > > > > > > > > > > > > Without CET, we got > > > > > > > > > > > > > > > > 0000011 <bar.1878>: > > > > > > > > 11: 56 push %esi > > > > > > > > 12: 55 push %ebp <<<<<< trampoline > > > > > > > > jumps here. > > > > > > > > 13: 89 e5 mov %esp,%ebp > > > > > > > > 15: 83 ec 08 sub $0x8,%esp > > > > > > > > > > > > > > > > With CET, if bar isn't only called directly, we got > > > > > > > > > > > > > > > > 00000015 <bar.1878>: > > > > > > > > 15: f3 0f 1e fb endbr32 > > > > > > > > 19: 56 push %esi > > > > > > > > 1a: 55 push %ebp <<<<<<<< trampoline > > > > > > > > jumps here. > > > > > > > > 1b: 89 e5 mov %esp,%ebp > > > > > > > > 1d: 83 ec 08 sub $0x8,%esp > > > > > > > > > > > > > > > > We need to add 4 bytes for trampoline to skip endbr32. > > > > > > > > > > > > > > > > Here is the updated patch to check if nested function isn't only > > > > > > > > called directly, > > > > > > > > > > > > > > Please figure out the final patch. I don't want to waste my time > > > > > > > reviewing different version every half hour. Ping me in a couple > > > > > > > of > > > > > > > days. > > > > > > > > > > > > This is the final version: > > > > > > > > > > > > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html > > > > > > > > > > > > You can try the testcase in the patch on any machine with CET > > > > > > binutils > > > > > > since ENDBR32 is nop on none-CET machines. Without this patch, > > > > > > the test will fail. > > > > > > > > > > Please rephrase the comment. I don't understand what it tries to say. > > > > > > > > > > > > > Here is the patch with updated comments. OK for master and 8/9 > > > > branches? > > > > > > > > Thanks. > > > > > > > > H.J. > > > > --- > > > > Since pass_insert_endbranch inserts ENDBR32 at entry of the target > > > > function if it may be called indirectly, we also need to skip the > > > > 4-byte ENDBR32 for direct jump in trampoline if it is the case. > > > > > > " > > > Skip ENDBR32 at the entry if called indirectly. > > > " > > > pass_insert_endbranch inserts ENDBR at the entry of the target > > > function. We need to skip > > > > > > > > Tested on Linux/x86-64 CET machine with and without -m32. > > > > > > > > gcc/ > > > > > > > > PR target/93656 > > > > * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at > > > > the target function entry. > > > > > > > > gcc/testsuite/ > > > > > > > > PR target/93656 > > > > * gcc.target/i386/pr93656.c: New test. > > > > --- > > > > gcc/config/i386/i386.c | 9 ++++++++- > > > > gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++ > > > > 2 files changed, 12 insertions(+), 1 deletion(-) > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c > > > > > > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > > > > index 44bc0e0176a..52640b74cc8 100644 > > > > --- a/gcc/config/i386/i386.c > > > > +++ b/gcc/config/i386/i386.c > > > > @@ -16839,9 +16839,16 @@ ix86_trampoline_init (rtx m_tramp, tree > > > > fndecl, rtx chain_value) > > > > the stack, we need to skip the first insn which pushes the > > > > (call-saved) register static chain; this push is 1 byte. */ > > > > offset += 5; > > > > + int skip = MEM_P (chain) ? 1 : 0; > > > > > > offset += 5 - MEM_P (chain) ? 1: 0; > > > > This is done on purpose sinc there is > > > > gcc_assert (offset <= TRAMPOLINE_SIZE); > > > > after it. We shouldn't update offset. > > > > > > > > > + /* Since pass_insert_endbranch inserts ENDBR32 at entry of the > > > > + target function if it may be called indirectly, we also need > > > > + to skip the 4-byte ENDBR32 if it is the case. */ > > > > > > /* Skip ENDBR32 at the entry of the target function when called > > > indirectly. */ > > > > Fixed. > > > > > > > > > + if (need_endbr > > > > + && !cgraph_node::get (fndecl)->only_called_directly_p ()) > > > > + skip += 4; > > > > > > offset += 4; > > > > > > > disp = expand_binop (SImode, sub_optab, fnaddr, > > > > plus_constant (Pmode, XEXP (m_tramp, 0), > > > > - offset - (MEM_P (chain) ? 1 : > > > > 0)), > > > > + offset - skip), > > > > > > plus_constant (Pmode, XEXP (m_tramp, 0), offset), > > > > > > Uros. > > > > > > > Here is the updated patch. OK for master and 8/9 branches? > > > > Thanks. > > > > > > H.J. > > -- > > Skip ENDBR32 at the entry of the target function when called indirectly. > > Hm, this was sent too early, I was trying to say: > > "Skip ENDBR32 at the entry of target function when initializing trampoline." > > > > > Tested on Linux/x86-64 CET machine with and without -m32. > > > > gcc/ > > > > PR target/93656 > > * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at > > the target function entry. > > > > gcc/testsuite/ > > > > PR target/93656 > > * gcc.target/i386/pr93656.c: New test. > > LGTM for mainline, please wait a week before backporting. > > Uros. > > > --- > > gcc/config/i386/i386.c | 8 +++++++- > > gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++ > > 2 files changed, 11 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c > > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > > index 44bc0e0176a..3f3dcd53eb2 100644 > > --- a/gcc/config/i386/i386.c > > +++ b/gcc/config/i386/i386.c > > @@ -16839,9 +16839,15 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, > > rtx chain_value) > > the stack, we need to skip the first insn which pushes the > > (call-saved) register static chain; this push is 1 byte. */ > > offset += 5; > > + int skip = MEM_P (chain) ? 1 : 0; > > + /* Skip ENDBR32 at the entry of the target function when called > > + indirectly. */ > > Just say "Skip ENDBR32 at the entry of the target function." > > The condition below is clear, the purpose is not. >
This is the patch I am checking in. I will backport it next week. Thanks. -- H.J.
From 66e602cc3d1ada59013f99e2a9a629135c369b22 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" <hjl.to...@gmail.com> Date: Mon, 10 Feb 2020 11:10:52 -0800 Subject: [PATCH] i386: Skip ENDBR32 at the target function entry Skip ENDBR32 at the target function entry when initializing trampoline. Tested on Linux/x86-64 CET machine with and without -m32. gcc/ PR target/93656 * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at the target function entry. gcc/testsuite/ PR target/93656 * gcc.target/i386/pr93656.c: New test. --- gcc/config/i386/i386.c | 7 ++++++- gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++ 2 files changed, 10 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 44bc0e0176a..dac7a3fc5fd 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -16839,9 +16839,14 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value) the stack, we need to skip the first insn which pushes the (call-saved) register static chain; this push is 1 byte. */ offset += 5; + int skip = MEM_P (chain) ? 1 : 0; + /* Skip ENDBR32 at the entry of the target function. */ + if (need_endbr + && !cgraph_node::get (fndecl)->only_called_directly_p ()) + skip += 4; disp = expand_binop (SImode, sub_optab, fnaddr, plus_constant (Pmode, XEXP (m_tramp, 0), - offset - (MEM_P (chain) ? 1 : 0)), + offset - skip), NULL_RTX, 1, OPTAB_DIRECT); emit_move_insn (mem, disp); } diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c new file mode 100644 index 00000000000..f0ac8c8edaa --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr93656.c @@ -0,0 +1,4 @@ +/* { dg-do run { target { ia32 && cet } } } */ +/* { dg-options "-O2 -fcf-protection" } */ + +#include "pr67770.c" -- 2.24.1