On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote:
> On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote:
> > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > >
> > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> > > > >
> > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubiz...@gmail.com> 
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.to...@gmail.com> 
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Since nested function isn't only called directly, there is 
> > > > > > > > ENDBR32 at
> > > > > > > > function entry and we need to skip it for direct jump in 
> > > > > > > > trampoline.
> > > > > > >
> > > > > > > Hm, I'm afraid I don't understand this comment. Can you perhaps 
> > > > > > > rephrase it?
> > > > > > >
> > > > > >
> > > > > > ix86_trampoline_init has
> > > > > >
> > > > > >      /* Compute offset from the end of the jmp to the target 
> > > > > > function.
> > > > > >          In the case in which the trampoline stores the static 
> > > > > > chain on
> > > > > >          the stack, we need to skip the first insn which pushes the
> > > > > >          (call-saved) register static chain; this push is 1 byte.  
> > > > > > */
> > > > > >       offset += 5;
> > > > > >       disp = expand_binop (SImode, sub_optab, fnaddr,
> > > > > >                            plus_constant (Pmode, XEXP (m_tramp, 0),
> > > > > >                                           offset - (MEM_P (chain) ? 
> > > > > > 1 : 0)),
> > > > > >                            NULL_RTX, 1, OPTAB_DIRECT);
> > > > > >       emit_move_insn (mem, disp);
> > > > > >
> > > > > > Without CET, we got
> > > > > >
> > > > > > 0000011 <bar.1878>:
> > > > > >   11: 56                    push   %esi
> > > > > >   12: 55                    push   %ebp   <<<<<< trampoline jumps 
> > > > > > here.
> > > > > >   13: 89 e5                mov    %esp,%ebp
> > > > > >   15: 83 ec 08              sub    $0x8,%esp
> > > > > >
> > > > > > With CET, if bar isn't only called directly, we got
> > > > > >
> > > > > > 00000015 <bar.1878>:
> > > > > >   15: f3 0f 1e fb          endbr32
> > > > > >   19: 56                    push   %esi
> > > > > >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps 
> > > > > > here.
> > > > > >   1b: 89 e5                mov    %esp,%ebp
> > > > > >   1d: 83 ec 08              sub    $0x8,%esp
> > > > > >
> > > > > > We need to add 4 bytes for trampoline to skip endbr32.
> > > > > >
> > > > > > Here is the updated patch to check if nested function isn't only
> > > > > > called directly,
> > > > >
> > > > > Please figure out the final patch. I don't want to waste my time
> > > > > reviewing different version every half hour. Ping me in a couple of
> > > > > days.
> > > >
> > > > This is the final version:
> > > >
> > > > https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html
> > > >
> > > > You can try the testcase in the patch on any machine with CET binutils
> > > > since ENDBR32 is nop on none-CET machines.  Without this patch,
> > > > the test will fail.
> > >
> > > Please rephrase the comment. I don't understand what it tries to say.
> > >
> >
> > Here is the patch with updated comments.  OK for master and 8/9 branches?
> >
> > Thanks.
> >
> > H.J.
> > ---
> > Since pass_insert_endbranch inserts ENDBR32 at entry of the target
> > function if it may be called indirectly, we also need to skip the
> > 4-byte ENDBR32 for direct jump in trampoline if it is the case.
> 
> "
> Skip ENDBR32 at the entry if called indirectly.
> "
> pass_insert_endbranch inserts ENDBR at the entry of the target
> function.  We need to skip
> >
> > Tested on Linux/x86-64 CET machine with and without -m32.
> >
> > gcc/
> >
> >         PR target/93656
> >         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
> >         the target function entry.
> >
> > gcc/testsuite/
> >
> >         PR target/93656
> >         * gcc.target/i386/pr93656.c: New test.
> > ---
> >  gcc/config/i386/i386.c                  | 9 ++++++++-
> >  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
> >  2 files changed, 12 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c
> >
> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > index 44bc0e0176a..52640b74cc8 100644
> > --- a/gcc/config/i386/i386.c
> > +++ b/gcc/config/i386/i386.c
> > @@ -16839,9 +16839,16 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, 
> > rtx chain_value)
> >          the stack, we need to skip the first insn which pushes the
> >          (call-saved) register static chain; this push is 1 byte.  */
> >        offset += 5;
> > +      int skip = MEM_P (chain) ? 1 : 0;
> 
> offset += 5 - MEM_P (chain) ? 1: 0;

This is done on purpose sinc there is

  gcc_assert (offset <= TRAMPOLINE_SIZE);

after it.  We shouldn't update offset.

> 
> > +      /* Since pass_insert_endbranch inserts ENDBR32 at entry of the
> > +        target function if it may be called indirectly, we also need
> > +        to skip the 4-byte ENDBR32 if it is the case.  */
> 
> /* Skip ENDBR32 at the entry of the target function when called indirectly.  
> */

Fixed.

> 
> > +      if (need_endbr
> > +         && !cgraph_node::get (fndecl)->only_called_directly_p ())
> > +       skip += 4;
> 
> offset += 4;
> 
> >        disp = expand_binop (SImode, sub_optab, fnaddr,
> >                            plus_constant (Pmode, XEXP (m_tramp, 0),
> > -                                         offset - (MEM_P (chain) ? 1 : 0)),
> > +                                         offset - skip),
> 
> plus_constant (Pmode, XEXP (m_tramp, 0), offset),
> 
> Uros.
> 

Here is the updated patch.  OK for master and 8/9 branches?

Thanks.


H.J.
--
Skip ENDBR32 at the entry of the target function when called indirectly.

Tested on Linux/x86-64 CET machine with and without -m32.

gcc/

        PR target/93656
        * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
        the target function entry.

gcc/testsuite/

        PR target/93656
        * gcc.target/i386/pr93656.c: New test.
---
 gcc/config/i386/i386.c                  | 8 +++++++-
 gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 44bc0e0176a..3f3dcd53eb2 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16839,9 +16839,15 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx 
chain_value)
         the stack, we need to skip the first insn which pushes the
         (call-saved) register static chain; this push is 1 byte.  */
       offset += 5;
+      int skip = MEM_P (chain) ? 1 : 0;
+      /* Skip ENDBR32 at the entry of the target function when called
+        indirectly.  */
+      if (need_endbr
+         && !cgraph_node::get (fndecl)->only_called_directly_p ())
+       skip += 4;
       disp = expand_binop (SImode, sub_optab, fnaddr,
                           plus_constant (Pmode, XEXP (m_tramp, 0),
-                                         offset - (MEM_P (chain) ? 1 : 0)),
+                                         offset - skip),
                           NULL_RTX, 1, OPTAB_DIRECT);
       emit_move_insn (mem, disp);
     }
diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c 
b/gcc/testsuite/gcc.target/i386/pr93656.c
new file mode 100644
index 00000000000..f0ac8c8edaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr93656.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target { ia32 && cet } } } */
+/* { dg-options "-O2 -fcf-protection" } */
+
+#include "pr67770.c"
-- 
2.24.1

Reply via email to