On 01/03/2018 01:49 PM, Jeff Law wrote:
> On 01/03/2018 01:36 PM, Jakub Jelinek wrote:
>> On Wed, Jan 03, 2018 at 12:57:10PM -0700, Jeff Law wrote:
>>>> Ok for trunk if it passes bootstrap/regtest on x86_64 and i686-linux?
>>>>
>>>> 2018-01-03  Jakub Jelinek  <ja...@redhat.com>
>>>>
>>>>    PR target/83641
>>>>    * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
>>>>    noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
>>>>    only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
>>>>    and add REG_CFA_ADJUST_CFA notes in that case to both insns.
>>> I still question how useful this part is, but not enough to object given
>>
>> Reduces bloat in .eh_frame and when unwinding less work is needed.  The
>> patch isn't that large after all.
>>
>>> you've done the work.  I'll go ahead and commit both as a single unit.
>>
>> BTW, I've now bootstrapped/regtested on x86_64-linux and i686-linux your and
>> my patch, the only regression caused by your patch is:
>> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler popl\\t%esi
>> +FAIL: gcc.target/i386/stack-check-12.c scan-assembler pushl\\t%esi
>> on i686-linux, quite understandably, that is exactly the case you are
>> changing, there is %edi clobbered in the function and thus needs to be saved
>> and restored and so no push/pop of %esi is needed.
>> So, this testcase should be tweaked.
> I think the ASM can just be dropped.  It doesn't contribute anything to
> what Florian was trying to show.   Once we drop the ASM we should see
> those probes return.  I'll have to check that obviously.
Here's the final patch -- only changes are smashing Jakub's and my work
together, adding a comment and the stack-check-12 tweak mentioned above.

Committing to the trunk.

jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8df8f232d80..7aa0920fc65 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2017-01-03  Jakub Jelinek  <ja...@redhat.com>
+           Jeff Law  <l...@redhat.com>
+
+       PR target/83641
+       * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): For
+       noreturn probe, use gen_pop instead of ix86_emit_restore_reg_using_pop,
+       only set RTX_FRAME_RELATED_P on both the push and pop if cfa_reg is sp
+       and add REG_CFA_ADJUST_CFA notes in that case to both insns.
+
+       PR target/83641
+       * config/i386/i386.c (ix86_adjust_stack_and_probe_stack_clash): Do not
+       explicitly probe *sp in a noreturn function if there were any callee
+       register saves or frame pointer is needed.
+
 2018-01-03  Jakub Jelinek  <ja...@redhat.com>
 
        PR debug/83621
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 56baaa7e5e8..c363de93e02 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12217,21 +12217,39 @@ ix86_adjust_stack_and_probe_stack_clash (const 
HOST_WIDE_INT size)
      pointer could be anywhere in the guard page.  The safe thing
      to do is emit a probe now.
 
+     The probe can be avoided if we have already emitted any callee
+     register saves into the stack or have a frame pointer (which will
+     have been saved as well).  Those saves will function as implicit
+     probes.
+
      ?!? This should be revamped to work like aarch64 and s390 where
      we track the offset from the most recent probe.  Normally that
      offset would be zero.  For a noreturn function we would reset
      it to PROBE_INTERVAL - (STACK_BOUNDARY / BITS_PER_UNIT).   Then
      we just probe when we cross PROBE_INTERVAL.  */
-  if (TREE_THIS_VOLATILE (cfun->decl))
+  if (TREE_THIS_VOLATILE (cfun->decl)
+      && !(m->frame.nregs || m->frame.nsseregs || frame_pointer_needed))
     {
       /* We can safely use any register here since we're just going to push
         its value and immediately pop it back.  But we do try and avoid
         argument passing registers so as not to introduce dependencies in
         the pipeline.  For 32 bit we use %esi and for 64 bit we use %rax.  */
       rtx dummy_reg = gen_rtx_REG (word_mode, TARGET_64BIT ? AX_REG : SI_REG);
-      rtx_insn *insn = emit_insn (gen_push (dummy_reg));
-      RTX_FRAME_RELATED_P (insn) = 1;
-      ix86_emit_restore_reg_using_pop (dummy_reg);
+      rtx_insn *insn_push = emit_insn (gen_push (dummy_reg));
+      rtx_insn *insn_pop = emit_insn (gen_pop (dummy_reg));
+      m->fs.sp_offset -= UNITS_PER_WORD;
+      if (m->fs.cfa_reg == stack_pointer_rtx)
+       {
+         m->fs.cfa_offset -= UNITS_PER_WORD;
+         rtx x = plus_constant (Pmode, stack_pointer_rtx, -UNITS_PER_WORD);
+         x = gen_rtx_SET (stack_pointer_rtx, x);
+         add_reg_note (insn_push, REG_CFA_ADJUST_CFA, x);
+         RTX_FRAME_RELATED_P (insn_push) = 1;
+         x = plus_constant (Pmode, stack_pointer_rtx, UNITS_PER_WORD);
+         x = gen_rtx_SET (stack_pointer_rtx, x);
+         add_reg_note (insn_pop, REG_CFA_ADJUST_CFA, x);
+         RTX_FRAME_RELATED_P (insn_pop) = 1;
+       }
       emit_insn (gen_blockage ());
     }
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 2c66a0d007e..1b8e7ad6ea5 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2018-01-03  Jeff Law  <l...@redhat.com>
+
+       PR target/83641
+       * gcc.target/i386/stack-check-17.c: New test.
+       * gcc.target/i386/stack-check-12.c: Drop unnecessary asm.
+
 2018-01-03  Jakub Jelinek  <ja...@redhat.com>
 
        PR debug/83621
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-12.c 
b/gcc/testsuite/gcc.target/i386/stack-check-12.c
index 980416946df..74d3a26cad1 100644
--- a/gcc/testsuite/gcc.target/i386/stack-check-12.c
+++ b/gcc/testsuite/gcc.target/i386/stack-check-12.c
@@ -7,7 +7,6 @@ __attribute__ ((noreturn)) void exit (int);
 __attribute__ ((noreturn)) void
 f (void)
 {
-  asm volatile ("nop" ::: "edi");
   exit (1);
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-17.c 
b/gcc/testsuite/gcc.target/i386/stack-check-17.c
new file mode 100644
index 00000000000..d2ef83b348a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/stack-check-17.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection -mtune=generic 
-fomit-frame-pointer" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+
+int x0, x1;
+void f1 (void);
+void f2 (int, int);
+
+__attribute__ ((noreturn))
+void
+f3 (void)
+{ 
+  int y0 = x0;
+  int y1 = x1;
+  f1 ();
+  f2 (y0, y1);
+  while (1);
+}
+
+/* Verify no explicit probes.  */
+/* { dg-final { scan-assembler-not "or\[ql\]" } } */
+
+/* We also want to verify we did not use a push/pop sequence
+   to probe *sp as the callee register saves are sufficient
+   to probe *sp.
+
+   y0/y1 are live across the call and thus must be allocated
+   into either a stack slot or callee saved register.  The former
+   would be rather dumb.  So assume it does not happen.
+
+   So search for two/four pushes for the callee register saves/argument
+   pushes and no pops (since the function has no reachable epilogue).  */
+/* { dg-final { scan-assembler-times "push\[ql\]" 2 { target { ! ia32 } } } }  
*/
+/* { dg-final { scan-assembler-times "push\[ql\]" 4 { target { ia32 } } } }  */
+/* { dg-final { scan-assembler-not "pop" } } */
+

Reply via email to