The prior patch introduced -fstack-check=clash prologues for the x86. And yet we still saw large allocations in our testing.
It turns out combine-stack-adjustments would take

  allocate PROBE_INTERVAL
  probe
  allocate PROBE_INTERVAL
  probe
  allocate PROBE_INTERVAL
  probe
  allocate RESIDUAL

and turn that into

  allocate (3 * PROBE_INTERVAL) + residual
  probe
  probe
  probe

adjusting the address of the probes appropriately. Ugh.

This patch introduces a new note that the backend can attach to a stack
adjustment, which essentially tells c-s-a not to merge it into other
adjustments. There's an x86-specific test to verify the behavior.

Comments/Questions? Ok for the trunk?
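To illustrate the merging described above, here is a small standalone toy model in C. It is not GCC code: the names (struct op, emit_clash_prologue, combine_adjustments, the no_combine flag) are hypothetical, PROBE_INTERVAL is assumed to be 4096, and the flag only stands in for the new REG_STACK_CHECK note. The sketch shows that unmarked adjustments collapse into one large allocation followed by the probes, while marked adjustments stay interleaved with their probes.

```c
#include <assert.h>
#include <stdbool.h>

#define PROBE_INTERVAL 4096   /* assumed value, for illustration only */

enum op_kind { OP_ALLOC, OP_PROBE };

/* Hypothetical stand-in for an insn in the prologue.  */
struct op {
  enum op_kind kind;
  long bytes;       /* OP_ALLOC: bytes subtracted from sp */
  long offset;      /* OP_PROBE: the probe touches sp + offset */
  bool no_combine;  /* models the REG_STACK_CHECK note */
};

/* Emit allocate/probe pairs for each full interval, then the residual.
   MARK says whether each interval allocation carries the note.  */
static int
emit_clash_prologue (long size, bool mark, struct op *out)
{
  int n = 0;
  long i;

  assert (size >= 0);
  for (i = PROBE_INTERVAL; i <= size; i += PROBE_INTERVAL)
    {
      out[n++] = (struct op) { OP_ALLOC, PROBE_INTERVAL, 0, mark };
      out[n++] = (struct op) { OP_PROBE, 0, 0, false };
    }

  long residual = size - (i - PROBE_INTERVAL);
  if (residual > 0)
    out[n++] = (struct op) { OP_ALLOC, residual, 0, false };
  return n;
}

/* Toy version of the combining pass: fold each unmarked allocation into
   the nearest earlier unmarked one, moving the probes in between so they
   still touch the same addresses.  A marked allocation is left alone and
   acts as a barrier.  Works in place and returns the new length.  */
static int
combine_adjustments (struct op *ops, int n)
{
  int out = 0;
  int first = -1;   /* output index of the allocation absorbing later ones */

  for (int i = 0; i < n; i++)
    {
      struct op cur = ops[i];

      if (cur.kind == OP_ALLOC && !cur.no_combine && first >= 0)
        {
          ops[first].bytes += cur.bytes;
          for (int j = first + 1; j < out; j++)
            if (ops[j].kind == OP_PROBE)
              ops[j].offset += cur.bytes;
          continue;
        }

      if (cur.kind == OP_ALLOC)
        first = cur.no_combine ? -1 : out;
      ops[out++] = cur;
    }
  return out;
}

/* Count the ops of kind K among the first N entries.  */
static int
count_ops (const struct op *ops, int n, enum op_kind k)
{
  int c = 0;
  for (int i = 0; i < n; i++)
    c += ops[i].kind == k;
  return c;
}
```

As a usage example, take a frame of 13744 bytes. That figure is an assumption: it is the size of the two 859-element unsigned long arrays in the new test with 8-byte longs, ignoring any other frame contents. With mark set to false the model combines everything into one allocation followed by three probes. With mark set to true it keeps four allocations and three probes, which lines up with the test's expectation of four subq and three orq instructions.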

        * combine-stack-adj.c (combine_stack_adjustments_for_block): Do
        nothing for stack adjustments with REG_STACK_CHECK.
        * config/i386/i386.c (pro_epilogue_adjust_stack): Return insn.
        (ix86_adjust_stack_and_probe_stack_clash): Add REG_STACK_CHECK notes.
        * reg-notes.def (STACK_CHECK): New note.

testsuite/

        * gcc.target/i386/stack-check-11.c: New test.

commit f363b876ccbbc584db85510cd24b80349fcd8260
Author: Jeff Law <l...@torsion.usersys.redhat.com>
Date:   Wed Jun 28 12:36:49 2017 -0400

    Don't combine adjustments for stack probing

diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c
index 9ec14a3..82d6dba 100644
--- a/gcc/combine-stack-adj.c
+++ b/gcc/combine-stack-adj.c
@@ -508,6 +508,8 @@ combine_stack_adjustments_for_block (basic_block bb)
         continue;
 
       set = single_set_for_csa (insn);
+      if (set && find_reg_note (insn, REG_STACK_CHECK, NULL_RTX))
+        set = NULL_RTX;
       if (set)
         {
           rtx dest = SET_DEST (set);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7098f74..a737300 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13405,7 +13405,7 @@ ix86_add_queued_cfa_restore_notes (rtx insn)
    zero if %r11 register is live and cannot be freely used and positive
    otherwise.  */
 
-static void
+static rtx
 pro_epilogue_adjust_stack (rtx dest, rtx src, rtx offset,
                            int style, bool set_cfa)
 {
@@ -13496,6 +13496,7 @@ pro_epilogue_adjust_stack (rtx dest, rtx src, rtx offset,
       m->fs.sp_valid = valid;
       m->fs.sp_realigned = realigned;
     }
+  return insn;
 }
 
 /* Find an available register to be used as dynamic realign argument
@@ -13837,9 +13838,11 @@ ix86_adjust_stack_and_probe_stack_clash (const HOST_WIDE_INT size)
       for (i = PROBE_INTERVAL; i <= size; i += PROBE_INTERVAL)
         {
           /* Allocate PROBE_INTERVAL bytes.  */
-          pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
+          rtx insn
+            = pro_epilogue_adjust_stack (stack_pointer_rtx, stack_pointer_rtx,
                                      GEN_INT (-PROBE_INTERVAL), -1,
                                      m->fs.cfa_reg == stack_pointer_rtx);
+          add_reg_note (insn, REG_STACK_CHECK, const0_rtx);
 
           /* And probe at *sp.  */
           emit_stack_probe (stack_pointer_rtx);
diff --git a/gcc/reg-notes.def b/gcc/reg-notes.def
index 8734d26..18cf7e3 100644
--- a/gcc/reg-notes.def
+++ b/gcc/reg-notes.def
@@ -223,6 +223,10 @@ REG_NOTE (ARGS_SIZE)
    pseudo reg.  */
 REG_NOTE (RETURNED)
 
+/* Indicates the instruction is a stack check probe that should not
+   be combined with other stack adjustments.  */
+REG_NOTE (STACK_CHECK)
+
 /* Used to mark a call with the function decl called by the call.
    The decl might not be available in the call due to splitting of the call
    insn.  This note is a SYMBOL_REF.  */
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-11.c b/gcc/testsuite/gcc.target/i386/stack-check-11.c
new file mode 100644
index 0000000..c17b8c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/stack-check-11.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-check=clash" } */
+
+extern void arf (unsigned long int *, unsigned long int *);
+void
+frob ()
+{
+  unsigned long int num[859];
+  unsigned long int den[859];
+  arf (den, num);
+}
+
+/* { dg-final { scan-assembler-times "subq" 4 } } */
+/* { dg-final { scan-assembler-times "orq" 3 } } */
+