On 11/15/2011 10:07 PM, Maxim Kuvyrkov wrote: > On 30/10/2011, at 8:17 AM, Tom de Vries wrote: > >> Richard, >> >> I have a tentative fix for PR50764. > > Richard, > > Tom's patch is good (with the comments below addressed), and I would > appreciate you validating my review with your formal approval. >
Richard, Updated patch according to comments from Maxim. Added test-case. Bootstrapped and reg-tested on x86_64. Ok for trunk? Thanks, - Tom >> >> In the example from the test-case, -fsched2-use-superblocks moves an insn >> from >> block 4 to block 3. >> >> 2 >> bar >> | >> -------+----- >> / \ >> * * >> 5 *------------ 3 >> abort bar >> | >> | >> * >> 4 >> return >> >> >> The insn has a REG_CFA_DEF_CFA note and is frame-related. >> ... >> (insn/f 51 50 52 4 (set (reg:DI 39 r10) >> (mem/c:DI (plus:DI (reg/f:DI 6 bp) >> (const_int -8 [0xfffffffffffffff8])) [3 S8 A8])) pr50764.c:13 >> 62 {*movdi_internal_rex64} >> (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10) >> (nil))) >> ... >> >> This causes the assert in maybe_record_trace_start to trigger: >> ... >> /* We ought to have the same state incoming to a given trace no >> matter how we arrive at the trace. Anything else means we've >> got some kind of optimization error. */ >> gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row)); >> ... >> >> The assert does not occur with -fno-tree-tail-merge, but that is due to the >> following: >> - -fsched-use-superblocks does not handle dead labels explicitly >> - -freorder-blocks introduces a dead label, which is not removed until after >> sched2 >> - -ftree-tail-merge makes a difference in which block -freorder-blocks >> introduces the dead label. In the case of -ftree-tail-merge, the dead label >> is introduced at the start of block 3, and block 3 and 4 end up in the same >> ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at >> the >> start of block 4, and block 3 and 4 don't end up in the same ebb. >> >> attached untested patch fixes PR50764 in a similar way as the patch for >> PR49994, >> which is also about an ICE in maybe_record_trace_start with >> -fsched2-use-superblocks. >> >> The patch for PR49994 makes sure frame-related instructions are not moved >> past >> the following jump. >> >> Attached patch makes sure frame-related instructions are not moved past the >> preceding jump. >> >> Is this the way to fix this PR? > > Tom, > > Thank you for good analysis, your patch is the right way to go. > > Scheduler should not move frame-related insns from either prologue or > epilogue basic blocks. Currently sched-deps analysis handles the prologue > case, and your patch fixes the epilogue case. The primary reason why we > didn't hit the assert before is due to the fact that we do interblock > scheduling after reload only on few architectures. With single-block > scheduling after reload, which is what we do for most architectures, this > issue cannot arise. > >> >> Index: gcc/sched-deps.c >> =================================================================== >> --- gcc/sched-deps.c (revision 180521) >> +++ gcc/sched-deps.c (working copy) >> @@ -2812,8 +2812,13 @@ sched_analyze_insn (struct deps_desc *de >> during prologue generation and avoid marking the frame pointer setup >> as frame-related at all. */ >> if (RTX_FRAME_RELATED_P (insn)) >> - deps->sched_before_next_jump >> - = alloc_INSN_LIST (insn, deps->sched_before_next_jump); >> + { >> + deps->sched_before_next_jump >> + = alloc_INSN_LIST (insn, deps->sched_before_next_jump); > > This code is rather obscure, so additional comments would be helpful. Please > add something like "Make sure prologue INSN is scheduled before next jump." > before the first statement; and add something like "Make sure epilogue INSN > is not moved before preceding jumps." before the second statement. > >> + >> + if (deps->pending_jump_insns) >> + add_dependence (insn, XEXP (deps->pending_jump_insns, 0), REG_DEP_ANTI); > > Please use "add_dependence_list (insn, deps->pending_jump_insns, 1, > REG_DEP_ANTI);" instead. We want INSN to depend upon all of pending jumps, > not just one of them. The situation where pending_jump_insns has more than a > single jump does not happen in current setup of scheduling runs (as sched-rgn > does not do interblock scheduling after reload), but that may change in the > future. > > OK upon formal approval from Richard or other reviewer. > > Thank you, > > -- > Maxim Kuvyrkov > CodeSourcery / Mentor Graphics > 2011-11-16 Tom de Vries <t...@codesourcery.com> PR rtl-optimization/50764 * (sched_analyze_insn): Make sure frame-related insns are not moved past preceding jump. * (gcc.dg/pr50764.c): New test.
Index: gcc/sched-deps.c =================================================================== --- gcc/sched-deps.c (revision 181377) +++ gcc/sched-deps.c (working copy) @@ -2812,8 +2812,15 @@ sched_analyze_insn (struct deps_desc *de during prologue generation and avoid marking the frame pointer setup as frame-related at all. */ if (RTX_FRAME_RELATED_P (insn)) - deps->sched_before_next_jump - = alloc_INSN_LIST (insn, deps->sched_before_next_jump); + { + /* Make sure prologue insn is scheduled before next jump. */ + deps->sched_before_next_jump + = alloc_INSN_LIST (insn, deps->sched_before_next_jump); + + /* Make sure epilogue insn is scheduled after preceding jumps. */ + if (deps->pending_jump_insns) + add_dependence_list (insn, deps->pending_jump_insns, 1, REG_DEP_ANTI); + } if (code == COND_EXEC) { Index: gcc/testsuite/gcc.dg/pr50764.c =================================================================== --- /dev/null (new file) +++ gcc/testsuite/gcc.dg/pr50764.c (revision 0) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fsched2-use-superblocks -ftree-tail-merge" } */ + +typedef int aligned __attribute__ ((aligned (64))); +extern void abort (void); + +int bar (void *p); + +void +foo (void) +{ + char *p = __builtin_alloca (13); + aligned i; + + if (bar (p) || bar (&i)) + abort (); +}