On 11/15/2011 10:07 PM, Maxim Kuvyrkov wrote:
> On 30/10/2011, at 8:17 AM, Tom de Vries wrote:
> 
>> Richard,
>>
>> I have a tentative fix for PR50764.
> 
> Richard,
> 
> Tom's patch is good (with the comments below addressed), and I would 
> appreciate you validating my review with your formal approval.
> 

Richard,

Updated patch according to comments from Maxim. Added test-case. Bootstrapped
and reg-tested on x86_64.

Ok for trunk?

Thanks,
- Tom

>>
>> In the example from the test-case, -fsched2-use-superblocks moves an insn 
>> from
>> block 4 to block 3.
>>
>>           2
>>          bar
>>           |
>>    -------+-----
>>   /             \
>>  *               *
>>  5 *------------ 3
>> abort            bar
>>                  |
>>                  |
>>                  *
>>                  4
>>                return
>>
>>
>> The insn has a REG_CFA_DEF_CFA note and is frame-related.
>> ...
>> (insn/f 51 50 52 4 (set (reg:DI 39 r10)
>>        (mem/c:DI (plus:DI (reg/f:DI 6 bp)
>>                (const_int -8 [0xfffffffffffffff8])) [3 S8 A8])) pr50764.c:13
>> 62 {*movdi_internal_rex64}
>>     (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10)
>>        (nil)))
>> ...
>>
>> This causes the assert in maybe_record_trace_start to trigger:
>> ...
>>      /* We ought to have the same state incoming to a given trace no
>>       matter how we arrive at the trace.  Anything else means we've
>>       got some kind of optimization error.  */
>>      gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
>> ...
>>
>> The assert does not occur with -fno-tree-tail-merge, but that is due to the
>> following:
>> - -fsched-use-superblocks does not handle dead labels explicitly
>> - -freorder-blocks introduces a dead label, which is not removed until after
>>  sched2
>> - -ftree-tail-merge makes a difference in which block -freorder-blocks
>>  introduces the dead label. In the case of -ftree-tail-merge, the dead label
>>  is introduced at the start of block 3, and block 3 and 4 end up in the same
>>  ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at 
>> the
>>  start of block 4, and block 3 and 4 don't end up in the same ebb.
>>
>> attached untested patch fixes PR50764 in a similar way as the patch for 
>> PR49994,
>> which is also about an ICE in maybe_record_trace_start with
>> -fsched2-use-superblocks.
>>
>> The patch for PR49994 makes sure frame-related instructions are not moved 
>> past
>> the following jump.
>>
>> Attached patch makes sure frame-related instructions are not moved past the
>> preceding jump.
>>
>> Is this the way to fix this PR?
> 
> Tom,
> 
> Thank you for good analysis, your patch is the right way to go.
> 
> Scheduler should not move frame-related insns from either prologue or 
> epilogue basic blocks.  Currently sched-deps analysis handles the prologue 
> case, and your patch fixes the epilogue case.  The primary reason why we 
> didn't hit the assert before is due to the fact that we do interblock 
> scheduling after reload only on few architectures.  With single-block 
> scheduling after reload, which is what we do for most architectures, this 
> issue cannot arise.
> 
>>
>> Index: gcc/sched-deps.c
>> ===================================================================
>> --- gcc/sched-deps.c (revision 180521)
>> +++ gcc/sched-deps.c (working copy)
>> @@ -2812,8 +2812,13 @@ sched_analyze_insn (struct deps_desc *de
>>      during prologue generation and avoid marking the frame pointer setup
>>      as frame-related at all.  */
>>   if (RTX_FRAME_RELATED_P (insn))
>> -    deps->sched_before_next_jump
>> -      = alloc_INSN_LIST (insn, deps->sched_before_next_jump);
>> +    {
>> +      deps->sched_before_next_jump
>> +    = alloc_INSN_LIST (insn, deps->sched_before_next_jump);
> 
> This code is rather obscure, so additional comments would be helpful.  Please 
> add something like "Make sure prologue INSN is scheduled before next jump." 
> before the first statement; and add something like "Make sure epilogue INSN 
> is not moved before preceding jumps." before the second statement.
> 
>> +
>> +      if (deps->pending_jump_insns)
>> +    add_dependence (insn, XEXP (deps->pending_jump_insns, 0), REG_DEP_ANTI);
> 
> Please use "add_dependence_list (insn, deps->pending_jump_insns, 1, 
> REG_DEP_ANTI);" instead.  We want INSN to depend upon all of pending jumps, 
> not just one of them.  The situation where pending_jump_insns has more than a 
> single jump does not happen in current setup of scheduling runs (as sched-rgn 
> does not do interblock scheduling after reload), but that may change in the 
> future.
> 
> OK upon formal approval from Richard or other reviewer.
> 
> Thank you,
> 
> --
> Maxim Kuvyrkov
> CodeSourcery / Mentor Graphics
> 

2011-11-16  Tom de Vries  <t...@codesourcery.com>

        PR rtl-optimization/50764
        * (sched_analyze_insn): Make sure frame-related insns are not moved past
        preceding jump.

        * (gcc.dg/pr50764.c): New test.


Index: gcc/sched-deps.c
===================================================================
--- gcc/sched-deps.c (revision 181377)
+++ gcc/sched-deps.c (working copy)
@@ -2812,8 +2812,15 @@ sched_analyze_insn (struct deps_desc *de
      during prologue generation and avoid marking the frame pointer setup
      as frame-related at all.  */
   if (RTX_FRAME_RELATED_P (insn))
-    deps->sched_before_next_jump
-      = alloc_INSN_LIST (insn, deps->sched_before_next_jump);
+    {
+      /* Make sure prologue insn is scheduled before next jump.  */
+      deps->sched_before_next_jump
+	= alloc_INSN_LIST (insn, deps->sched_before_next_jump);
+
+      /* Make sure epilogue insn is scheduled after preceding jumps.  */
+      if (deps->pending_jump_insns)
+	add_dependence_list (insn, deps->pending_jump_insns, 1, REG_DEP_ANTI);
+    }
 
   if (code == COND_EXEC)
     {
Index: gcc/testsuite/gcc.dg/pr50764.c
===================================================================
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr50764.c (revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsched2-use-superblocks -ftree-tail-merge" } */
+
+typedef int aligned __attribute__ ((aligned (64)));
+extern void abort (void);
+
+int bar (void *p);
+
+void
+foo (void)
+{
+  char *p = __builtin_alloca (13);
+  aligned i;
+
+  if (bar (p) || bar (&i))
+    abort ();
+}

Reply via email to