Hi, While trying to force some particular conditions out of the scheduler on a private branch, I was toying with some very aggressive macro-fusion patterns. In doing so, I hit on a latent bug which can cause miscompilation.
The exact conditions for the bug seem to be: We go in to macro fusion with a previous pattern which can be rewritten as a function of the current pattern. That is to say, we have something like: x21 = f00f x1 = load (x21 + 16) x21 = x21 + 8 Which could be rewritten as: x21 = f00f x21 = x21 + 8 x1 = load (x21 + 8) If we choose to fuse the instructions, we set the SCHED_GROUP_P flag on x21 = 21 + 8. recompute_todo_spec will potentially apply the rewriting above, and will apply it regardless of SCHED_GROUP_P. It also will not clear the flag, so we end up pulling the fused instruction to the head of the wrong scheduling group. After scheduling, this gives the code: x21 = x21 + 8 x21 = f00f x1 = load (x21 + 8) Which is clearly bogus. To show this concretely with scheduler dumps (with some additional debug information): ;; ====================================================== ;; -- basic block 8 from 339 to 344 -- after reload ;; ====================================================== ;; | insn | prio | ;; | 339 | 3 | x21=x19+low(`*.LANCHOR0') ca57_sx1|ca57_sx2 ;; | 341 | 2 | x21=x21+0x8 ca57_sx1|ca57_sx2 ;; | 342 | 1 | x1=[x21+0x10] ca57_load_model ;; | 204 | 0 | x21=x21+0x8 ca57_sx1|ca57_sx2 ;; | 344 | 0 | pc={(x1==0)?L4717:pc} ca57_bx ;; --------------- forward dependences: ------------ ;; --- Region Dependences --- b 8 bb 0 ;; insn code bb dep prio cost reservation ;; ---- ---- -- --- ---- ---- ----------- ;; 339 739 8 0 3 2 ca57_sx1|ca57_sx2 : 344 341m ;; 341 87 8 1 2 2 ca57_sx1|ca57_sx2 : 344 342m ;; 342 40 8 1 1 5 ca57_load_model : 344m 204 ;; + 204 87 8 1 0 2 ca57_sx1|ca57_sx2 : 344 ;; 344 17 8 4 0 0 ca57_bx : ;; Looking at: insn 204 ;; Can rewrite predecessor of :: 204 :: from: (mem/f/c:DI (plus:DI (reg/f:DI 21 x21 [1023]) (const_int 16 [0x10])) [3 MEM[(struct _Rb_tree_node_baseD.35435 * *)&_ZN12_GLOBAL__N_120section_workload_mapED.48605 + 16B]+0 S8 A64]) ;; TO (mem/f/c:DI (plus:DI (reg/f:DI 21 x21 [1023]) (const_int 8 [0x8])) [3 MEM[(struct _Rb_tree_node_baseD.35435 * *)&_ZN12_GLOBAL__N_120section_workload_mapED.48605 + 16B]+0 S8 A64]) ;; dependencies resolved: insn 204 ;; tick updated: insn 204 into ready ;; Ready list after queue_to_ready: 204:42:prio=0 339:39:prio=3 ;; Ready list after ready_sort: 204:42:prio=0 339:39:prio=3 ;; Ready list (t = 0): 204:42:prio=0 339:39:prio=3 ;; max_issue among 2 insns: 204:42:prio=0 339:39:prio=3 I'm fairly sure that it can't trigger on trunk with any of the patterns in the AArch64/i386 back ends, and I can't give a reduced testcase showing the bug (miscompilation of proprietary code :( ). My fix is simply to clear the SCHED_GROUP_P flag if we do a rewrite as above, this seems sensible, but it may not be - I don't know this code, and I might be pessimising (or worse, breaking) code which requires fusion between the re-ordered instructions. I've bootstrapped the patch on x86_64 and aarch64-none-linux-gnu with no issues. Does this look sensible? I'm fairly sure my analysis of the bug is sound, but I could imagine other possible fixes to the same issue. I guess the most obvious would be to not allow the rewriting at all for instructions with SCHED_GROUP_P set. Thanks, James --- 2015-02-06 James Greenhalgh <james.greenha...@arm.com> * haifa-sched.c (recompute_todo_spec): After applying a replacement and cancelling a dependency, also clear the SCHED_GROUP_P flag.
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 75d2421..730a8db 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -1293,6 +1293,10 @@ recompute_todo_spec (rtx_insn *next, bool for_backtrack) apply_replacement (dep, true); } DEP_STATUS (dep) |= DEP_CANCELLED; + /* Also cancel any scheduler grouping, or we can + end up falsely claiming we are independent of + previous instructions. */ + SCHED_GROUP_P (next) = 0; } } return 0;