https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93946
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to sandra from comment #9)
> Both the new test cases are failing on nios2 at -Os, -O2, and -O3.  I've
> done some analysis, but I'm not sure exactly where the problem lies, and
> whether this is a problem in the nios2 back end or somewhere else.
>
> long __attribute__((noipa))
> foo (struct bb *bv, void *ptr)
> {
>   struct aa *a = ptr;
>   struct bb *b = ptr;
>   bv->b.u.f = 1;
>   a->a.u.i = 0;
>   b->b.u.f = 0;
>   return bv->b.u.f;
> }
>
> is compiling to
>
> foo:
>         movi    r2, 1
>         stw     r2, 0(r4)
>         ldw     r2, 0(r4)
>         stw     zero, 0(r5)
>         stw     zero, 4(r5)
>         ret
>
> What's going on here is that load instructions have 3-cycle latency on
> nios2, so the sched2 pass is moving the "ldw r2, 0(r4)" to load the return
> value 2 instructions earlier.... ahead of the store instruction to the same
> location via the aliased pointer.  :-(
>
> I'm not an expert on the instruction scheduler, and it seems like the target
> hooks and machine description syntax are all focused on modelling pipeline
> costs in order to minimize stalls, not telling the scheduler that certain
> instructions cannot be correctly reordered at all.  Should some other pass
> be inserting optimization barriers, or something like that?  I feel like I'm
> missing some big-picture expertise of where this needs to be fixed, so any
> suggestions to point me in the right direction would be appreciated.

The instruction scheduler needs to check dependences which should
correctly model the constraints here already.  So you need to start
looking at the RTL before sched2 to see if that's sane (compare to say
x86 RTL and look for obvious signs of errors - some target expanders
made errors here in the past).  Then debug sched2 as to what
true/anti/output_dependence tests it performs and why those tell
sched2 moving is OK.

As you can see the fix involved touching many passes so no doubt there
can be more issues elsewhere ...
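
For anyone wanting to try the quoted function outside the testsuite, here is a minimal self-contained sketch. The quoted snippet does not show the type definitions, so `union U`, the `struct a`/`struct b` wrappers, the `union C` container, the `NOIPA` macro and the `run_reproducer` helper are all assumptions, picked only to match the field accesses and the two-word store visible in the nios2 assembly. They are not taken from the comment.

```c
/* Assumed type layout; the quoted comment does not show these definitions. */
union U { long long i; long f; };
struct a { union U u; };
struct aa { struct a a; };
struct b { union U u; };
struct bb { struct b b; };

/* noipa is a GCC attribute; fall back to noinline on other compilers
   (hypothetical convenience macro, not part of the original test). */
#if defined(__GNUC__) && !defined(__clang__)
# define NOIPA __attribute__((noipa))
#else
# define NOIPA __attribute__((noinline))
#endif

/* The function from the quoted comment, unchanged apart from the macro. */
long NOIPA
foo (struct bb *bv, void *ptr)
{
  struct aa *a = ptr;
  struct bb *b = ptr;
  bv->b.u.f = 1;
  a->a.u.i = 0;
  b->b.u.f = 0;
  return bv->b.u.f;
}

/* Hypothetical driver: both pointers refer to the same storage, so the
   final load must observe the later stores.  Returns 1 if foo returned 0. */
int
run_reproducer (void)
{
  union C { struct aa aa; struct bb bb; } v;
  return foo (&v.bb, &v) == 0;
}
```

Building this with -fdump-rtl-all on both nios2 and x86 gives per-pass RTL dumps, so the dump from the pass just before sched2 can be compared between the two targets as suggested above.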