On Thu, 17 Nov 2016, Martin Reinecke wrote:

> Hi,
>
> At some point in May 2016 there was a patch to the gcc trunk which
> caused one of my numerical codes to give incorrect results when compiled
> with this gcc version. This may of course be caused by some undefined
> behavior I'm unknowingly invoking in the code, or it may be a code
> generation bug in gcc. I tried to isolate the exact gcc commit that
> caused the change, but I got stuck...
>
> Starting from SVN revision 236320:
>
> 2016-05-17  Richard Biener  <rguent...@suse.de>
>
>         PR tree-optimization/71132
>         * tree-loop-distribution.c (create_rdg_cd_edges): Pass in loop.
>         Only add control dependences for blocks in the loop.
>         (build_rdg): Adjust.
>         (generate_code_for_partition): Return whether loop should
>         be destroyed and delay that.
>         (distribute_loop): Likewise.
>         (pass_loop_distribution::execute): Record loops to be destroyed
>         and perform delayed destroying of loops.
>
> gcc crashes with an ICE when trying to compile the code.
> Compilation works again starting with revision 236361:
>
> 2016-05-18  Richard Biener  <rguent...@suse.de>
>
>         PR tree-optimization/71168
>         * tree-loop-distribution.c (distribute_loop): Move *destroy_p
>         initialization earlier.
>
>         * gcc.dg/torture/pr71168.c: New testcase.
>
> but then the compiled code produces incorrect results.
>
> I can provide an unreduced test case, but it is several thousand lines
> long and I don't have a clue how to reduce it yet. Judging from the
> ChangeLog entries I strongly suspect that the tree-loop-distribution
> changes caused the different behavior.
> Could someone please give me a hint which sort of loops are most likely
> to be compiled differently after the changes, so that I have a better
> idea where to look for the problem?

You could compare -fopt-info-optimized output, which for example says

testsuite/gcc.dg/tree-ssa/ldist-12.c:12:3: note: Loop 1 distributed: split to 2 loops and 0 library calls.

Note that if you do not explicitly enable -ftree-loop-distribution then
you only get (at -O3+) memcpy, memmove and memset recognition enabled.

Note that the above changes should not have changed code generation, so
the offending change was likely made between those two revisions. You
could bisect further by either applying the second change or reverting
the earlier change in between those revisions.

Suspicious is maybe r236356 (transforming x+x+x+x+x to 5*x with
-funsafe-math-optimizations).

Richard.
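
As a rough illustration of why rewriting repeated addition as a
multiplication can change floating-point results: repeated addition
rounds after every step, while a multiplication rounds once. The sketch
below is plain Python on IEEE doubles, not gcc output, and the helper
names are made up for the example. It uses ten terms rather than five
because that case is a well-known one where the two forms differ; for a
small term count the two forms can agree.

```python
def repeated_add(x, n):
    """Compute x + x + ... + x (n terms), rounding after each addition."""
    total = x
    for _ in range(n - 1):
        total = total + x
    return total


def multiplied(x, n):
    """Compute n * x with a single rounding."""
    return n * x


x = 0.1
added = repeated_add(x, 10)
scaled = multiplied(x, 10)

# The two results differ in the last bit for this input.
print(repr(added), repr(scaled), added == scaled)
```

Code that depends on the exact rounding of one of the two forms would
therefore see different results once such a rewrite is applied.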