Andrea Corallo writes:
> With this patch the first insn of the low loop overhead 'doloop_begin'
> is expanded by 'doloop_modify' in loop-doloop.c. The same does not
> happen with SMS.

That certainly works correctly, since in your first patch the doloop_begin pattern also has the "!flag_modulo_sched" condition.

> My understanding is that to have it working in that
> case too the machine dependent reorg pass should add it later. Am I
> correct on this?

IMHO, this is not needed in your case.

Currently, the list of platforms (actually, gcc/config subfolders) which have doloop_end is rather big: aarch64*, arc, arm*, bfin, c6x, ia64, pdp11, pru, rs6000, s390, sh, tilegx*, tilepro, v850 and xtensa. I marked three of them with a star: they actually have a fake pattern, which is applied only with SMS.

Reorg_loops from hw-doloop.c (see also https://gcc.gnu.org/ml/gcc-patches/2011-06/msg01593.html and https://gcc.gnu.org/ml/gcc-patches/2011-07/msg00133.html) is used only in arc, bfin, c6x, and xtensa. Certainly some other platforms may have additional loop reorg steps in their target-specific part (e.g. pru), but not all of them. And that reorg is actually needed regardless of whether SMS is on or off.

Actually, the question was: what goes wrong if you remove that "!flag_modulo_sched" condition from the three new patterns?

I actually went one step further: I removed the "!flag_modulo_sched" parts from your patch and ran the following simple test on the modified patch. I built two ARM cross-compilers and compared their regtest results: the first was built from clean trunk, the second with the patch. Both compilers were configured -with-march=armv8.1-m.main and had a modified common.opt to enable -fmodulo-sched and -fmodulo-sched-allow-regmoves by default. The regtest results are identical.

> Second version of the patch here addressing comments.

Thank you, now I see that the second patch resolves that aspect.

> SMS is disabled in tests not to break them when SMS does loop versioning.

And I'm not really sure about this.
First of all, there are a lot of scan-assembler-times tests which fail when the modulo scheduler is enabled; probably the same happens when some unrolling parameters are not at their defaults. It seems that any non-default optimization which creates more instruction copies can break a scan-assembler-times check. IMHO, it is not necessary to work around this in a few particular tests.

Second, I'm not sure how the dg-skip-if directive works. When one enables SMS by setting "Init(1)" directly in common.opt, this won't be caught, would it?

Roman