Richard Biener <richard.guent...@gmail.com> writes: > On Wed, May 27, 2020 at 6:36 AM Jiufu Guo <guoji...@linux.ibm.com> wrote: >> >> Segher Boessenkool <seg...@kernel.crashing.org> writes: >> >> > Hi! >> > >> > On Tue, May 26, 2020 at 08:58:13AM +0200, Richard Biener wrote: >> >> On Mon, May 25, 2020 at 7:44 PM Segher Boessenkool >> >> <seg...@kernel.crashing.org> wrote: >> >> > Yes, cunroll does not have its own option, and that is a problem. But >> >> > that is easy to fix! Either with an option, or just with params (the >> >> > option wouldn't do more than set params anyway?) >> >> >> >> Well, given coming up with different names for essentially the same >> >> transform is going to be challenging how about sth like >> >> >> >> -funroll-loops={early,late,static,dynamic}[insert better names here] >> > >> > User interface is hard :-) I think luckily we don't need to change >> > anything there yet, just have an internal flag? >> > >> > But complete unrolling is something quite different, so it should have >> > its own flag anyway (at least internally). >> > >> >> note there's also -fpeel-loops which may match the transform >> >> done on GIMPLE better? >> > >> > Peeling and unrolling are the same thing, if doing complete unrolling >> > (or complete peeling), followed by DCE in both cases. Peeling is a >> > nicer name here I think, yeah. >> > >> >> I'm not sure which are the technically >> >> correct terms for unrollings that elide the loop (the backedge). >> > >> > I don't know a better term than "complete", I don't remember ever seeing >> > something else either. >> >> How about "Var(flag_cunroll_grow_size) EnabledBy(funroll-loops || >> funroll-all-loops || fpeel-loops)" Or flag_cunroll_allow_grow_size? >> >> And then using this flags as: >> unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size >> || optimize >= 3, true); >> >> And we do not need to enable this flag at -O2. > > Sure this works for me. Note I'd make funroll-loops enabled by > funroll-all-loops so you could simplify the above.
After some checking, I did not use 'EnabledBy(funroll-all-loops)' for funroll-loops, because of special behavior on -fno-unroll-all-loops. Command line -fno-unroll-all-loops turns off flag_unroll_loops, if flag_unroll_loops is enabled implicitly(not through command line). While the orignal logic is: only possitive flag_unroll_all_loops overrides flag_unroll_loops. "if (flag_unroll_all_loops) flag_unroll_loops = 1;" in process_options. Thanks. Jiufu > > Richard. > >> Thanks for all your helpful comments again! >> >> Jiufu >> >> > >> >> We're doing such kind of unrolling even if we cannot statically >> >> decide which of a set of possible exits we take (and internally >> >> call that peeling, if we can statically decide we call it complete >> >> unrolling). >> > >> > "Peeling" is placing some copies of the loop before the loop; >> > "unrolling" is placing a few copies of the loop inside the loop body. >> > Does that match usage here? >> > >> >> The RTL side OTOH only performs classical unrolling, >> >> preserving the backedge with various strategies for the >> >> remaining iterations. >> > >> > And if you do complete unrolling that way, the backedge can be removed, >> > since it can be shown never to be taken. >> > >> >> As said, for the regression on the 10 branch with ppc I'd add >> >> [a hidden] flag that controls the RTL unroller, also set by >> >> -funroll-loops and triggered by the ppc specific heuristics. >> > >> > But the problem is in cunroll? This is so backwards... Because some >> > other transform abuses the unroller flags, adding a second level flag >> > with the same meaning :-( It will work for fixing the regression, >> > sure, and it is slightly less code as well. >> > >> > >> > Segher