Re: [PATCH 1/2] rs6000: tune cunroll for simple loops at O2

Jiufu Guo via Gcc-patches Thu, 21 May 2020 02:34:22 -0700

Jan Hubicka <hubi...@ucw.cz> writes:

>> Segher Boessenkool <seg...@kernel.crashing.org> writes:
>> 
>> > On Wed, May 20, 2020 at 12:30:30PM +0200, Richard Biener wrote:
>> >> I think this is the wrong way to approach this.  You're doing too many
>> >> things at once.  Try to fix the powerpc regression with the extra
>> >> flag_rtl_unroll_loops, that could be backported.  Then you can
>> 
>> Or flag_complete_unroll_loops (-fcomplete-unroll-loops) for GIMPLE
>> cunroll?
>> >> independently see whether enabling more unrolling at -O2 makes
>> >> sense.  Because currently we _do_ unroll at -O2 when it does
>> >> not increase size.  It's just that your patches make this as
>> >> aggressive as -O3.
>> 
>> I'm also thinking about enabling more cunroll at -O2, even with some size
>> increase.  Fully enabling cunroll would make it like -O3.  As discussed
>> in some PRs (e.g. PR88760), unrolling small/simple loops may benefit
>> some platforms (but not all platforms, like x86_64?).  This would
>> require a target-specific hook.  Or we could use a generic setting:
>> allow unrolling/peeling up to a limited number of times if the loop
>> body is simple and small, together with a target-specific hook.
>
> We now have --params that can be tuned differently for -O2 and -O3, so
> looking into cunroll was one of my todo items for GCC 10 -O2 retuning, but
> I did not get any very conclusive benchmark results outside SPEC.
> I planned to return to it next stage1, so it may be a good time.
> Do you have any benchmarks on ppc?


541.leela_r, 548.exchange2_r and 557.xz_r from SPEC2017 are visibly
affected by cunroll.  They can be used to tune cunroll, I think.
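
To make the kind of loop concrete, here is a minimal sketch (the function
names are made up for illustration and are not taken from the patch or from
SPEC): a loop with a small constant trip count and a simple body, next to
the straight-line form that complete unrolling would produce.  Whether GCC
actually unrolls the first form at -O2 depends on the size limits under
discussion, so treat this as an assumption, not a statement about current
behavior.

```c
/* Hypothetical example; names are illustrative only. */

/* Small loop with a constant trip count of 4 and a simple body:
   the kind of candidate for complete unrolling (cunroll). */
int sum4_loop(const int *a)
{
    int s = 0;
    for (int i = 0; i < 4; i++)
        s += a[i];
    return s;
}

/* Straight-line equivalent, i.e. what the loop looks like once it is
   completely unrolled: no induction variable and no branch. */
int sum4_unrolled(const int *a)
{
    return a[0] + a[1] + a[2] + a[3];
}
```

The unrolled form is larger in code size, which is the trade-off at -O2.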

> Of course there is no need to keep the same defaults for all targets, but
> in general having target-specific defaults increases the number of knobs
> we need to check and keep up to date.

Thanks,
Jiufu

>
> Honza

>> 
>> Any comments? Thanks!
>> Jiufu
>> 
>> >
>> > Just do a separate flag (and option) for cunroll, instead?
>> >
>> > The RTL unroller is *the* unroller, and has been since forever.
>> >
>> >
>> > Segher
