Jiufu Guo <guoji...@linux.ibm.com> writes: > Segher Boessenkool <seg...@kernel.crashing.org> writes: > >> Hi! >> >> On Fri, Dec 20, 2019 at 02:34:05PM +0800, Jiufu Guo wrote: >>> Previously, limited unrolling was enabled at O2 for powerpc in r278034. At >>> that >>> time, -fweb and -frename-registers were not enabled together with >>> -funroll-loops >>> even for -O3. After that, we notice there are some performance degradation >>> on >>> SPEC2006fp which caused by without web and rnreg. >> >> And 2017 was fine on all tests. Annoying :-( >> >>> This patch enable -fweb >>> and -frename-registers for -O3 to align original behavior before r278034. >> >> Okay. Hopefully we can find a way to determine in what circumstances web >> and rnreg help instead of hurt, but until then, the old behaviour is >> certainly the safe choice. >> >>> --- a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c >>> +++ b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c >>> @@ -2,6 +2,7 @@ >>> /* Originator: Andrew Church <gcczi...@achurch.org> */ >>> /* { dg-do run } */ >>> /* { dg-require-effective-target untyped_assembly } */ >>> +/* { dg-additional-options "-fno-rename-registers" { target { powerpc*-*-* >>> } } } */ >> >> What is this for? What happens without it? > The reason of this fail is: -frename-registers does not work well with > __builtin_return/__builtin_apply which need to save and restore > registers which could not be renamed incorrectly. > > When this case runs with -O3, with this patch, -frename-registers is > enabled. Originally, -frename-registers is enabled with -funroll-loops > instead pure -O3. This change cause this case fail at -O3. >
To align with original behavior better, I updated the patch and attached it at the end of this mail. The updated patch also pass bootstrap and regtests. Is this patch ok for trunk? Thanks, Jiufu >> >> The rs6000/ parts are okay for trunk. Thanks! >> >> >> Segher gcc/ 2019-12-23 Jiufu Guo <guoji...@linux.ibm.com> * gcc/config/rs6000/rs6000.c (rs6000_option_override_internal): Enable -fweb and -frename-registers with -funroll-loops --- gcc/config/rs6000/rs6000.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 23b6d99..dfba6b4 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -4538,12 +4538,19 @@ rs6000_option_override_internal (bool global_init_p) param_sched_pressure_algorithm, SCHED_PRESSURE_MODEL); - /* Explicit -funroll-loops turns -munroll-only-small-loops off. */ - if (((global_options_set.x_flag_unroll_loops && flag_unroll_loops) + /* Explicit -funroll-loops turns -munroll-only-small-loops off, and + turns -fweb and -frename-registers on. */ + if ((global_options_set.x_flag_unroll_loops && flag_unroll_loops) || (global_options_set.x_flag_unroll_all_loops && flag_unroll_all_loops)) - && !global_options_set.x_unroll_only_small_loops) - unroll_only_small_loops = 0; + { + if (!global_options_set.x_unroll_only_small_loops) + unroll_only_small_loops = 0; + if (!global_options_set.x_flag_rename_registers) + flag_rename_registers = 1; + if (!global_options_set.x_flag_web) + flag_web = 1; + } /* If using typedef char *va_list, signal that __builtin_va_start (&ap, 0) can be optimized to -- 2.7.4