Re: [PATCH] i386: Enable small loop unrolling for O2

2022-10-28 Thread Richard Biener via Gcc-patches
On Fri, Oct 28, 2022 at 10:08 AM Hongyu Wang wrote:
>
> > Ugh, that's all quite ugly and unmaintainable, no?
> Agreed, I have the same feeling.
>
> > I'm quite sure that if this works it's not by intention. Doesn't this also disable
> > register renaming and web when the user explicitly spec

Re: [PATCH] i386: Enable small loop unrolling for O2

2022-10-28 Thread Hongyu Wang via Gcc-patches
> Ugh, that's all quite ugly and unmaintainable, no?
Agreed, I have the same feeling.
> I'm quite sure that if this works it's not by intention. Doesn't this also disable
> register renaming and web when the user explicitly specifies -funroll-loops?
>
> Doesn't this change -funroll-loops behav

Re: [PATCH] i386: Enable small loop unrolling for O2

2022-10-28 Thread Richard Biener via Gcc-patches
On Wed, Oct 26, 2022 at 7:53 AM Hongyu Wang wrote:
>
> Hi,
>
> Inspired by rs6000 and s390 port changes, this patch
> enables loop unrolling for small size loop at O2 by default.
> The default behavior is to unroll loop with unknown trip-count and
> less than 4 insns by 1 time.
>
> This improves 5

Re: [PATCH] i386: Enable small loop unrolling for O2

2022-10-26 Thread Hongyu Wang via Gcc-patches
> Does this setting benefit all targets? IIRC, in the past all
> benchmarks also enabled -funroll-loops, so it looks to me that
> unrolling small loops by default is a good compromise.
The idea of unrolling small loops can be explained in terms of the x86 micro-architecture. Modern x86 processors have multi

Re: [PATCH] i386: Enable small loop unrolling for O2

2022-10-25 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 26, 2022 at 7:53 AM Hongyu Wang wrote:
>
> Hi,
>
> Inspired by rs6000 and s390 port changes, this patch
> enables loop unrolling for small size loop at O2 by default.
> The default behavior is to unroll loop with unknown trip-count and
> less than 4 insns by 1 time.
>
> This improves 5

[PATCH] i386: Enable small loop unrolling for O2

2022-10-25 Thread Hongyu Wang via Gcc-patches
Hi,
Inspired by the rs6000 and s390 port changes, this patch enables unrolling of small loops at O2 by default.
The default behavior is to unroll loops with an unknown trip count and fewer than 4 insns by 1 time.
This improves 548.exchange2 by 3.5% on icelake and 6% on zen3 with 1.2% codesize in
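For readers unfamiliar with the terminology, "unrolling by 1 time" can be pictured as duplicating the loop body once, so each trip around the loop does the work of two original iterations. The following is only a hand-written source-level sketch of that idea. The function names `sum_rolled` and `sum_unrolled_once` are hypothetical and do not come from the patch, and the code GCC's RTL unroller actually emits for a loop with an unknown trip count will likely be structured differently.

```c
#include <stddef.h>

/* Original small loop: the trip count n is not known at compile time. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled once (illustration only, not compiler output):
   the body appears twice, so there is one back-edge per two elements. */
long sum_unrolled_once(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    /* Main loop: two copies of the original body per iteration. */
    for (; i + 1 < n; i += 2) {
        s += a[i];
        s += a[i + 1];
    }

    /* Remainder: at most one element is left over when n is odd. */
    if (i < n)
        s += a[i];

    return s;
}
```

Both functions should return the same result for any input; the unrolled form trades a little code size for fewer loop-control branches per element.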