On Sun, May 27, 2018 at 01:25:25AM +0200, Allan Sandfeld Jensen wrote:
> On Sunday, 27 May 2018 00:05:32 CEST Segher Boessenkool wrote:
> > On Sat, May 26, 2018 at 11:32:29AM +0200, Allan Sandfeld Jensen wrote:
> > > I brought this subject up earlier, and was told to suggest it again for
> > > gcc 9, so I have attached the preliminary changes.
> > >
> > > My studies have shown that with generic x86-64 optimization it reduces
> > > binary size by around 0.5%, and when optimizing for x64 targets with
> > > SSE4 or better, it reduces binary size by 2-3% on average. The
> > > performance changes are negligible however*, and I haven't been able to
> > > detect changes in compile time big enough to penetrate general noise on
> > > my platform, but perhaps someone has a better setup for that?
> > >
> > > * I believe that is because it currently works best on non-optimized code,
> > > it is better at big basic blocks doing all kinds of things than tightly
> > > written inner loops.
> > >
> > > Anything else I should test or report?
> >
> > What does it do on other architectures?
>
> I believe NEON would do the same as SSE4, but I can do a check. For
> architectures without SIMD it essentially does nothing.

Sorry, I wasn't clear. What does it do to performance on other
architectures? Is it (almost) always a win (or neutral)? If not, it
doesn't belong in -O2, not for the generic options at least.

(We'll test it on Power soon, it's weekend now :-) ).


Segher