https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
wilco at gcc dot gnu.org changed:
What       | Removed    | Added
CC         |            | wilco at gcc dot gnu.org
--- C
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
Bill Schmidt changed:
What       | Removed    | Added
CC         |            | wschmidt at gcc dot gnu.org
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
Richard Biener changed:
What       | Removed    | Added
CC         |            | matz at gcc dot gnu.org
--- Comment #9
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
--- Comment #8 from PeteVine ---
Seeing as unrolling does such a great job on aarch64, surpassing clang, should
we leave the ARM issue bunched together with this one?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
Richard Biener changed:
What       | Removed    | Added
Keywords   |            | missed-optimization
--- Comment #7 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
--- Comment #6 from PeteVine ---
The difference between clang and gcc is even greater on ARMv7 Cortex A5 but
there's no way to catch up through unrolling (no effect):
gcc version 7.0.1 20170225: 1227.2 Kpos/sec
clang 3.6:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
--- Comment #5 from PeteVine ---
Clang, however, gets no further improvement from -funroll-loops, meaning a simple
`-O3 -mcpu=cortex-a53` produces much better performance than gcc without
unrolling.
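
For readers unfamiliar with the transformation being discussed, the following is a minimal source-level sketch of what unrolling a loop by four looks like. The function names are hypothetical and are not taken from the benchmark in this report. The unroll factor of four is an assumption for illustration; the factor GCC actually picks with -funroll-loops depends on the target and its cost model.

```c
#include <stddef.h>

/* Plain loop: one addition per iteration, plus the loop-control overhead
 * (increment, compare, branch) on every element. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 (illustrative factor only): the loop-control overhead
 * is paid once per four elements. A remainder loop handles the last
 * n % 4 elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

Both functions return the same result for any input; whether the unrolled form is faster depends on the core, which is the point of the numbers reported in this thread.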
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
--- Comment #4 from PeteVine ---
It's a gcc version 7.0.1 20170220 (experimental) (GCC) configured with:
--enable-languages=c,c++,fortran --prefix=/usr/gcc7 --program-suffix=-7
--enable-shared --enable-linker-build-id --libexecdir=/usr/gcc7/lib
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79712
Andrew Pinski changed:
What       | Removed    | Added
Component  | middle-end | target
--- Comment #3 from Andrew Pinski