On Mon, Jun 10, 2013 at 03:14:04PM -0400, Michael Meissner wrote: > On Fri, Jun 07, 2013 at 10:54:39AM +0930, Alan Modra wrote: > > I'd like to remove -mstrict-align for little-endian powerpc, because > > the assumption that mis-aligned accesses are massively slow isn't true > > for current powerpc processors. However I also don't want to break > > old machines, so probably should add -mstrict-align back for some set > > of cpus. Can anyone tell me the set? > > I don't know the set. I recall the original PPC's (604, 603, etc.) had the > strict alignment rule in little endian rule, and the 750 relaxed it for GPRs > (but not FPRs), but I don't know about later machines.
After some research and bothering people who ought to know, I came up with the following: Since modern powerpc processors that support little-endian (power6 and later) do so properly rather than via the low bit swizzling of older processors (603, 604, 750, 74xx) we don't take traps on all unaligned accesses. Even so, power6 and power7 still take more alignment traps in little-endian mode than big-endian, so it may be advantageous to leave -mstrict-align as the default for power7. Anton tells me that for power8, -mstrict-align is not necessary. It's not desirable since gcc easily loses track of alignment, for instance with -mstrict-align void foo (char *p, char *q) { __builtin_memcpy (p, q, 4); } generates lbz 7,0(4) lbz 8,1(4) lbz 10,2(4) lbz 9,3(4) stb 7,0(3) stb 8,1(3) stb 10,2(3) stb 9,3(3) blr whereas -mno-strict-align gives lwz 9,0(4) stw 9,0(3) blr This patch disables the default -mstrict-align on power8. Rather than testing values from enum processor_type, which is prone to error when adding a new processor, I chose one of the mask bits that happens to be set for power8, and is likely to be set for future new processors. I also take the opportunity to update PROCESSOR_DEFAULT64 to the value most likely appropriate for machines running ELFv2. Bootstrapped and regression tested powerpc64-linux. OK to apply? David, it also occurs to me that PROCESSOR_DEFAULT* isn't the best choice of macro name nowadays. The macro is only used to default -mtune, not -mcpu. If you think it a good idea, I'll s/PROCESSOR_DEFAULT/TUNE_DEFAULT/ throughout config/rs6000 as a followup commit to this one. * config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty. * config/rs6000/rs6000.c (rs6000_option_override_internal): Default to strict alignment on older processors when little-endian. * config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8 for ELFv2. Index: gcc/config/rs6000/linux64.h =================================================================== --- gcc/config/rs6000/linux64.h (revision 205002) +++ gcc/config/rs6000/linux64.h (working copy) @@ -71,7 +71,11 @@ #undef PROCESSOR_DEFAULT #define PROCESSOR_DEFAULT PROCESSOR_POWER7 #undef PROCESSOR_DEFAULT64 +#ifdef LINUX64_DEFAULT_ABI_ELFv2 +#define PROCESSOR_DEFAULT64 PROCESSOR_POWER8 +#else #define PROCESSOR_DEFAULT64 PROCESSOR_POWER7 +#endif /* We don't need to generate entries in .fixup, except when -mrelocatable or -mrelocatable-lib is given. */ Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 205002) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -3217,6 +3217,12 @@ } } + /* If little-endian, default to -mstrict-align on older processors. + Testing for htm matches power8 and later. */ + if (!BYTES_BIG_ENDIAN + && !(processor_target_table[tune_index].target_enable & OPTION_MASK_HTM)) + rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN; + /* Add some warnings for VSX. */ if (TARGET_VSX) { Index: gcc/config/rs6000/sysv4.h =================================================================== --- gcc/config/rs6000/sysv4.h (revision 205002) +++ gcc/config/rs6000/sysv4.h (working copy) @@ -538,12 +538,7 @@ #define CC1_ENDIAN_BIG_SPEC "" -#define CC1_ENDIAN_LITTLE_SPEC "\ -%{!mstrict-align: %{!mno-strict-align: \ - %{!mcall-i960-old: \ - -mstrict-align \ - } \ -}}" +#define CC1_ENDIAN_LITTLE_SPEC "" #define CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)" -- Alan Modra Australia Development Lab, IBM