On Tue, 2017-12-12 at 14:12 -0600, Segher Boessenkool wrote:
> That looks better :-) Okay for trunk, thanks!

As we discussed on IRC before Christmas, I've simplified this to use
TARGET_EFFICIENT_UNALIGNED_VSX instead of checking explicitly for P8/P9
processors. Has the same effect but is cleaner/clearer.

Committed in 256112.

  Aaron

Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c	(revision 256110)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -73,7 +73,7 @@
      When optimize_size, avoid any significant code bloat; calling
      memset is about 4 instructions, so allow for one instruction to
      load zero and three to do clearing. */
-  if (TARGET_ALTIVEC && align >= 128)
+  if (TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
     clear_step = 16;
   else if (TARGET_POWERPC64 && (align >= 64 || !STRICT_ALIGNMENT))
     clear_step = 8;
@@ -90,7 +90,7 @@
       machine_mode mode = BLKmode;
       rtx dest;
 
-      if (bytes >= 16 && TARGET_ALTIVEC && align >= 128)
+      if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
         {
           clear_bytes = 16;
           mode = V4SImode;
@@ -1260,7 +1260,7 @@
 
       /* Altivec first, since it will be faster than a string move
          when it applies, and usually not significantly larger. */
-      if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
+      if (TARGET_ALTIVEC && bytes >= 16 && (TARGET_EFFICIENT_UNALIGNED_VSX || align >= 128))
         {
           move_bytes = 16;
           mode = V4SImode;

-- 
Aaron Sawdey, Ph.D. acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain