Ok, this fixes it. We'll output an ext...#0, which is little more than a MOV,
but that seems appropriate in the circumstance.
Regression tested check-gcc and check-g++ on aarch64-none-elf and
aarch64_be-none-elf.
Ok for trunk?
--Alan
Alan Lawrence wrote:
Yes, reproduced. Seems the mid-end doesn't elide no-op masks at -O0 after all...
Fix in progress, think it's almost (tho not quite) simply a bad assertion.
--Alan
Christophe Lyon wrote:
Hi Alan
This causes g++ to ICE on pr59378 test, for aarch64 targets:
http://cbuild.validation.linaro.org/build/cross-validation/gcc/211058/report-build-info.html
Can you check?
Thanks,
Christophe.
On 19 May 2014 14:53, Marcus Shawcroft <marcus.shawcr...@gmail.com> wrote:
On 23 April 2014 21:22, Alan Lawrence <alan.lawre...@arm.com> wrote:
2014-03-27 Alan Lawrence <alan.lawre...@arm.com>
* config/aarch64/aarch64-builtins.c
(aarch64_types_binopv_qualifiers,
TYPES_BINOPV): New static data.
* config/aarch64/aarch64-simd-builtins.def (im_lane_bound): New
builtin.
* config/aarch64/aarch64-simd.md (aarch64_ext,
aarch64_im_lane_boundsi):
New patterns.
* config/aarch64/aarch64.c (aarch64_expand_vec_perm_const_1): Match
patterns for EXT.
(aarch64_evpc_ext): New function.
* config/aarch64/iterators.md (UNSPEC_EXT): New enum element.
* config/aarch64/arm_neon.h (vext_f32, vext_f64, vext_p8, vext_p16,
vext_s8, vext_s16, vext_s32, vext_s64, vext_u8, vext_u16, vext_u32,
vext_u64, vextq_f32, vextq_f64, vextq_p8, vextq_p16, vextq_s8,
vextq_s16, vextq_s32, vextq_s64, vextq_u8, vextq_u16, vextq_u32,
vextq_u64): Replace __asm with __builtin_shuffle and
im_lane_boundsi.
OK /Marcus
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 898323820201c6a7e52b0224fbeffa2b263b3e39..36173edb3a7cd6818a511ab5bb81557bb65fa287 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8111,9 +8111,6 @@ aarch64_evpc_ext (struct expand_vec_perm_d *d)
return false;
}
- /* The mid-end handles masks that just return one of the input vectors. */
- gcc_assert (location != 0);
-
switch (d->vmode)
{
case V16QImode: gen = gen_aarch64_extv16qi; break;
@@ -8134,7 +8131,10 @@ aarch64_evpc_ext (struct expand_vec_perm_d *d)
if (d->testing_p)
return true;
- if (BYTES_BIG_ENDIAN)
+ /* The case where (location == 0) is a no-op for both big- and little-endian,
+ and is removed by the mid-end at optimization levels -O1 and higher. */
+
+ if (BYTES_BIG_ENDIAN && (location != 0))
{
/* After setup, we want the high elements of the first vector (stored
at the LSB end of the register), and the low elements of the second