Richard Guenther <richard.guent...@gmail.com> writes:
> On Thu, Mar 24, 2011 at 11:57 AM, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>> Chung-Lin Tang <clt...@codesourcery.com> writes:
>>> PR48183 is a case where ARM NEON intrinsics, under -O -g, produce debug
>>> insns that try to expand OImode (32-byte integer) zero constants, much
>>> too large to represent as two HOST_WIDE_INTs; as the internals manual
>>> indicates, such large constants are not supported in general, and the
>>> compiler ICEs on the
>>> GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>>>
>>> This patch allows the cases where the large integer constant is still
>>> representable using a single CONST_INT, such as zero(0). Bootstrapped
>>> and tested on i686 and x86_64, cross-tested on ARM, all without
>>> regressions. Okay for trunk?
>>>
>>> Thanks,
>>> Chung-Lin
>>>
>>> 2011-03-20  Chung-Lin Tang  <clt...@codesourcery.com>
>>>
>>>     * emit-rtl.c (immed_double_const): Allow wider than
>>>     2*HOST_BITS_PER_WIDE_INT mode constants when they are
>>>     representable as a single const_int RTX.
>>
>> I realise this might be seen as a good expedient fix, but it makes
>> me a bit uneasy. Not a very constructive rationale, sorry.
>>
>> For this particular case, the problem is that vst2q_s32 and the
>> like initialise a union directly:
>>
>>   union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b };
>>
>> and this gets translated into a zeroing of the whole union followed
>> by an assignment to __i:
>>
>>   __bu = {};
>>   __bu.__i = __b;
>
> Btw, this looks like a missed optimization in gimplification. Worth
> a bug report (or even a fix). Might be a target bug as well, depending
> on what __builtin_neon_oi looks like. Do you have a complete testcase
> that reproduces the above with a cross?

Yeah, build cc1 for arm-linux-gnueabi and compile the attached testcase
(from Chung-Lin) using:

  -O2 -g -mfpu=neon -mfloat-abi=softfp

Richard

/* { dg-do compile } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-O -g" } */
/* { dg-add-options arm_neon } */

#include <arm_neon.h>

void move_16bit_to_32bit (int32_t *dst, const short *src, unsigned n)
{
  unsigned i;
  int16x4x2_t input;
  int32x4x2_t mid;
  int32x4x2_t output;

  for (i = 0; i < n/2; i += 8) {
    input = vld2_s16(src + i);
    mid.val[0] = vmovl_s16(input.val[0]);
    mid.val[1] = vmovl_s16(input.val[1]);
    output.val[0] = vshlq_n_s32(mid.val[0], 8);
    output.val[1] = vshlq_n_s32(mid.val[1], 8);
    vst2q_s32((int32_t *)dst + i, output);
  }
}
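
For readers without an ARM cross compiler, the union initialisation discussed in the thread can be sketched in portable C. This is only an illustration under stated assumptions: pair_t and opaque_t are hypothetical stand-ins for int32x4x2_t and __builtin_neon_oi, and the helper names are made up for this sketch. It shows the source-level initialiser next to the "zero the whole union, then assign the member" form quoted above, and checks that both leave the same 32 bytes in the union. It says nothing about what RTL or debug insns GCC actually generates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real arm_neon.h types. */
typedef struct { int32_t val[2][4]; } pair_t;        /* role of int32x4x2_t */
typedef struct { unsigned char raw[32]; } opaque_t;  /* role of __builtin_neon_oi */

typedef union { pair_t i; opaque_t o; } bu_t;

/* Source-level form: initialise the union through its first member. */
static opaque_t via_initializer(pair_t b)
{
    bu_t bu = { b };
    return bu.o;
}

/* Two-step form described in the thread: zero everything, then assign. */
static opaque_t via_zero_then_assign(pair_t b)
{
    bu_t bu;
    memset(&bu, 0, sizeof bu);   /* corresponds to  __bu = {};       */
    bu.i = b;                    /* corresponds to  __bu.__i = __b;  */
    return bu.o;
}

/* Returns 1 when both forms produce byte-identical union contents. */
int lowering_matches_initializer(void)
{
    pair_t b = { { { 1, 2, 3, 4 }, { 5, 6, 7, 8 } } };
    opaque_t x = via_initializer(b);
    opaque_t y = via_zero_then_assign(b);

    /* Both members are 32 bytes here, so the assignment covers the union. */
    assert(sizeof(bu_t) == 32);
    return memcmp(&x, &y, sizeof x) == 0;
}
```

Because the assigned member covers the whole union in this sketch, the initial zeroing is redundant work, which is the sense in which the thread calls the two-step form a missed optimization.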