Richard Guenther <richard.guent...@gmail.com> writes:
> On Thu, Mar 24, 2011 at 11:57 AM, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>> Chung-Lin Tang <clt...@codesourcery.com> writes:
>>> PR48183 is a case where ARM NEON instrinsics, under -O -g, produce debug
>>> insns that tries to expand OImode (32-byte integer) zero constants, much
>>> too large to represent as two HOST_WIDE_INTs; as the internals manual
>>> indicates, such large constants are not supported in general, and ICEs
>>> on the GET_MODE_BITSIZE(mode) == 2*HOST_BITS_PER_WIDE_INT assertion.
>>>
>>> This patch allows the cases where the large integer constant is still
>>> representable using a single CONST_INT, such as zero(0). Bootstrapped
>>> and tested on i686 and x86_64, cross-tested on ARM, all without
>>> regressions. Okay for trunk?
>>>
>>> Thanks,
>>> Chung-Lin
>>>
>>> 2011-03-20  Chung-Lin Tang  <clt...@codesourcery.com>
>>>
>>>       * emit-rtl.c (immed_double_const): Allow wider than
>>>       2*HOST_BITS_PER_WIDE_INT mode constants when they are
>>>       representable as a single const_int RTX.
>>
>> I realise this might be seen as a good expedient fix, but it makes
>> me a bit uneasy.  Not a very constructive rationale, sorry.
>>
>> For this particular case, the problem is that vst2q_s32 and the
>> like initialise a union directly:
>>
>>  union { int32x4x2_t __i; __builtin_neon_oi __o; } __bu = { __b; };
>>
>> and this gets translated into a zeroing of the whole union followed
>> by an assignment to __i:
>>
>>  __bu = {};
>>  __bu.__i = __b;
>
> Btw, this looks like a missed optimization in gimplification.  Worth
> a bugreport (or even a fix).  Might be a target but as well, dependent
> on how __builtin_neon_oi looks like.  Do you have a complete testcase
> that reproduces the above with a cross?

Yeah, build cc1 for arm-linux-gnueabi and compile the attached
testcase (from Chung-Lin) using:

  -O2 -g -mfpu=neon -mfloat-abi=softfp

Rchard

/* { dg-do compile } */
/* { dg-require-effective-target arm_neon_ok } */
/* { dg-options "-O -g" } */
/* { dg-add-options arm_neon } */

#include <arm_neon.h>

void move_16bit_to_32bit (int32_t *dst, const short *src, unsigned n)
{
    unsigned i;
    int16x4x2_t input;
    int32x4x2_t mid;
    int32x4x2_t output;

    for (i = 0; i < n/2; i += 8) {
        input = vld2_s16(src + i);
        mid.val[0] = vmovl_s16(input.val[0]);
        mid.val[1] = vmovl_s16(input.val[1]);
        output.val[0] = vshlq_n_s32(mid.val[0], 8);
        output.val[1] = vshlq_n_s32(mid.val[1], 8);
        vst2q_s32((int32_t *)dst + i, output);
    }
}

Reply via email to