Tamar Christina <tamar.christ...@arm.com> writes:
> Hi,
>
> As discussed off-line, this can only happen with a V1 mode, so here's a much
> simpler patch.
>
> Bootstrapped and regtested on aarch64-none-linux-gnu and
> x86_64-pc-linux-gnu with no regressions.
>
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>         PR rtl-optimization/103404
>         * cse.c (find_sets_in_insn): Don't select elements out of a V1 mode
>         subreg.
>
> gcc/testsuite/ChangeLog:
>
>         PR rtl-optimization/103404
>         * gcc.target/i386/pr103404.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/cse.c b/gcc/cse.c
> index c1c7d0ca27b73c4b944b4719f95fece74e0358d5..dc5d5aed047c7776f44b159a4286390d6499c18d 100644
> --- a/gcc/cse.c
> +++ b/gcc/cse.c
> @@ -4275,7 +4275,12 @@ find_sets_in_insn (rtx_insn *insn, vec<struct set> *psets)
>        else if (GET_CODE (SET_SRC (x)) == CALL)
>         ;
>        else if (GET_CODE (SET_SRC (x)) == CONST_VECTOR
> -              && GET_MODE_CLASS (GET_MODE (SET_SRC (x))) != MODE_VECTOR_BOOL)
> +              && GET_MODE_CLASS (GET_MODE (SET_SRC (x))) != MODE_VECTOR_BOOL
> +              /* Prevent duplicates from being generated if the type is a V1
> +                 type and a subreg.  Folding this will result in the same
> +                 element as folding x itself.  */
> +              && !(SUBREG_P (SET_DEST (x))
> +                   && known_eq (GET_MODE_NUNITS (GET_MODE (SET_SRC (x))), 1)))
>         {
>           /* First register the vector itself.  */
>           add_to_set (psets, x);
> diff --git a/gcc/testsuite/gcc.target/i386/pr103404.c b/gcc/testsuite/gcc.target/i386/pr103404.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..66f33645301db09503fc0977fd0f061a19e56ea5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr103404.c
> @@ -0,0 +1,32 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Og -fcse-follow-jumps -fno-dce -fno-early-inlining -fgcse -fharden-conditional-branches -frerun-cse-after-loop -fno-tree-ccp -mavx5124fmaps -std=c99 -w" } */
> +
> +typedef unsigned __attribute__((__vector_size__ (4))) U;
> +typedef unsigned __attribute__((__vector_size__ (16))) V;
> +typedef unsigned __attribute__((__vector_size__ (64))) W;
> +
> +int x, y;
> +
> +V v;
> +W w;
> +
> +inline
> +int bar (U a)
> +{
> +  a |= x;
> +  W k =
> +    __builtin_shufflevector (v, 5 / a,
> +                            2, 4, 0, 2, 4, 1, 0, 1,
> +                            1, 2, 1, 3, 0, 4, 4, 0);
> +  w = k;
> +  y = 0;
> +}
> +
> +int
> +foo ()
> +{
> +  bar ((U){0xffffffff});
> +  for (unsigned i; i < sizeof (foo);)
> +    ;
> +}
> +
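
For anyone reading the cse.c hunk in isolation, the new condition can be
modelled outside the compiler. The sketch below is a toy in plain C, not GCC
code: struct toy_set and toy_should_register_elements are hypothetical names,
and each field only stands in for the RTL predicate named in its comment. It
assumes nothing beyond what the hunk itself tests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the properties of a SET that the patched
   condition in find_sets_in_insn inspects.  Not a GCC data structure.  */
struct toy_set
{
  bool src_is_const_vector;  /* GET_CODE (SET_SRC (x)) == CONST_VECTOR */
  bool src_is_bool_vector;   /* mode class of the source is MODE_VECTOR_BOOL */
  bool dest_is_subreg;       /* SUBREG_P (SET_DEST (x)) */
  unsigned src_nunits;       /* GET_MODE_NUNITS of the source mode */
};

/* Return true when the toy model would take the branch that registers the
   vector and then its elements.  The last clause mirrors the guard added by
   the patch: a single-element vector stored through a subreg is skipped,
   because its only element would duplicate the set for x itself.  */
static bool
toy_should_register_elements (const struct toy_set *s)
{
  return s->src_is_const_vector
         && !s->src_is_bool_vector
         && !(s->dest_is_subreg && s->src_nunits == 1);
}
```

In this model a V1 constant written through a subreg is rejected, while a V1
constant written to a plain register, or a wider vector written through a
subreg, still takes the element-wise path as before.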
