https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87214

--- Comment #19 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
OK.  The .optimized dumps seem to be the same for both -mavx2 and
-march=skylake-avx512.  Things only diverge during expand.

It looks like it might be a bug in:

(define_insn "<mask_codefor>avx512dq_shuf_<shuffletype>64x2_1<mask_name>"
  [(set (match_operand:VI8F_256 0 "register_operand" "=v")
        (vec_select:VI8F_256
          (vec_concat:<ssedoublemode>
            (match_operand:VI8F_256 1 "register_operand" "v")
            (match_operand:VI8F_256 2 "nonimmediate_operand" "vm"))
          (parallel [(match_operand 3  "const_0_to_3_operand")
                     (match_operand 4  "const_0_to_3_operand")
                     (match_operand 5  "const_4_to_7_operand")
                     (match_operand 6  "const_4_to_7_operand")])))]
  "TARGET_AVX512VL
   && (INTVAL (operands[3]) == (INTVAL (operands[4]) - 1)
       && INTVAL (operands[5]) == (INTVAL (operands[6]) - 1))"
{
  int mask;
  mask = INTVAL (operands[3]) / 2;
  mask |= (INTVAL (operands[5]) - 4) / 2 << 1;
  operands[3] = GEN_INT (mask);
  return "vshuf<shuffletype>64x2\t{%3, %2, %1, %0<mask_operand7>|%0<mask_operand7>, %1, %2, %3}";
}
  [(set_attr "type" "sselog")
   (set_attr "length_immediate" "1")
   (set_attr "prefix" "evex")
   (set_attr "mode" "XI")])

which AFAICT requires, without checking, that operands 3 and 5 are even (0 or 2
and 4 or 6 respectively).  In this case we're using it to match:

(insn 40 39 41 6 (set (reg:V4DI 101 [ vect__5.17 ])
        (vec_select:V4DI (vec_concat:V8DI (reg:V4DI 98 [ vect__5.14 ])
                (reg:V4DI 140 [ vect__5.15 ]))
            (parallel [
                    (const_int 2 [0x2])
                    (const_int 3 [0x3])
                    (const_int 5 [0x5])
                    (const_int 6 [0x6])
                ]))) "/tmp/foo.c":8:22 4069 {*avx512dq_shuf_i64x2_1}
     (nil))

and, because (5 - 4) / 2 truncates to 0 when computing the immediate, we end
up treating the permute mask as {2, 3, 4, 5} instead.
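
To make the arithmetic concrete, here is a small standalone C sketch of what
seems to happen.  It is not GCC code: the helper names are made up for
illustration, pattern_matches() and encode_imm() simply mirror the insn
condition and output code quoted above, decode_imm() models the 256-bit
vshufi64x2 lane selection (bit 0 picks a 128-bit lane of operand 1, bit 1 a
lane of operand 2), and pattern_matches_strict() is only one possible shape
for a stricter condition, not necessarily the eventual fix.

```c
#include <assert.h>

/* Mirrors the insn condition quoted above: each pair of selectors must be
   consecutive, but nothing requires the first of each pair to be even.  */
static int pattern_matches(const int sel[4])
{
    return sel[0] == sel[1] - 1 && sel[2] == sel[3] - 1;
}

/* Mirrors the output code: build the 2-bit immediate from operands 3 and 5.
   Integer division silently drops an odd low bit.  */
static int encode_imm(const int sel[4])
{
    int mask;
    mask = sel[0] / 2;
    mask |= (sel[2] - 4) / 2 << 1;
    return mask;
}

/* Model of what the instruction selects for a given immediate, expressed as
   indices into the 8-element concatenation of operands 1 and 2.  */
static void decode_imm(int imm, int out[4])
{
    int lo = (imm & 1) * 2;             /* lane 0 or 1 of operand 1 */
    int hi = 4 + ((imm >> 1) & 1) * 2;  /* lane 0 or 1 of operand 2 */
    out[0] = lo;
    out[1] = lo + 1;
    out[2] = hi;
    out[3] = hi + 1;
}

/* Hypothetical stricter condition: additionally require that each pair
   starts on an even element, i.e. on a 128-bit lane boundary.  */
static int pattern_matches_strict(const int sel[4])
{
    return pattern_matches(sel) && (sel[0] & 1) == 0 && (sel[2] & 1) == 0;
}

/* Replays the selector from insn 40 above.  */
static void self_check(void)
{
    const int sel[4] = {2, 3, 5, 6};
    int out[4];

    assert(pattern_matches(sel));        /* the current condition accepts it */
    decode_imm(encode_imm(sel), out);
    assert(out[2] == 4 && out[3] == 5);  /* but 5,6 was requested */
    assert(!pattern_matches_strict(sel));
}
```

Calling self_check() replays the {2, 3, 5, 6} selector: the current condition
accepts it, the immediate comes out as 1, and decoding that immediate gives
{2, 3, 4, 5}, whereas the stricter check would refuse the match.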
