On 10/09/2013 03:31 AM, Kirill Yukhin wrote:
> +  rtx op1 = operands[1];
> +  if (REG_P (op1))
> +    op1 = gen_rtx_REG (V16HImode, REGNO (op1));
> +  else
> +    op1 = gen_lowpart (V16HImode, op1);

The IF case is incorrect. You need to use gen_lowpart always.
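In other words, the preparation code can collapse to something like this
(an untested sketch; gen_lowpart handles the register case as well):

  rtx op1 = gen_lowpart (V16HImode, operands[1]);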
> +(define_insn "*avx512f_unpcklpd512"
> +  [(set (match_operand:V8DF 0 "register_operand" "=v,v")
> +        (vec_select:V8DF
> +          (vec_concat:V16DF
> +            (match_operand:V8DF 1 "nonimmediate_operand" "v,vm")
> +            (match_operand:V8DF 2 "nonimmediate_operand" "vm,1"))
> +          (parallel [(const_int 0) (const_int 8)
> +                     (const_int 2) (const_int 10)
> +                     (const_int 4) (const_int 12)
> +                     (const_int 6) (const_int 14)])))]
> +  "TARGET_AVX512F"
> +  "@
> +   vunpcklpd\t{%2, %1, %0|%0, %1, %2}
> +   vmovddup\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "sselog")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "V8DF")])

The second alternative only matches for v/m/1. While I imagine that it
doesn't really matter, it might be better to swap the two so that
vmovddup gets used for the v/v/1 case too.

Why do you have separate define_expand and define_insn for this pattern?

> +(define_insn "*avx512f_<code>v8div16qi2_store"
> +  [(set (match_operand:V16QI 0 "memory_operand" "=m")
> +        (vec_concat:V16QI
> +          (any_truncate:V8QI
> +            (match_operand:V8DI 1 "register_operand" "v"))
> +          (vec_select:V8QI
> +            (match_dup 0)
> +            (parallel [(const_int 8) (const_int 9)
> +                       (const_int 10) (const_int 11)
> +                       (const_int 12) (const_int 13)
> +                       (const_int 14) (const_int 15)]))))]
> +  "TARGET_AVX512F"
> +  "vpmov<trunsuffix>qb\t{%1, %0|%0, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "memory" "store")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "TI")])

Any pattern that uses a match_dup of an input like this must use "+" not
"=" to signal that the operand is in_out not output.

And is this really the best description for this insn? It's a store to
memory. Surely it's better to say that we store V8QI.

> +(define_expand "vec_widen_umult_even_v16si"
> +  [(set (match_operand:V8DI 0 "register_operand")
> +        (mult:V8DI
> +          (zero_extend:V8DI
> +            (vec_select:V8SI
> +              (match_operand:V16SI 1 "nonimmediate_operand")
> +              (parallel [(const_int 0) (const_int 2)
> +                         (const_int 4) (const_int 6)
> +                         (const_int 8) (const_int 10)
> +                         (const_int 12) (const_int 14)])))
> +          (zero_extend:V8DI
> +            (vec_select:V8SI
> +              (match_operand:V16SI 2 "nonimmediate_operand")
> +              (parallel [(const_int 0) (const_int 2)
> +                         (const_int 4) (const_int 6)
> +                         (const_int 8) (const_int 10)
> +                         (const_int 12) (const_int 14)])))))]
> +  "TARGET_AVX512F"
> +  "ix86_fixup_binary_operands_no_copy (MULT, V16SImode, operands);")

We don't need to use ix86_fixup_binary_operands for any three-operand
insn. That function is there to help SSE's two-operand insns. You can
drop the define_expand and just keep the define_insn.

> +(define_insn "*vec_widen_umult_even_v16si"
> +  [(set (match_operand:V8DI 0 "register_operand" "=v")
> +        (mult:V8DI
> +          (zero_extend:V8DI
> +            (vec_select:V8SI
> +              (match_operand:V16SI 1 "nonimmediate_operand" "v")
> +              (parallel [(const_int 0) (const_int 2)
> +                         (const_int 4) (const_int 6)
> +                         (const_int 8) (const_int 10)
> +                         (const_int 12) (const_int 14)])))
> +          (zero_extend:V8DI
> +            (vec_select:V8SI
> +              (match_operand:V16SI 2 "nonimmediate_operand" "vm")
> +              (parallel [(const_int 0) (const_int 2)
> +                         (const_int 4) (const_int 6)
> +                         (const_int 8) (const_int 10)
> +                         (const_int 12) (const_int 14)])))))]
> +  "TARGET_AVX512F && ix86_binary_operator_ok (MULT, V16SImode, operands)"

You want a "%v" for operand 1, to make the operands commutative.

> +(define_insn "*vec_widen_smult_even_v16si"
> +  [(set (match_operand:V8DI 0 "register_operand" "=x")
> +        (mult:V8DI

Similarly, plus errant "x"; a sketch with both constraint fixes applied
follows below.


r~
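A rough sketch of how the head of that last pattern would read with both
fixes applied ("v" instead of the errant "x" on operand 0, "%v" on
operand 1); the elided body is assumed to mirror the unsigned pattern,
with sign_extend in place of zero_extend:

  (define_insn "*vec_widen_smult_even_v16si"
    [(set (match_operand:V8DI 0 "register_operand" "=v")
          (mult:V8DI
            (sign_extend:V8DI
              (vec_select:V8SI
                (match_operand:V16SI 1 "nonimmediate_operand" "%v")
                ...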