Hello Richard,
Thanks for the remarks, they all seem reasonable.
One question:
On 21 Oct 16:01, Richard Henderson wrote:
> > +(define_insn "avx512f_moves<mode>_mask"
> > + [(set (match_operand:VF_128 0 "register_operand" "=v")
> > + (vec_merge:VF_128
> > + (vec_merge:VF_128
> > + (match_operand:VF_128 2 "register_operand" "v")
> > + (match_operand:VF_128 3 "vector_move_operand" "0C")
> > + (match_operand:<avx512fmaskmode> 4 "register_operand" "k"))
> > + (match_operand:VF_128 1 "register_operand" "v")
> > + (const_int 1)))]
> > + "TARGET_AVX512F"
> > + "vmov<ssescalarmodesuffix>\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}"
> > + [(set_attr "type" "ssemov")
> > + (set_attr "prefix" "evex")
> > + (set_attr "mode" "<sseinsnmode>")])
>
> Nested vec_merge? That seems... odd to say the least.
> How in the world does this get matched?
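This is the generic approach for all scalar `masked' instructions.
The reason is that we must keep the higher bits of the vector
(outer vec_merge: with (const_int 1) only element 0 comes from the
inner result, the rest comes from operand 1) and apply the single-bit
mask to that low element (inner vec_merge: operand 2 if the mask bit
is set, otherwise operand 3, i.e. the old destination or zero).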
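To make the intended semantics concrete, here is a rough C model of the
nested vec_merge for V4SF. It is only a sketch: the names vec_merge_model
and masked_scalar_move are made up for illustration and are not part of
the patch, and it assumes the usual RTL rule that bit N of the selector
picks element N from the first vector when set, otherwise from the second.

```c
#define NELT 4  /* V4SF: four floats in a 128-bit vector (assumed for the example) */

/* Hypothetical model of RTL vec_merge: if bit i of sel is set,
   element i is taken from a, otherwise from b.  */
static void vec_merge_model(float *out, const float *a, const float *b,
                            unsigned sel)
{
    for (int i = 0; i < NELT; i++)
        out[i] = ((sel >> i) & 1u) ? a[i] : b[i];
}

/* Hypothetical model of the whole pattern:
     (vec_merge (vec_merge op2 op3 k) op1 (const_int 1))
   op3 is either the old destination (constraint "0", merge-masking)
   or an all-zero vector (constraint "C", zero-masking).  */
static void masked_scalar_move(float *dst, const float *op1,
                               const float *op2, const float *op3,
                               unsigned k)
{
    float inner[NELT];

    /* Inner merge: apply the mask k element-wise to op2 vs. op3.  */
    vec_merge_model(inner, op2, op3, k);

    /* Outer merge: selector 1 keeps only element 0 of the inner result;
       elements 1..NELT-1 are copied from op1.  */
    vec_merge_model(dst, inner, op1, 1u);
}
```

Since the outer selector is 1, only bit 0 of k can influence the result:
element 0 is op2[0] when that bit is set and op3[0] otherwise, while the
upper elements always come from op1.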
We could do it with unspecs instead, but is that really better?
What do you think?
--
Thanks, K