Hi,

The vec_unpacks_lo_[si,hi,di] patterns for scalar masks don't need to extend mask elements, so a simple register copy is good enough.

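To illustrate why a copy suffices, here is a minimal sketch in plain C that models a scalar mask as an integer with one bit per vector element. It is only a model under that assumption, not the code GCC emits, and the function names (mask_unpack_lo, mask_unpack_hi, mask_unpack_selfcheck) are hypothetical. The low half of a 16-bit mask is already sitting in the low 8 bits, so taking it is a truncating copy, while the high half needs a shift first.

```c
#include <assert.h>
#include <stdint.h>

/* Model: a 16-bit mask carries one bit per element of a 16-element
   vector.  Its low half is simply the low 8 bits, so "unpacking" it
   is a truncating copy; no per-element extension is involved.  */
static uint8_t mask_unpack_lo(uint16_t m)
{
    return (uint8_t)m;
}

/* The high half needs a shift before the same truncating copy.  */
static uint8_t mask_unpack_hi(uint16_t m)
{
    return (uint8_t)(m >> 8);
}

/* Small self-check of the model: the two halves together carry
   exactly the bits of the original mask.  */
static void mask_unpack_selfcheck(void)
{
    uint16_t m = 0xA5C3;
    uint16_t joined =
        (uint16_t)(((uint16_t)mask_unpack_hi(m) << 8) | mask_unpack_lo(m));
    assert(joined == m);
}
```

For example, mask_unpack_lo(0xA5C3) yields 0xC3 and mask_unpack_hi(0xA5C3) yields 0xA5.
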
Currently the vec_unpacks_lo_hi pattern uses the kmovb instruction, which requires an AVX512DQ target.  But conversion between 16-bit and 8-bit masks is typical for AVX512F code with a mix of integer (or float, or logical (kind=4) for Fortran) and double computations.  This patch implements vec_unpacks_lo_hi as kmovw instead, to make mask conversion available for AVX512F targets.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  Does it look OK for trunk?

Thanks,
Ilya
--
gcc/

2016-04-19  Ilya Enkovich  <ilya.enkov...@intel.com>

	* config/i386/sse.md (vec_unpacks_lo_hi): Always use kmovw to
	support AVX512F target.


diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4d2927e..c213ee1 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -13661,9 +13661,9 @@
   "ix86_expand_sse_unpack (operands[0], operands[1], true, false); DONE;")
 
 (define_expand "vec_unpacks_lo_hi"
-  [(set (match_operand:QI 0 "register_operand")
-        (subreg:QI (match_operand:HI 1 "register_operand") 0))]
-  "TARGET_AVX512DQ")
+  [(set (subreg:HI (match_operand:QI 0 "register_operand") 0)
+        (match_operand:HI 1 "register_operand"))]
+  "TARGET_AVX512F")
 
 (define_expand "vec_unpacks_lo_si"
   [(set (match_operand:HI 0 "register_operand")