On 2/16/19, H.J. Lu <hjl.to...@gmail.com> wrote:
> There is no V4HI pmulhrsw in AVX512BW and V4HI/V8HI pmulhrsw don't require
> AVX2.  But ssse3_pmulhrswv4hi3 requires MMX.
>
>   PR target/89372
>   * config/i386/sse.md (ssedoublemode): Remove V4HI.
>   (PMULHRSW): Likewise.
>   (<ssse3_avx2>_pmulhrsw<mode>3): Require TARGET_SSSE3, not
>   TARGET_AVX2.
>   (ssse3_pmulhrswv4hi3): New expander.
>   (*ssse3_pmulhrswv4hi3): Require TARGET_MMX.
> ---
>  gcc/config/i386/sse.md | 30 ++++++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 8281fe2d398..839e38c46f0 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -596,7 +596,7 @@
>    [(V4SF "V8SF") (V8SF "V16SF") (V16SF "V32SF")
>     (V2DF "V4DF") (V4DF "V8DF") (V8DF "V16DF")
>     (V16QI "V16HI") (V32QI "V32HI") (V64QI "V64HI")
> -   (V4HI "V4SI") (V8HI "V8SI") (V16HI "V16SI") (V32HI "V32SI")
> +   (V8HI "V8SI") (V16HI "V16SI") (V32HI "V32SI")
>     (V4SI "V4DI") (V8SI "V16SI") (V16SI "V32SI")
>     (V4DI "V8DI") (V8DI "V16DI")])
>
> @@ -15590,7 +15590,7 @@
>     (set_attr "mode" "DI")])
>
>  (define_mode_iterator PMULHRSW
> -  [V4HI V8HI (V16HI "TARGET_AVX2")])
> +  [V8HI (V16HI "TARGET_AVX2")])
>
>  (define_expand "<ssse3_avx2>_pmulhrsw<mode>3_mask"
>    [(set (match_operand:PMULHRSW 0 "register_operand")
> @@ -15629,7 +15629,7 @@
>          (const_int 14))
>        (match_dup 3))
>      (const_int 1))))]
> -  "TARGET_AVX2"
> +  "TARGET_SSSE3"
>  {
>    operands[3] = CONST1_RTX(<MODE>mode);
>    ix86_fixup_binary_operands_no_copy (MULT, <MODE>mode, operands);
> @@ -15662,6 +15662,26 @@
>     (set_attr "prefix" "orig,maybe_evex,evex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> +(define_expand "ssse3_pmulhrswv4hi3"
> +  [(set (match_operand:V4HI 0 "register_operand")
> +    (truncate:V4HI
> +      (lshiftrt:V4SI
> +        (plus:V4SI
> +          (lshiftrt:V4SI
> +            (mult:V4SI
> +              (sign_extend:V4SI
> +                (match_operand:V4HI 1 "nonimmediate_operand"))
> +              (sign_extend:V4SI
> +                (match_operand:V4HI 2 "nonimmediate_operand")))
> +            (const_int 14))
> +          (match_dup 3))
> +        (const_int 1))))]
> +  "TARGET_MMX && TARGET_SSSE3"

Currently, there is no need for TARGET_MMX constraint on mainline.

> +{
> +  operands[3] = CONST1_RTX(V4HImode);
> +  ix86_fixup_binary_operands_no_copy (MULT, V4HImode, operands);
> +})
> +
>  (define_insn "*ssse3_pmulhrswv4hi3"
>    [(set (match_operand:V4HI 0 "register_operand" "=y")
>      (truncate:V4HI
> @@ -15676,7 +15696,9 @@
>          (const_int 14))
>        (match_operand:V4HI 3 "const1_operand"))
>      (const_int 1))))]
> -  "TARGET_SSSE3 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
> +  "TARGET_MMX
> +   && TARGET_SSSE3
> +   && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
>    "pmulhrsw\t{%2, %0|%0, %2}"
>    [(set_attr "type" "sseimul")
>     (set_attr "prefix_extra" "1")

The above hunk is currently not needed.

OK for mainline without TARGET_MMX constraints.

Thanks,
Uros.
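
For readers following the RTL in the quoted expander, here is a minimal scalar sketch in C of what the pattern computes per 16-bit lane: sign-extend, multiply, logical shift right by 14, add 1, logical shift right by 1, truncate. The function names (pmulhrsw_lane, pmulhrsw_v4hi, pmulhrsw_self_check) are hypothetical and exist only for this illustration; nothing here is GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of the RTL pattern:
   truncate (lshiftrt (plus (lshiftrt (mult (sext a) (sext b)) 14) 1) 1).  */
static int16_t pmulhrsw_lane(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;  /* sign_extend + mult */
    uint32_t t = (uint32_t)prod >> 14;       /* lshiftrt by 14 (logical) */
    t += 1;                                  /* plus const1 (rounding bit) */
    t >>= 1;                                 /* lshiftrt by 1 */
    uint16_t u = (uint16_t)t;                /* truncate to 16 bits */
    /* Reinterpret the 16-bit pattern as signed without relying on
       implementation-defined narrowing conversions.  */
    return (int16_t)((int32_t)(u ^ 0x8000u) - 0x8000);
}

/* Hypothetical V4HI-shaped helper: apply the lane model to four elements.  */
static void pmulhrsw_v4hi(int16_t dst[4], const int16_t a[4], const int16_t b[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = pmulhrsw_lane(a[i], b[i]);
}

/* Small self-check using Q15 fixed-point examples.  */
static void pmulhrsw_self_check(void)
{
    const int16_t a[4] = { 16384, -16384, 1, 32767 };
    const int16_t b[4] = { 16384,  16384, 1, 32767 };
    int16_t r[4];

    pmulhrsw_v4hi(r, a, b);
    assert(r[0] == 8192);    /* 0.5 * 0.5 = 0.25 in Q15 */
    assert(r[1] == -8192);   /* -0.5 * 0.5 = -0.25 in Q15 */
    assert(r[2] == 0);       /* tiny product rounds to zero */
    assert(r[3] == 32766);
}
```

The model covers only the arithmetic of the pattern. It says nothing about the TARGET_MMX or TARGET_SSSE3 conditions discussed in this thread.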