On Thu, Apr 6, 2017 at 10:40 AM, Jakub Jelinek <ja...@redhat.com> wrote: > On Thu, Apr 06, 2017 at 09:33:58AM +0200, Uros Bizjak wrote: >> Newly introduced alternatives (x/x) and (v/v) are valid also for >> 32-bit targets, so we have to adjust insn constraint of >> *vec_extractv4si_0_zext and enable alternatives accordingly. After the > > That is true. But if we provide just the x/x and v/v alternatives in > *vec_extractv4si_0_zext, then it will be forced to always do the zero > extraction on the SSE registers in 32-bit mode. Is that what we want?
Yes, for SSE4 targets. We are sure that we have SSE source register here, and there is no direct zero-extension to a general reg in 32-bit case. > As for the define_insn_and_split that would transform sign extensions > used solely by the vector shifts by scalar shift count, did you mean > something like following (for every shift pattern)? > > --- sse.md.jj1 2017-04-04 19:51:01.000000000 +0200 > +++ sse.md 2017-04-06 10:26:26.877545109 +0200 > @@ -10696,6 +10696,22 @@ > (set_attr "prefix" "orig,vex") > (set_attr "mode" "<sseinsnmode>")]) > > +(define_insn_and_split "*<shift_insn><mode>3<mask_name>_1" > + [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand") > + (any_lshift:VI2_AVX2_AVX512BW > + (match_operand:VI2_AVX2_AVX512BW 1 "register_operand") > + (sign_extend:DI (match_operand:SI 2 "nonmemory_operand"))))] > + "TARGET_SSE2 && <mask_mode512bit_condition> && <mask_avx512bw_condition> > + && can_create_pseudo_p ()" > + "#" > + "&& 1" > + [(set (match_dup 3) (zero_extend:DI (match_dup 2))) > + (set (match_dup 0) (any_lshift:VI2_AVX2_AVX512BW > + (match_dup 1) (match_dup 3)))] > +{ > + operands[3] = gen_reg_rtx (DImode); > +}) > Yes, something like this. You ca use any_extend instead of sign_extend, so the pattern will also remove possible zero_extend of count operand. > (define_insn "<shift_insn><mode>3<mask_name>" > [(set (match_operand:VI48_AVX2 0 "register_operand" "=x,x,v") > (any_lshift:VI48_AVX2 > > The problem with that is that apparently our infrastructure doesn't support > define_subst for define_insn_and_split (and define_split), so either we'd > need to have separate define_insn_and_split for masked and for non-masked, > or we'd need to extend the define_subst infrastructure for > define_insn_and_split somehow. Looking say at > (define_subst "mask" > [(set (match_operand:SUBST_V 0) > (match_operand:SUBST_V 1))] > "TARGET_AVX512F" > [(set (match_dup 0) > (vec_merge:SUBST_V > (match_dup 1) > (match_operand:SUBST_V 2 "vector_move_operand" "0C") > (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk")))]) > that is a transformation we want to do on the define_insn part of > define_insn_and_split, but not exactly what we want to do on the split > part of the insn - there we want literaly match_dup 0, match_dup 1, > and instead of the 2 other match_operand match_dup 2 and match_dup 3. Hm, I'm not that versed in define_subst, but that looks quite a drawback of define_subst to me. Uros.