On Fri, Oct 26, 2018 at 9:37 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Fri, Oct 26, 2018 at 9:35 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Fri, Oct 26, 2018 at 9:19 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > On 10/25/18, Uros Bizjak <ubiz...@gmail.com> wrote:
> > > > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> > > >>
> > > >> Many x86 pmovzx/pmovsx instructions with memory operands are modeled in
> > > >> a wrong way.  For example:
> > > >>
> > > >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >>     (any_extend:V8HI
> > > >>       (vec_select:V8QI
> > > >>         (match_operand:V16QI 1 "nonimmediate_operand" "Yrm,*xm,vm")
> > > >>         (parallel [(const_int 0) (const_int 1)
> > > >>                (const_int 2) (const_int 3)
> > > >>                (const_int 4) (const_int 5)
> > > >>                (const_int 6) (const_int 7)]))))]
> > > >>
> > > >> should be defind for memory operands as:
> > > >>
> > > >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >>     (any_extend:V8HI
> > > >>       (match_operand:V8QI "memory_operand" "m,m,m")))]
> > > >>
> > > >> This set of patches updates them to
> > > >>
> > > >> (define_insn "sse4_1_<code>v8qiv8hi2<mask_name>"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >>     (any_extend:V8HI
> > > >>       (vec_select:V8QI
> > > >>         (match_operand:V16QI 1 "nonimmediate_operand" "Yr,*x,v")
> > > >>         (parallel [(const_int 0) (const_int 1)
> > > >>                (const_int 2) (const_int 3)
> > > >>                (const_int 4) (const_int 5)
> > > >>                (const_int 6) (const_int 7)]))))]
> > > >>
> > > >> (define_insn "*sse4_1_<code>v8qiv8hi2<mask_name>_1"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >>     (any_extend:V8HI
> > > >>       (match_operand:V8QI "subreg_memory_operand" "m,m,m")))]
> > > >>
> > > >> with a splitter:
> > > >>
> > > >> (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> > > >>   [(set (match_operand:V8HI 0 "register_operand" "=Yr,*x,v")
> > > >
> > > > No constraints needed for pre-reload splitter.
> > > >
> > > >>         (any_extend:V8HI
> > > >>           (vec_select:V8QI
> > > >>             (subreg:V16QI
> > > >>               (vec_concat:V2DI
> > > >>                 (match_operand:DI 1 "memory_operand" "m,*m,m")
> > > >>                 (const_int 0)) 0)
> > > >>             (parallel [(const_int 0) (const_int 1)
> > > >>                        (const_int 2) (const_int 3)
> > > >>                        (const_int 4) (const_int 5)
> > > >>                        (const_int 6) (const_int 7)]))))]
> > > >>   "TARGET_SSE4_1 && <mask_avx512bw_condition> &&
> > > >> <mask_avx512vl_condition>"
> > > >>   "#"
> > > >>   "&& can_create_pseudo_p ()"
> > > >>   [(set (match_dup 0) (match_dup 1))]
> > > >
> > > >  [(set (match_dup 0)
> > > >       (any_extend:V8HI (match_dup 1)))]
> > > >
> > > >> {
> > > >>   operands[1] = gen_rtx_<CODE> (V8HImode,
> > > >>                                 gen_rtx_SUBREG (V8QImode,
> > > >>                                                 operands[1], 0));
> > > >> })
> > > >
> > > > Don't create subregs of memory. Use adjust_address_nv.
> > >
> > > Here is the updated patch.
> >
> > > with a splitter:
> > >
> > > (define_insn_and_split "*sse4_1_<code>v8qiv8hi2<mask_name>_2"
> > >  [(set (match_operand:V8HI 0 "register_operand")
> > >        (any_extend:V8HI
> > >          (vec_select:V8QI
> > >            (subreg:V16QI
> > >              (vec_concat:V2DI
> > >                (match_operand:DI 1 "memory_operand")
> > >                (const_int 0)) 0)
> > >            (parallel [(const_int 0) (const_int 1)
> > >                       (const_int 2) (const_int 3)
> > >                       (const_int 4) (const_int 5)
> > >                       (const_int 6) (const_int 7)]))))]
> > >  "TARGET_SSE4_1 && <mask_avx512bw_condition> && <mask_avx512vl_condition>"
> > >  "#"
> > >  "&& can_create_pseudo_p ()"
> >
> > "can_create_pseudo_p ()" should go to the insn constraint and "&& 1"
> > should be used for split constraint. Both, insn and splitter are valid
> > only before reload.
> >
> > >  [(set (match_dup 0)
> > >        (any_extend:V8HI (match_dup 1)))]
> > > {
> > >  operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
> > > })
> >
> > Please use double quotes for one-line preparation statement.
> >
> > > (any_extend:V4SI
> > >   (match_operand:V4HI 1 "memory_operand" "m,*m,m")))]
> >
> > Please remove star in front of memory constraint.
> >
> > OK with the above changes.
>
> Oh, and you should remove "q" and "k" operand modifiers in all old patterns.

Well, the new ones, obviously.

Uros.

Reply via email to