Re: Help with PR97872

Richard Biener Sun, 06 Dec 2020 23:32:08 -0800

On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:

> On Fri, 4 Dec 2020 at 17:18, Richard Biener <rguent...@suse.de> wrote:
> >
> > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> >
> > > On Thu, 3 Dec 2020 at 16:35, Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > >
> > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener <rguent...@suse.de> wrote:
> > > > > >
> > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > > For the test mentioned in PR, I was trying to see if we could do
> > > > > > > specialized expansion for vcond in target when operands are -1 
> > > > > > > and 0.
> > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > (reg:V8QI 117)
> > > > > > > (reg:V8QI 118)
> > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > >     (reg/v:V8QI 116 [ b ]))
> > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > >
> > > > > > > where r117 and r118 are set to vector constants -1 and 0 
> > > > > > > respectively.
> > > > > > > However, I am not sure if there's a way to check if the register 
> > > > > > > is
> > > > > > > constant during expansion time (since we don't have df analysis 
> > > > > > > yet) ?
> > > > > > >
> > > > > > > Alternatively, should we add a target hook that returns true if 
> > > > > > > the
> > > > > > > result of vector comparison is set to all-ones or all-zeros, and 
> > > > > > > then
> > > > > > > use this hook in gimple ISEL to effectively turn VEC_COND_EXPR 
> > > > > > > into nop ?
> > > > > >
> > > > > > Would everything match-up for a .VEC_CMP IFN producing a non-mask
> > > > > > vector type?  ISEL could special case the a ? -1 : 0 case this way.
> > > > > I think the vec_cmp pattern matches but it produces a masked vector 
> > > > > type.
> > > > > In the attached patch, I simply replaced:
> > > > > _1 = a < b
> > > > > x = _1 ? -1 : 0
> > > > > with
> > > > > x = view_convert_expr<_1>
> > > > >
> > > > > For the test-case, isel generates:
> > > > >   vector(8) <signed-boolean:8> _1;
> > > > >   vector(8) signed char _2;
> > > > >   uint8x8_t _5;
> > > > >
> > > > >   <bb 2> [local count: 1073741824]:
> > > > >   _1 = a_3(D) < b_4(D);
> > > > >   _2 = VIEW_CONVERT_EXPR<vector(8) signed char>(_1);
> > > > >   _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_2);
> > > > >   return _5;
> > > > >
> > > > > and results in desired code-gen:
> > > > > f1:
> > > > >         vcgt.s8 d0, d1, d0
> > > > >         bx      lr
> > > > >
> > > > > Altho I guess, we should remove the redundant conversions during isel 
> > > > > itself ?
> > > > > and result in:
> > > > > _1 = a_3(D) < b_4(D)
> > > > > _5 = VIEW_CONVERT_EXPR<uint8x8_t>(_1)
> > > > >
> > > > > (Patch is lightly tested with only vect.exp)
> > > >
> > > > +  /* For targets where result of comparison is all-ones or all-zeros,
> > > > +     a < b ? -1 : 0 can be reduced to a < b.  */
> > > > +
> > > > +  if (integer_minus_onep (op1) && integer_zerop (op2))
> > > > +    {
> > > >
> > > > So this really belongs here:
> > > >
> > > >           tree op0_type = TREE_TYPE (op0);
> > > >           tree op0a_type = TREE_TYPE (op0a);
> > > >
> > > > <---
> > > >
> > > >           if (used_vec_cond_exprs >= 2
> > > >               && (get_vcond_mask_icode (mode, TYPE_MODE (op0_type))
> > > >                   != CODE_FOR_nothing)
> > > >               && expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > >             {
> > > >
> > > >
> > > > +      gassign *def_stmt = dyn_cast<gassign *> (SSA_NAME_DEF_STMT 
> > > > (op0));
> > > > +      tree op0a = gimple_assign_rhs1 (def_stmt);
> > > > +      tree op0_type = TREE_TYPE (op0);
> > > > +      tree op0a_type = TREE_TYPE (op0a);
> > > > +      enum tree_code tcode = gimple_assign_rhs_code (def_stmt);
> > > > +
> > > > +      if (expand_vec_cmp_expr_p (op0a_type, op0_type, tcode))
> > > > +       {
> > > > +         tree conv_op = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs), 
> > > > op0);
> > > > +         gassign *new_stmt = gimple_build_assign (lhs, conv_op);
> > > > +         gsi_replace (gsi, new_stmt, true);
> > > >
> > > > and you need to verify that the mode of the lhs and the mode of op0
> > > > agree and that the target can actually expand_vec_cmp_expr_p
> > > Thanks for the suggestions, does the attached patch look OK ?
> > > Sorry, I am not sure how to check if target can actually expand vec_cmp ?
> > > I assume that since expand_vec_cmp_expr_p queries optab and if it gets
> > > a valid cmp icode, that
> > > should be sufficient ?
> >
> > Yes
> Hi Richard,
> I tested the patch, and it shows one regression for pr78102.c, because
> of extra pcmpeqq in code-gen for x != y on x86.
> For the test-case:
> __v2di
> baz (const __v2di x, const __v2di y)
> {
>   return x != y;
> }
> 
> Before patch:
> baz:
>         pcmpeqq %xmm1, %xmm0
>         pcmpeqd %xmm1, %xmm1
>         pandn   %xmm1, %xmm0
>         ret
> 
> After patch,
> Before ISEL:
>   vector(2) <signed-boolean:64> _1;
>   __v2di _4;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = x_2(D) != y_3(D);
>   _4 = VEC_COND_EXPR <_1, { -1, -1 }, { 0, 0 }>;
>   return _4;
> 
> After ISEL:
>   vector(2) <signed-boolean:64> _1;
>   __v2di _4;
> 
>   <bb 2> [local count: 1073741824]:
>   _1 = x_2(D) != y_3(D);
>   _4 = VIEW_CONVERT_EXPR<__v2di>(_1);
>   return _4;
> 
> which results in:
>         pcmpeqq %xmm1, %xmm0
>         pxor    %xmm1, %xmm1
>         pcmpeqq %xmm1, %xmm0
>         ret
> 
> IIUC, the new code-gen is essentially comparing two args for equality, and 
> then
> comparing the result against zero to invert it, so it looks correct ?
> I am not sure which of the above two sequences is better tho ?
> If the new code-gen is OK, would it be OK to adjust the test-case ?


In case pcmpeqq is double-issue the first variant might be faster while
the second variant has the advantage of the "free" pxor, but back-to-back
pcmpeqq might have an issue.

I think on GIMPLE the new code is preferable and adjustments are
target business.  I wouldn't be surprised if the x86 backend
special-cases vcond to {-1,-1}, {0,0} already to arrive at the first
variant.

Did you check how

a = x != y ? { -1, -1 } : {0, 0 };
b = x != y ? { 1, 2 } : { 3, 4 };

is handled before/after your patch?  That is, make the comparison
CSEd between two VEC_COND_EXPRs?

Thanks,
Richard.


> Thanks,
> Prathamesh
> >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Richard.
> > > >
> > > >
> > > >
> > > > > Thanks,
> > > > > Prathamesh
> > > > > >
> > > > > > > Thanks,
> > > > > > > Prathamesh
> > > > > > >
> > > > > >
> > > > > > --
> > > > > > Richard Biener <rguent...@suse.de>
> > > > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 
> > > > > > Nuernberg,
> > > > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguent...@suse.de>
> > > > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > > > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: Help with PR97872

Reply via email to