RE: [RFC] [Patch]: Try and vectorize with shift for mult expr with power 2 integer constant.

Kumar, Venkataramanan Wed, 05 Aug 2015 02:35:18 -0700

Hi Richard,

> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Wednesday, August 05, 2015 2:21 PM
> To: Kumar, Venkataramanan
> Cc: Jeff Law; Jakub Jelinek; gcc-patches@gcc.gnu.org
> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult expr with
> power 2 integer constant.
> 
> On Tue, Aug 4, 2015 at 6:49 PM, Kumar, Venkataramanan
> <venkataramanan.ku...@amd.com> wrote:
> > Hi Richard,
> >
> >
> >> -----Original Message-----
> >> From: Richard Biener [mailto:richard.guent...@gmail.com]
> >> Sent: Tuesday, August 04, 2015 4:07 PM
> >> To: Kumar, Venkataramanan
> >> Cc: Jeff Law; Jakub Jelinek; gcc-patches@gcc.gnu.org
> >> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult
> >> expr with power 2 integer constant.
> >>
> >> On Tue, Aug 4, 2015 at 10:52 AM, Kumar, Venkataramanan
> >> <venkataramanan.ku...@amd.com> wrote:
> >> > Hi Jeff,
> >> >
> >> >> -----Original Message-----
> >> >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> >> >> ow...@gcc.gnu.org] On Behalf Of Jeff Law
> >> >> Sent: Monday, August 03, 2015 11:42 PM
> >> >> To: Kumar, Venkataramanan; Jakub Jelinek
> >> >> Cc: Richard Beiner (richard.guent...@gmail.com);
> >> >> gcc-patches@gcc.gnu.org
> >> >> Subject: Re: [RFC] [Patch]: Try and vectorize with shift for mult
> >> >> expr with power 2 integer constant.
> >> >>
> >> >> On 08/02/2015 05:03 AM, Kumar, Venkataramanan wrote:
> >> >> > Hi Jakub,
> >> >> >
> >> >> > Thank you for reviewing the patch.
> >> >> >
> >> >> > I have incorporated your comments in the attached patch.
> >> >> Note Jakub is on PTO for the next 3 weeks.
> >> >
> >> >  Thank you for this information.
> >> >
> >> >>
> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> > vectorize_mults_via_shift.diff.txt
> >> >> >
> >> >> >
> >> >> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
> >> >> > b/gcc/testsuite/gcc.dg/vect/vect-mult-patterns.c
> >> >> Jakub would probably like more testcases :-)
> >> >>
> >> >> The most obvious thing to test would be other shift factors.
> >> >>
> >> >> A negative test to verify we don't try to turn a multiply by
> >> >> non-constant or multiply by a constant that is not a power of 2 into
> shifts.
> >> >
> >> > I have added negative test in the attached patch.
> >> >
> >> >
> >> >>
> >> >> [ Would it make sense, for example, to turn a multiply by 3 into a
> >> >> shift-add sequence?  As Jakub said, choose_mult_variant can be
> >> >> your friend. ]
> >> >
> >> > Yes I will do that in a follow up patch.
> >> >
> >> > The new change log becomes
> >> >
> >> > gcc/ChangeLog
> >> > 2015-08-04  Venkataramanan Kumar
> >> <venkataramanan.ku...@amd.com>
> >> >      * tree-vect-patterns.c (vect_recog_mult_pattern): New function
> >> > for
> >> vectorizing
> >> >         multiplication patterns.
> >> >      * tree-vectorizer.h: Adjust the number of patterns.
> >> >
> >> > gcc/testsuite/ChangeLog
> >> > 2015-08-04  Venkataramanan Kumar
> >> <venkataramanan.ku...@amd.com>
> >> >      * gcc.dg/vect/vect-mult-pattern-1.c: New
> >> >     * gcc.dg/vect/vect-mult-pattern-2.c: New
> >> >
> >> > Bootstrapped and reg tested on aarch64-unknown-linux-gnu.
> >> >
> >> > Ok for trunk ?
> >>
> >> +  if (TREE_CODE (oprnd0) != SSA_NAME
> >> +      || TREE_CODE (oprnd1) != INTEGER_CST
> >> +      || TREE_CODE (itype) != INTEGER_TYPE
> >>
> >> INTEGRAL_TYPE_P (itype)
> >>
> >> +  optab = optab_for_tree_code (LSHIFT_EXPR, vectype, optab_vector);
> >> + if (!optab
> >> +      || optab_handler (optab, TYPE_MODE (vectype)) ==
> >> CODE_FOR_nothing)
> >> +       return NULL;
> >> +
> >>
> >> indent of the return stmt looks wrong
> >>
> >> +  /* Handle constant operands that are postive or negative powers of 2.
> >> + */  if ( wi::exact_log2 (oprnd1) != -1  ||
> >> +       wi::exact_log2 (wi::neg (oprnd1)) != -1)
> >>
> >> no space after (, || goes to the next line.
> >>
> >> +    {
> >> +      tree shift;
> >> +
> >> +      if (wi::exact_log2 (oprnd1) != -1)
> >>
> >> please cache wi::exact_log2
> >>
> >> in fact the first if () looks redundant if you simply put an else
> >> return NULL after a else if (wi::exact_log2 (wi::neg (oprnd1)) != -1)
> >>
> >> Note that the issue with INT_MIN is that wi::neg (INT_MIN) is INT_MIN
> >> again, but it seems that wi::exact_log2 returns -1 in that case so
> >> you are fine (and in fact not handling this case).
> >>
> >
> > I have updated your review comments in the attached patch.
> >
> > For the INT_MIN case, I am getting  vectorized output with the patch.   I
> believe x86_64 also vectorizes but does not negates the results.
> >
> > #include <limits.h>
> > unsigned long int  __attribute__ ((aligned (64)))arr[100];
> >
> > int i;
> > #if 1
> > void test_vector_shifts()
> > {
> >         for(i=0; i<=99;i++)
> >         arr[i]=arr[i] * INT_MIN;
> > }
> > #endif
> >
> > void test_vectorshift_via_mul()
> > {
> >         for(i=0; i<=99;i++)
> >         arr[i]=arr[i]*(-INT_MIN);
> >
> > }
> >
> > Before
> > ---------
> >         ldr     x1, [x0]
> >         neg     x1, x1, lsl 31
> >         str     x1, [x0], 8
> >         cmp     x0, x2
> >
> > After
> > -------
> >         ldr     q0, [x0]
> >         shl     v0.2d, v0.2d, 31
> >         neg     v0.2d, v0.2d
> >         str     q0, [x0], 16
> >         cmp     x1, x0
> >
> > is this fine ?
> 
> The interesting case is of course LONG_MIN if you are vectorizing a long
> multiplication.
> Also check with arr[] being 'long', not 'unsigned long'.  Note that with 
> 'long'
> doing
> arr[i]*(-LONG_MIN) invokes undefined behavior.
> 
> Richard.
 
I checked this case


#include<stdlib.h>
#include <limits.h>
long int __attribute__ ((aligned (64))) arr[100]={1};
long int i;
void test_vector_shifts();
void main()
{
        test_vector_shifts();
        if(!(arr[0] == (long int)(1*-LONG_MIN))) abort();
        printf("%x, %x", arr[0], (long int)(1*-LONG_MIN));
}

void test_vector_shifts()
{
        for(i=0; i<=99;i++)
        arr[i]=arr[i]*-LONG_MIN;
}

X86_64 vectorizes this and later converts to shifts 

.L2:
        vmovdqa (%rax), %xmm0
        addq    $16, %rax
        vpsllq  $63, %xmm0, %xmm0
        vmovaps %xmm0, -16(%rax)
        cmpq    $arr+800, %rax
        jne     .L2

On Aarch64 without my patch 
.L2:
        ldr     x1, [x0]
        lsl     x1, x1, 63
        str     x1, [x0], 8
        cmp     x0, x2
        bne     .L2

With my patch 

.L2:
        ldr     q0, [x0]
        shl     v0.2d, v0.2d, 63
        str     q0, [x0], 16
        cmp     x1, x0
        bne     .L2

All these cases outputs 0 for the above test case.


> 
> >  > Thanks,
> >> Richard.
> >>
> >> >>
> >> >>
> >> >>
> >> >> > @@ -2147,6 +2152,140 @@ vect_recog_vector_vector_shift_pattern
> >> >> (vec<gimple> *stmts,
> >> >> >     return pattern_stmt;
> >> >> >   }
> >> >> >
> >> >> > +/* Detect multiplication by constant which are postive or
> >> >> > +negatives of power 2,
> >> >> s/postive/positive/
> >> >>
> >> >>
> >> >> Jeff
> >> >
> >> > Regards,
> >> > Venkat.
> >> >

Regards,
Venkat.

RE: [RFC] [Patch]: Try and vectorize with shift for mult expr with power 2 integer constant.

Reply via email to