On Wed, Jun 7, 2017 at 4:14 PM, Bill Schmidt <wschm...@linux.vnet.ibm.com> wrote:
>
>> On Jun 6, 2017, at 11:37 AM, Will Schmidt <will_schm...@vnet.ibm.com> wrote:
>>
>> On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote:
>>>> On Jun 1, 2017, at 2:48 AM, Richard Biener <richard.guent...@gmail.com>
>>>> wrote:
>>>>
>>>> On Wed, May 31, 2017 at 10:01 PM, Will Schmidt
>>>> <will_schm...@vnet.ibm.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Add support for early expansion of vector shifts, including
>>>>> vec_sl (shift left), vec_sr (shift right), vec_sra (shift
>>>>> right algebraic), and vec_rl (rotate left).
>>>>> Part of this includes adding the vector shift right instructions to
>>>>> the list of those instructions having an unsigned second argument.
>>>>>
>>>>> The VSR (vector shift right) folding is a bit more complex than
>>>>> the others.  This is because arg0 must be converted to unsigned before
>>>>> the gimple RSHIFT_EXPR assignment is built, so that a logical rather
>>>>> than an algebraic shift is performed.
>>>>
>>>> Jakub, do we sanitize the undefinedness of left shifts of negative values
>>>> and/or the overflow of left shifts of nonnegative values?
>>
>>
>> On Thu, 2017-06-01 at 10:17 +0200, Jakub Jelinek wrote:
>>> We don't yet, see PR77823 - all I've managed to do before stage1 was over
>>> was instrumentation of signed arithmetic integer overflow on vectors;
>>> division, shift, etc. are tasks maybe for this stage1.
>>>
>>> That said, shift instrumentation in particular is done early because every
>>> FE has different rules, so if it is coming from target builtins that are
>>> folded into something, it wouldn't be instrumented anyway.
>>
>>
>> On Thu, 2017-06-01 at 10:15 -0500, Bill Schmidt wrote:
>>>>
>>>> Will, how is that defined in the intrinsics operation?  It might need
>>>> similar treatment to the abs case.
>>>
>>> Answering for Will -- vec_sl is defined to simply shift bits off the end
>>> to the left and fill with zeros from the right, regardless of whether the
>>> source type is signed or unsigned.  The result type is signed iff the
>>> source type is signed.  So a negative value can become positive as a
>>> result of the operation.
>>>
>>> The same is true of vec_rl, which will naturally rotate bits regardless of
>>> signedness.
>>
>>
>>>>
>>>> [I'd rather make the negative left shift case implementation-defined,
>>>> given that the C and C++ standards do not agree to 100%, AFAIK.]
>>
>> With the above answers, how does this one stand?
>>
>> [ I have no issue adding the TYPE_OVERFLOW_WRAPS logic to treat some of
>> the cases differently, I'm just unclear on whether none/some/all of the
>> shifts will require that logic. :-) ]
>
> I have to defer to Richard here, I don't know the subtleties well enough.
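To make the vec_sl semantics described above concrete, here is a small
illustrative example (not part of the patch; the values, the use of GCC's
vector subscripting, and the printed output are for demonstration only).
Shifting left zero-fills from the right, so a negative element can come
out positive:

    #include <altivec.h>
    #include <stdio.h>

    /* Illustration only: build with e.g. gcc -maltivec on a PowerPC target.  */
    int
    main (void)
    {
      /* 0x80000001 is -2147483647 as a signed int.  */
      vector signed int x = { (int) 0x80000001, -1, -4, 16 };
      vector unsigned int s = { 1, 1, 1, 1 };
      /* Each element is shifted left by one: the sign bit is shifted out
         and a zero shifted in, so element 0 becomes 0x00000002.  */
      vector signed int r = vec_sl (x, s);
      printf ("%d %d %d %d\n", r[0], r[1], r[2], r[3]);  /* prints: 2 -2 -8 32 */
      return 0;
    }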
I'd say play safe and guard folding of left shifts with TYPE_OVERFLOW_WRAPS.
A sketch of what such a guard could look like follows at the end of this
mail, after the quoted patch.

Richard.

> Bill
>
>> thanks,
>> -Will
>>
>>>> Richard.
>>>>
>>>>> [gcc]
>>>>>
>>>>> 2017-05-26  Will Schmidt  <will_schm...@vnet.ibm.com>
>>>>>
>>>>>     * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add handling
>>>>>     for early expansion of vector shifts (sl, sr, sra, rl).
>>>>>     (builtin_function_type): Add vector shift right instructions
>>>>>     to the unsigned argument list.
>>>>>
>>>>> [gcc/testsuite]
>>>>>
>>>>> 2017-05-26  Will Schmidt  <will_schm...@vnet.ibm.com>
>>>>>
>>>>>     * gcc.target/powerpc/fold-vec-shift-char.c: New.
>>>>>     * gcc.target/powerpc/fold-vec-shift-int.c: New.
>>>>>     * gcc.target/powerpc/fold-vec-shift-longlong.c: New.
>>>>>     * gcc.target/powerpc/fold-vec-shift-short.c: New.
>>>>>
>>>>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>>>>> index 8adbc06..6ee0bfd 100644
>>>>> --- a/gcc/config/rs6000/rs6000.c
>>>>> +++ b/gcc/config/rs6000/rs6000.c
>>>>> @@ -17408,6 +17408,76 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>>>>>          gsi_replace (gsi, g, true);
>>>>>          return true;
>>>>>        }
>>>>> +    /* Flavors of vec_rotate_left.  */
>>>>> +    case ALTIVEC_BUILTIN_VRLB:
>>>>> +    case ALTIVEC_BUILTIN_VRLH:
>>>>> +    case ALTIVEC_BUILTIN_VRLW:
>>>>> +    case P8V_BUILTIN_VRLD:
>>>>> +      {
>>>>> +        arg0 = gimple_call_arg (stmt, 0);
>>>>> +        arg1 = gimple_call_arg (stmt, 1);
>>>>> +        lhs = gimple_call_lhs (stmt);
>>>>> +        gimple *g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
>>>>> +        gimple_set_location (g, gimple_location (stmt));
>>>>> +        gsi_replace (gsi, g, true);
>>>>> +        return true;
>>>>> +      }
>>>>> +    /* Flavors of vector shift right algebraic.  vec_sra{b,h,w} -> vsra{b,h,w}.  */
>>>>> +    case ALTIVEC_BUILTIN_VSRAB:
>>>>> +    case ALTIVEC_BUILTIN_VSRAH:
>>>>> +    case ALTIVEC_BUILTIN_VSRAW:
>>>>> +    case P8V_BUILTIN_VSRAD:
>>>>> +      {
>>>>> +        arg0 = gimple_call_arg (stmt, 0);
>>>>> +        arg1 = gimple_call_arg (stmt, 1);
>>>>> +        lhs = gimple_call_lhs (stmt);
>>>>> +        gimple *g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
>>>>> +        gimple_set_location (g, gimple_location (stmt));
>>>>> +        gsi_replace (gsi, g, true);
>>>>> +        return true;
>>>>> +      }
>>>>> +    /* Flavors of vector shift left.  builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
>>>>> +    case ALTIVEC_BUILTIN_VSLB:
>>>>> +    case ALTIVEC_BUILTIN_VSLH:
>>>>> +    case ALTIVEC_BUILTIN_VSLW:
>>>>> +    case P8V_BUILTIN_VSLD:
>>>>> +      {
>>>>> +        arg0 = gimple_call_arg (stmt, 0);
>>>>> +        arg1 = gimple_call_arg (stmt, 1);
>>>>> +        lhs = gimple_call_lhs (stmt);
>>>>> +        gimple *g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
>>>>> +        gimple_set_location (g, gimple_location (stmt));
>>>>> +        gsi_replace (gsi, g, true);
>>>>> +        return true;
>>>>> +      }
>>>>> +    /* Flavors of vector shift right.  */
>>>>> +    case ALTIVEC_BUILTIN_VSRB:
>>>>> +    case ALTIVEC_BUILTIN_VSRH:
>>>>> +    case ALTIVEC_BUILTIN_VSRW:
>>>>> +    case P8V_BUILTIN_VSRD:
>>>>> +      {
>>>>> +        arg0 = gimple_call_arg (stmt, 0);
>>>>> +        arg1 = gimple_call_arg (stmt, 1);
>>>>> +        lhs = gimple_call_lhs (stmt);
>>>>> +        gimple *g;
>>>>> +        /* Convert arg0 to unsigned.  */
>>>>> +        arg0 = convert (unsigned_type_for (TREE_TYPE (arg0)), arg0);
>>>>> +        tree arg0_uns = create_tmp_reg_or_ssa_name (unsigned_type_for (TREE_TYPE (arg0)));
>>>>> +        g = gimple_build_assign (arg0_uns, arg0);
>>>>> +        gimple_set_location (g, gimple_location (stmt));
>>>>> +        gsi_insert_before (gsi, g, GSI_SAME_STMT);
>>>>> +        /* Convert lhs to unsigned and do the shift.  */
>>>>> +        tree lhs_uns = create_tmp_reg_or_ssa_name (unsigned_type_for (TREE_TYPE (lhs)));
>>>>> +        g = gimple_build_assign (lhs_uns, RSHIFT_EXPR, arg0_uns, arg1);
>>>>> +        gimple_set_location (g, gimple_location (stmt));
>>>>> +        gsi_insert_before (gsi, g, GSI_SAME_STMT);
>>>>> +        /* Convert lhs back to a signed type for the return.  */
>>>>> +        lhs_uns = convert (signed_type_for (TREE_TYPE (lhs)), lhs_uns);
>>>>> +        g = gimple_build_assign (lhs, lhs_uns);
>>>>> +        gimple_set_location (g, gimple_location (stmt));
>>>>> +        gsi_replace (gsi, g, true);
>>>>> +        return true;
>>>>> +      }
>>>>>      default:
>>>>>        break;
>>>>>      }
>>>>> @@ -19128,6 +19198,14 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
>>>>>        h.uns_p[2] = 1;
>>>>>        break;
>>>>>
>>>>> +      /* unsigned second arguments (vector shift right).  */
>>>>> +    case ALTIVEC_BUILTIN_VSRB:
>>>>> +    case ALTIVEC_BUILTIN_VSRH:
>>>>> +    case ALTIVEC_BUILTIN_VSRW:
>>>>> +    case P8V_BUILTIN_VSRD:
>>>>> +      h.uns_p[2] = 1;
>>>>> +      break;
>>>>> +
>>>>>      default:
>>>>>        break;
>>>>>      }
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c
>>>>> new file mode 100644
>>>>> index 0000000..ebe91e7
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-char.c
>>>>> @@ -0,0 +1,66 @@
>>>>> +/* Verify that overloaded built-ins for vec_sl with char
>>>>> +   inputs produce the right results.  */
>>>>> +
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>>>>> +/* { dg-options "-maltivec -O2" } */
>>>>> +
>>>>> +#include <altivec.h>
>>>>> +
>>>>> +//# vec_sl  - shift left
>>>>> +//# vec_sr  - shift right
>>>>> +//# vec_sra - shift right algebraic
>>>>> +//# vec_rl  - rotate left
>>>>> +
>>>>> +vector signed char
>>>>> +testsl_signed (vector signed char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned char
>>>>> +testsl_unsigned (vector unsigned char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed char
>>>>> +testsr_signed (vector signed char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned char
>>>>> +testsr_unsigned (vector unsigned char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed char
>>>>> +testsra_signed (vector signed char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned char
>>>>> +testsra_unsigned (vector unsigned char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed char
>>>>> +testrl_signed (vector signed char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned char
>>>>> +testrl_unsigned (vector unsigned char x, vector unsigned char y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-times "vslb" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsrb" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsrab" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vrlb" 2 } } */
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-int.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-int.c
>>>>> new file mode 100644
>>>>> index 0000000..e9c5fe1
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-int.c
>>>>> @@ -0,0 +1,61 @@
>>>>> +/* Verify that overloaded built-ins for vec_sl with int
>>>>> +   inputs produce the right results.  */
>>>>> +
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>>>>> +/* { dg-options "-maltivec -O2" } */
>>>>> +
>>>>> +#include <altivec.h>
>>>>> +
>>>>> +vector signed int
>>>>> +testsl_signed (vector signed int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned int
>>>>> +testsl_unsigned (vector unsigned int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed int
>>>>> +testsr_signed (vector signed int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned int
>>>>> +testsr_unsigned (vector unsigned int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed int
>>>>> +testsra_signed (vector signed int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned int
>>>>> +testsra_unsigned (vector unsigned int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed int
>>>>> +testrl_signed (vector signed int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned int
>>>>> +testrl_unsigned (vector unsigned int x, vector unsigned int y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-times "vslw" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsrw" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsraw" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vrlw" 2 } } */
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c
>>>>> new file mode 100644
>>>>> index 0000000..97b82cf
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-longlong.c
>>>>> @@ -0,0 +1,63 @@
>>>>> +/* Verify that overloaded built-ins for vec_sl with long long
>>>>> +   inputs produce the right results.  */
>>>>> +
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-require-effective-target powerpc_p8vector_ok } */
>>>>> +/* { dg-options "-mpower8-vector -O2" } */
>>>>> +
>>>>> +#include <altivec.h>
>>>>> +
>>>>> +vector signed long long
>>>>> +testsl_signed (vector signed long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned long long
>>>>> +testsl_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed long long
>>>>> +testsr_signed (vector signed long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned long long
>>>>> +testsr_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed long long
>>>>> +testsra_signed (vector signed long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +/* watch for PR 79544 here (vsrd / vsrad issue) */
>>>>> +vector unsigned long long
>>>>> +testsra_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed long long
>>>>> +testrl_signed (vector signed long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned long long
>>>>> +testrl_unsigned (vector unsigned long long x, vector unsigned long long y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-times "vsld" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsrd" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsrad" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vrld" 2 } } */
>>>>> +
>>>>> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-short.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-short.c
>>>>> new file mode 100644
>>>>> index 0000000..4ca7c18
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-shift-short.c
>>>>> @@ -0,0 +1,61 @@
>>>>> +/* Verify that overloaded built-ins for vec_sl with short
>>>>> +   inputs produce the right results.  */
>>>>> +
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>>>>> +/* { dg-options "-maltivec -O2" } */
>>>>> +
>>>>> +#include <altivec.h>
>>>>> +
>>>>> +vector signed short
>>>>> +testsl_signed (vector signed short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned short
>>>>> +testsl_unsigned (vector unsigned short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_sl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed short
>>>>> +testsr_signed (vector signed short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned short
>>>>> +testsr_unsigned (vector unsigned short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_sr (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed short
>>>>> +testsra_signed (vector signed short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned short
>>>>> +testsra_unsigned (vector unsigned short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_sra (x, y);
>>>>> +}
>>>>> +
>>>>> +vector signed short
>>>>> +testrl_signed (vector signed short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +vector unsigned short
>>>>> +testrl_unsigned (vector unsigned short x, vector unsigned short y)
>>>>> +{
>>>>> +  return vec_rl (x, y);
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-assembler-times "vslh" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsrh" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vsrah" 2 } } */
>>>>> +/* { dg-final { scan-assembler-times "vrlh" 2 } } */
>
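As promised above, a minimal sketch of what guarding the left-shift folding
with TYPE_OVERFLOW_WRAPS could look like, using the vec_sl cases from the
quoted patch.  This is illustrative only, not the final change: the exact
predicate (checking the vector element type) and bailing out with
"return false" so the builtin is left for later expansion are assumptions,
and the fragment is meant to sit inside the existing switch in
rs6000_gimple_fold_builtin.

    /* Sketch only: fold vec_sl to LSHIFT_EXPR only when the element type
       has wrapping overflow, so the folding never introduces a signed
       left shift that would be undefined at the GIMPLE level.  */
    case ALTIVEC_BUILTIN_VSLB:
    case ALTIVEC_BUILTIN_VSLH:
    case ALTIVEC_BUILTIN_VSLW:
    case P8V_BUILTIN_VSLD:
      {
        arg0 = gimple_call_arg (stmt, 0);
        tree arg0_type = TREE_TYPE (arg0);
        /* Assumption: checking the vector element type is sufficient.  */
        if (!INTEGRAL_TYPE_P (TREE_TYPE (arg0_type))
            || !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0_type)))
          return false;  /* Leave the builtin alone.  */
        arg1 = gimple_call_arg (stmt, 1);
        lhs = gimple_call_lhs (stmt);
        gimple *g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, arg1);
        gimple_set_location (g, gimple_location (stmt));
        gsi_replace (gsi, g, true);
        return true;
      }

With -fwrapv (or an unsigned element type) the folding still fires; for
plain signed element types it simply declines, which matches the
"play safe" suggestion above.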