On Fri, Jun 29, 2018 at 11:21 AM Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> This patch adds detection of average instructions:
>
> a = (((wide) b + (wide) c) >> 1);
> --> a = (wide) .AVG_FLOOR (b, c);
>
> a = (((wide) b + (wide) c + 1) >> 1);
> --> a = (wide) .AVG_CEIL (b, c);
>
> in cases where users of "a" need only the low half of the result,
> making the cast to (wide) redundant.  The heavy lifting was done by
> earlier patches.
>
> This showed up another problem in vectorizable_call: if the call is a
> pattern definition statement rather than the main pattern statement,
> the type of vectorised call might be different from the type of the
> original statement.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
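
A minimal C sketch of the kind of loop the pattern is aimed at (modelled on the vect-avg-1.c test added below; the function name and bound are illustrative only, not part of the patch):

/* b[i] and c[i] are promoted to int, added and shifted right by one,
   but only the low byte of the result is stored.  */
void
average_floor (unsigned char *restrict a, unsigned char *restrict b,
               unsigned char *restrict c, int n)
{
  for (int i = 0; i < n; ++i)
    a[i] = (b[i] + c[i]) >> 1;    /* candidate for .AVG_FLOOR */
}

/* Adding 1 before the shift, a[i] = (b[i] + c[i] + 1) >> 1, gives the
   rounding-up form and a candidate for .AVG_CEIL.  */
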
OK.

Thanks,
Richard.

> Richard
>
>
> 2018-06-29 Richard Sandiford <richard.sandif...@arm.com>
>
> gcc/
> PR tree-optimization/85694
> * doc/md.texi (avgM3_floor, uavgM3_floor, avgM3_ceil)
> (uavgM3_ceil): Document new optabs.
> * doc/sourcebuild.texi (vect_avg_qi): Document new target selector.
> * internal-fn.def (IFN_AVG_FLOOR, IFN_AVG_CEIL): New internal
> functions.
> * optabs.def (savg_floor_optab, uavg_floor_optab, savg_ceil_optab)
> (uavg_ceil_optab): New optabs.
> * tree-vect-patterns.c (vect_recog_average_pattern): New function.
> (vect_vect_recog_func_ptrs): Add it.
> * tree-vect-stmts.c (vectorizable_call): Get the type of the zero
> constant directly from the associated lhs.
>
> gcc/testsuite/
> PR tree-optimization/85694
> * lib/target-supports.exp (check_effective_target_vect_avg_qi): New
> proc.
> * gcc.dg/vect/vect-avg-1.c: New test.
> * gcc.dg/vect/vect-avg-2.c: Likewise.
> * gcc.dg/vect/vect-avg-3.c: Likewise.
> * gcc.dg/vect/vect-avg-4.c: Likewise.
> * gcc.dg/vect/vect-avg-5.c: Likewise.
> * gcc.dg/vect/vect-avg-6.c: Likewise.
> * gcc.dg/vect/vect-avg-7.c: Likewise.
> * gcc.dg/vect/vect-avg-8.c: Likewise.
> * gcc.dg/vect/vect-avg-9.c: Likewise.
> * gcc.dg/vect/vect-avg-10.c: Likewise.
> * gcc.dg/vect/vect-avg-11.c: Likewise.
> * gcc.dg/vect/vect-avg-12.c: Likewise.
> * gcc.dg/vect/vect-avg-13.c: Likewise.
> * gcc.dg/vect/vect-avg-14.c: Likewise.
>
> Index: gcc/doc/md.texi
> ===================================================================
> --- gcc/doc/md.texi 2018-06-29 10:14:49.425353913 +0100
> +++ gcc/doc/md.texi 2018-06-29 10:16:31.936416331 +0100
> @@ -5599,6 +5599,34 @@ Other shift and rotate instructions, ana
> Vector shift and rotate instructions that take vectors as operand 2
> instead of a scalar type.
>
> +@cindex @code{avg@var{m}3_floor} instruction pattern
> +@cindex @code{uavg@var{m}3_floor} instruction pattern
> +@item @samp{avg@var{m}3_floor}
> +@itemx @samp{uavg@var{m}3_floor}
> +Signed and unsigned average instructions. These instructions add
> +operands 1 and 2 without truncation, divide the result by 2,
> +round towards -Inf, and store the result in operand 0. This is
> +equivalent to the C code:
> +@smallexample
> +narrow op0, op1, op2;
> +@dots{}
> +op0 = (narrow) (((wide) op1 + (wide) op2) >> 1);
> +@end smallexample
> +where the sign of @samp{narrow} determines whether this is a signed
> +or unsigned operation.
> +
> +@cindex @code{avg@var{m}3_ceil} instruction pattern
> +@cindex @code{uavg@var{m}3_ceil} instruction pattern
> +@item @samp{avg@var{m}3_ceil}
> +@itemx @samp{uavg@var{m}3_ceil}
> +Like @samp{avg@var{m}3_floor} and @samp{uavg@var{m}3_floor}, but round
> +towards +Inf. This is equivalent to the C code:
> +@smallexample
> +narrow op0, op1, op2;
> +@dots{}
> +op0 = (narrow) (((wide) op1 + (wide) op2 + 1) >> 1);
> +@end smallexample
> +
> @cindex @code{bswap@var{m}2} instruction pattern
> @item @samp{bswap@var{m}2}
> Reverse the order of bytes of operand 1 and store the result in operand 0.
> Index: gcc/doc/sourcebuild.texi
> ===================================================================
> --- gcc/doc/sourcebuild.texi 2018-06-14 12:27:24.156171818 +0100
> +++ gcc/doc/sourcebuild.texi 2018-06-29 10:16:31.936416331 +0100
> @@ -1417,6 +1417,10 @@ Target supports Fortran @code{real} kind
> The target's ABI allows stack variables to be aligned to the preferred
> vector alignment.
>
> +@item vect_avg_qi
> +Target supports both signed and unsigned averaging operations on vectors
> +of bytes.
> +
> @item vect_condition
> Target supports vector conditional operations.
>
> Index: gcc/internal-fn.def
> ===================================================================
> --- gcc/internal-fn.def 2018-06-14 12:27:34.108084438 +0100
> +++ gcc/internal-fn.def 2018-06-29 10:16:31.936416331 +0100
> @@ -143,6 +143,11 @@ DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, f
> DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
> DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
>
> +DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, first,
> + savg_floor, uavg_floor, binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first,
> + savg_ceil, uavg_ceil, binary)
> +
> DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary)
> DEF_INTERNAL_OPTAB_FN (COND_SUB, ECF_CONST, cond_sub, cond_binary)
> DEF_INTERNAL_OPTAB_FN (COND_MUL, ECF_CONST, cond_smul, cond_binary)
> Index: gcc/optabs.def
> ===================================================================
> --- gcc/optabs.def 2018-06-14 12:27:24.852165699 +0100
> +++ gcc/optabs.def 2018-06-29 10:16:31.936416331 +0100
> @@ -316,6 +316,10 @@ OPTAB_D (fold_left_plus_optab, "fold_lef
> OPTAB_D (extract_last_optab, "extract_last_$a")
> OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
>
> +OPTAB_D (savg_floor_optab, "avg$a3_floor")
> +OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
> +OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
> +OPTAB_D (uavg_ceil_optab, "uavg$a3_ceil")
> OPTAB_D (sdot_prod_optab, "sdot_prod$I$a")
> OPTAB_D (ssum_widen_optab, "widen_ssum$I$a3")
> OPTAB_D (udot_prod_optab, "udot_prod$I$a")
> Index: gcc/tree-vect-patterns.c
> ===================================================================
> --- gcc/tree-vect-patterns.c 2018-06-29 10:16:18.472539356 +0100
> +++ gcc/tree-vect-patterns.c 2018-06-29 10:16:31.940416295 +0100
> @@ -1721,6 +1721,153 @@ vect_recog_over_widening_pattern (vec<gi
> return pattern_stmt;
> }
>
> +/* Recognize the patterns:
> +
> + ATYPE a; // narrower than TYPE
> + BTYPE b; // narrower than TYPE
> + (1) TYPE avg = ((TYPE) a + (TYPE) b) >> 1;
> + or (2) TYPE avg = ((TYPE) a + (TYPE) b + 1) >> 1;
> +
> + where only the bottom half of avg is used. Try to transform them into:
> +
> + (1) NTYPE avg' = .AVG_FLOOR ((NTYPE) a, (NTYPE) b);
> + or (2) NTYPE avg' = .AVG_CEIL ((NTYPE) a, (NTYPE) b);
> +
> + followed by:
> +
> + TYPE avg = (TYPE) avg';
> +
> + where NTYPE is no wider than half of TYPE. Since only the bottom half
> + of avg is used, all or part of the cast of avg' should become redundant.
> */
> +
> +static gimple *
> +vect_recog_average_pattern (vec<gimple *> *stmts, tree *type_out)
> +{
> + /* Check for a shift right by one bit. */
> + gassign *last_stmt = dyn_cast <gassign *> (stmts->pop ());
> + if (!last_stmt
> + || gimple_assign_rhs_code (last_stmt) != RSHIFT_EXPR
> + || !integer_onep (gimple_assign_rhs2 (last_stmt)))
> + return NULL;
> +
> + stmt_vec_info last_stmt_info = vinfo_for_stmt (last_stmt);
> + vec_info *vinfo = last_stmt_info->vinfo;
> +
> + /* Check that the shift result is wider than the users of the
> + result need (i.e. that narrowing would be a natural choice). */
> + tree lhs = gimple_assign_lhs (last_stmt);
> + tree type = TREE_TYPE (lhs);
> + unsigned int target_precision
> + = vect_element_precision (last_stmt_info->min_output_precision);
> + if (!INTEGRAL_TYPE_P (type) || target_precision >= TYPE_PRECISION (type))
> + return NULL;
> +
> + /* Get the definition of the shift input. */
> + tree rshift_rhs = gimple_assign_rhs1 (last_stmt);
> + stmt_vec_info plus_stmt_info = vect_get_internal_def (vinfo, rshift_rhs);
> + if (!plus_stmt_info)
> + return NULL;
> +
> + /* Check whether the shift input can be seen as a tree of additions on
> + 2 or 3 widened inputs.
> +
> + Note that the pattern should be a win even if the result of one or
> + more additions is reused elsewhere: if the pattern matches, we'd be
> + replacing 2N RSHIFT_EXPRs and N VEC_PACK_*s with N IFN_AVG_*s. */
> + internal_fn ifn = IFN_AVG_FLOOR;
> + vect_unpromoted_value unprom[3];
> + tree new_type;
> + unsigned int nops = vect_widened_op_tree (plus_stmt_info, PLUS_EXPR,
> + PLUS_EXPR, false, 3,
> + unprom, &new_type);
> + if (nops == 0)
> + return NULL;
> + if (nops == 3)
> + {
> + /* Check that one operand is 1. */
> + unsigned int i;
> + for (i = 0; i < 3; ++i)
> + if (integer_onep (unprom[i].op))
> + break;
> + if (i == 3)
> + return NULL;
> + /* Throw away the 1 operand and keep the other two. */
> + if (i < 2)
> + unprom[i] = unprom[2];
> + ifn = IFN_AVG_CEIL;
> + }
> +
> + vect_pattern_detected ("vect_recog_average_pattern", last_stmt);
> +
> + /* We know that:
> +
> + (a) the operation can be viewed as:
> +
> + TYPE widened0 = (TYPE) UNPROM[0];
> + TYPE widened1 = (TYPE) UNPROM[1];
> + TYPE tmp1 = widened0 + widened1 {+ 1};
> + TYPE tmp2 = tmp1 >> 1; // LAST_STMT_INFO
> +
> + (b) the first two statements are equivalent to:
> +
> + TYPE widened0 = (TYPE) (NEW_TYPE) UNPROM[0];
> + TYPE widened1 = (TYPE) (NEW_TYPE) UNPROM[1];
> +
> + (c) vect_recog_over_widening_pattern has already tried to narrow TYPE
> + where sensible;
> +
> + (d) all the operations can be performed correctly at twice the width of
> + NEW_TYPE, due to the nature of the average operation; and
> +
> + (e) users of the result of the right shift need only TARGET_PRECISION
> + bits, where TARGET_PRECISION is no more than half of TYPE's
> + precision.
> +
> + Under these circumstances, the only situation in which NEW_TYPE
> + could be narrower than TARGET_PRECISION is if widened0, widened1
> + and an addition result are all used more than once. Thus we can
> + treat any widening of UNPROM[0] and UNPROM[1] to TARGET_PRECISION
> + as "free", whereas widening the result of the average instruction
> + from NEW_TYPE to TARGET_PRECISION would be a new operation. It's
> + therefore better not to go narrower than TARGET_PRECISION. */
> + if (TYPE_PRECISION (new_type) < target_precision)
> + new_type = build_nonstandard_integer_type (target_precision,
> + TYPE_UNSIGNED (new_type));
> +
> + /* Check for target support. */
> + tree new_vectype = get_vectype_for_scalar_type (new_type);
> + if (!new_vectype
> + || !direct_internal_fn_supported_p (ifn, new_vectype,
> + OPTIMIZE_FOR_SPEED))
> + return NULL;
> +
> + /* The IR requires a valid vector type for the cast result, even though
> + it's likely to be discarded. */
> + *type_out = get_vectype_for_scalar_type (type);
> + if (!*type_out)
> + return NULL;
> +
> + /* Generate the IFN_AVG* call. */
> + tree new_var = vect_recog_temp_ssa_var (new_type, NULL);
> + tree new_ops[2];
> + vect_convert_inputs (last_stmt_info, 2, new_ops, new_type,
> + unprom, new_vectype);
> + gcall *average_stmt = gimple_build_call_internal (ifn, 2, new_ops[0],
> + new_ops[1]);
> + gimple_call_set_lhs (average_stmt, new_var);
> + gimple_set_location (average_stmt, gimple_location (last_stmt));
> +
> + if (dump_enabled_p ())
> + {
> + dump_printf_loc (MSG_NOTE, vect_location,
> + "created pattern stmt: ");
> + dump_gimple_stmt (MSG_NOTE, TDF_SLIM, average_stmt, 0);
> + }
> +
> + stmts->safe_push (last_stmt);
> + return vect_convert_output (last_stmt_info, type, average_stmt,
> new_vectype);
> +}
> +
> /* Recognize cases in which the input to a cast is wider than its
> output, and the input is fed by a widening operation. Fold this
> by removing the unnecessary intermediate widening. E.g.:
> @@ -4670,6 +4817,9 @@ struct vect_recog_func
> less comples onex (widen_sum only after dot_prod or sad for example). */
> static vect_recog_func vect_vect_recog_func_ptrs[] = {
> { vect_recog_over_widening_pattern, "over_widening" },
> + /* Must come after over_widening, which narrows the shift as much as
> + possible beforehand. */
> + { vect_recog_average_pattern, "average" },
> { vect_recog_cast_forwprop_pattern, "cast_forwprop" },
> { vect_recog_widen_mult_pattern, "widen_mult" },
> { vect_recog_dot_prod_pattern, "dot_prod" },
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c 2018-06-29 10:14:52.941321720 +0100
> +++ gcc/tree-vect-stmts.c 2018-06-29 10:16:31.940416295 +0100
> @@ -3116,7 +3116,7 @@ vectorizable_call (gimple *gs, gimple_st
> gcall *stmt;
> tree vec_dest;
> tree scalar_dest;
> - tree op, type;
> + tree op;
> tree vec_oprnd0 = NULL_TREE, vec_oprnd1 = NULL_TREE;
> stmt_vec_info stmt_info = vinfo_for_stmt (gs), prev_stmt_info;
> tree vectype_out, vectype_in;
> @@ -3592,12 +3592,11 @@ vectorizable_call (gimple *gs, gimple_st
> if (slp_node)
> return true;
>
> - type = TREE_TYPE (scalar_dest);
> if (is_pattern_stmt_p (stmt_info))
> stmt_info = vinfo_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info));
> lhs = gimple_get_lhs (stmt_info->stmt);
>
> - new_stmt = gimple_build_assign (lhs, build_zero_cst (type));
> + new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
> set_vinfo_for_stmt (new_stmt, stmt_info);
> set_vinfo_for_stmt (stmt_info->stmt, NULL);
> STMT_VINFO_STMT (stmt_info) = new_stmt;
> Index: gcc/testsuite/lib/target-supports.exp
> ===================================================================
> --- gcc/testsuite/lib/target-supports.exp 2018-06-27 10:27:09.358654355
> +0100
> +++ gcc/testsuite/lib/target-supports.exp 2018-06-29 10:16:31.940416295
> +0100
> @@ -6313,6 +6313,13 @@ proc check_effective_target_vect_usad_ch
> return $et_vect_usad_char_saved($et_index)
> }
>
> +# Return 1 if the target plus current options supports both signed
> +# and unsigned average operations on vectors of bytes.
> +
> +proc check_effective_target_vect_avg_qi {} {
> + return 0
> +}
> +
> # Return 1 if the target plus current options supports a vector
> # demotion (packing) of shorts (to chars) and ints (to shorts)
> # using modulo arithmetic, 0 otherwise.
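
Point (d) of the comment in vect_recog_average_pattern above -- that the addition and shift can be done correctly at twice the width of the narrow inputs -- can be checked exhaustively for byte elements.  The following stand-alone program only illustrates that property; it is not part of the patch and its names are arbitrary:

#include <stdint.h>
#include <stdlib.h>

int
main (void)
{
  for (unsigned int a = 0; a < 256; ++a)
    for (unsigned int b = 0; b < 256; ++b)
      {
        /* What the scalar source computes: add and shift in int.  */
        uint8_t wide_floor = (uint8_t) ((a + b) >> 1);
        uint8_t wide_ceil = (uint8_t) ((a + b + 1) >> 1);
        /* The same computation restricted to 16 bits, i.e. twice the
           width of the byte inputs; the 9-bit intermediate sum still
           fits, so the stored byte cannot change.  */
        uint8_t half_floor = (uint8_t) ((uint16_t) (a + b) >> 1);
        uint8_t half_ceil = (uint8_t) ((uint16_t) (a + b + 1) >> 1);
        if (wide_floor != half_floor || wide_ceil != half_ceil)
          abort ();
      }
  return 0;
}
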
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-1.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-1.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,47 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include "tree-vect.h"
> +
> +#define N 50
> +
> +#ifndef SIGNEDNESS
> +#define SIGNEDNESS unsigned
> +#endif
> +#ifndef BIAS
> +#define BIAS 0
> +#endif
> +
> +void __attribute__ ((noipa))
> +f (SIGNEDNESS char *restrict a, SIGNEDNESS char *restrict b,
> + SIGNEDNESS char *restrict c)
> +{
> + for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> + a[i] = (b[i] + c[i] + BIAS) >> 1;
> +}
> +
> +#define BASE1 ((SIGNEDNESS int) -1 < 0 ? -126 : 4)
> +#define BASE2 ((SIGNEDNESS int) -1 < 0 ? -101 : 26)
> +
> +int
> +main (void)
> +{
> + check_vect ();
> +
> + SIGNEDNESS char a[N], b[N], c[N];
> + for (int i = 0; i < N; ++i)
> + {
> + b[i] = BASE1 + i * 5;
> + c[i] = BASE2 + i * 4;
> + asm volatile ("" ::: "memory");
> + }
> + f (a, b, c);
> + for (int i = 0; i < N; ++i)
> + if (a[i] != ((BASE1 + BASE2 + i * 9 + BIAS) >> 1))
> + __builtin_abort ();
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_FLOOR} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-2.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-2.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,10 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS signed
> +
> +#include "vect-avg-1.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_FLOOR} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-3.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-3.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,11 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS unsigned
> +#define BIAS 1
> +
> +#include "vect-avg-1.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_CEIL} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-4.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-4.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,11 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS signed
> +#define BIAS 1
> +
> +#include "vect-avg-1.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_CEIL} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-5.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-5.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,51 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include "tree-vect.h"
> +
> +#define N 50
> +
> +#ifndef SIGNEDNESS
> +#define SIGNEDNESS unsigned
> +#endif
> +#ifndef BIAS
> +#define BIAS 0
> +#endif
> +
> +void __attribute__ ((noipa))
> +f (SIGNEDNESS char *restrict a, SIGNEDNESS char *restrict b,
> + SIGNEDNESS char *restrict c)
> +{
> + for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> + {
> + int tmp1 = b[i] + BIAS;
> + int tmp2 = tmp1 + c[i];
> + a[i] = tmp2 >> 1;
> + }
> +}
> +
> +#define BASE1 ((SIGNEDNESS int) -1 < 0 ? -126 : 4)
> +#define BASE2 ((SIGNEDNESS int) -1 < 0 ? -101 : 26)
> +
> +int
> +main (void)
> +{
> + check_vect ();
> +
> + SIGNEDNESS char a[N], b[N], c[N];
> + for (int i = 0; i < N; ++i)
> + {
> + b[i] = BASE1 + i * 5;
> + c[i] = BASE2 + i * 4;
> + asm volatile ("" ::: "memory");
> + }
> + f (a, b, c);
> + for (int i = 0; i < N; ++i)
> + if (a[i] != ((BASE1 + BASE2 + i * 9 + BIAS) >> 1))
> + __builtin_abort ();
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_FLOOR} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-6.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-6.c 2018-06-29 10:16:31.940416295
> +0100
> @@ -0,0 +1,10 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS signed
> +
> +#include "vect-avg-5.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_FLOOR} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-7.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-7.c 2018-06-29 10:16:31.940416295
> +0100
> @@ -0,0 +1,11 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS unsigned
> +#define BIAS 1
> +
> +#include "vect-avg-5.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_CEIL} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-8.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-8.c 2018-06-29 10:16:31.940416295
> +0100
> @@ -0,0 +1,11 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS signed
> +#define BIAS 1
> +
> +#include "vect-avg-5.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_CEIL} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-9.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-9.c 2018-06-29 10:16:31.940416295
> +0100
> @@ -0,0 +1,8 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS unsigned
> +#define BIAS 2
> +
> +#include "vect-avg-5.c"
> +
> +/* { dg-final { scan-tree-dump-not "vect_recog_average_pattern: detected"
> "vect" } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-10.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-10.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,8 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS signed
> +#define BIAS 2
> +
> +#include "vect-avg-5.c"
> +
> +/* { dg-final { scan-tree-dump-not "vect_recog_average_pattern: detected"
> "vect" } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-11.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-11.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,57 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include "tree-vect.h"
> +
> +#define N 50
> +
> +#ifndef SIGNEDNESS
> +#define SIGNEDNESS unsigned
> +#endif
> +#ifndef BIAS
> +#define BIAS 0
> +#endif
> +
> +void __attribute__ ((noipa))
> +f (SIGNEDNESS char *restrict a, SIGNEDNESS char *restrict b,
> + SIGNEDNESS char *restrict c)
> +{
> + for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> + {
> + int tmp = b[i];
> + tmp ^= 0x55;
> + tmp += BIAS;
> + tmp += c[i];
> + tmp >>= 1;
> + tmp |= 0x40;
> + a[i] = tmp;
> + }
> +}
> +
> +#define BASE1 ((SIGNEDNESS int) -1 < 0 ? -126 : 4)
> +#define BASE2 ((SIGNEDNESS int) -1 < 0 ? -101 : 26)
> +
> +int
> +main (void)
> +{
> + check_vect ();
> +
> + SIGNEDNESS char a[N], b[N], c[N];
> + for (int i = 0; i < N; ++i)
> + {
> + b[i] = BASE1 + i * 5;
> + c[i] = BASE2 + i * 4;
> + asm volatile ("" ::: "memory");
> + }
> + f (a, b, c);
> + for (int i = 0; i < N; ++i)
> + if (a[i] != (((((BASE1 + i * 5) ^ 0x55)
> + + (BASE2 + i * 4)
> + + BIAS) >> 1) | 0x40))
> + __builtin_abort ();
> + return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_FLOOR} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-12.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-12.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,10 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS signed
> +
> +#include "vect-avg-11.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_FLOOR} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-13.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-13.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,11 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS unsigned
> +#define BIAS 1
> +
> +#include "vect-avg-11.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_CEIL} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-avg-14.c
> ===================================================================
> --- /dev/null 2018-06-13 14:36:57.192460992 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-avg-14.c 2018-06-29 10:16:31.936416331
> +0100
> @@ -0,0 +1,11 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#define SIGNEDNESS signed
> +#define BIAS 1
> +
> +#include "vect-avg-11.c"
> +
> +/* { dg-final { scan-tree-dump "vect_recog_average_pattern: detected" "vect"
> } } */
> +/* { dg-final { scan-tree-dump {\.AVG_CEIL} "vect" { target vect_avg_qi } }
> } */
> +/* { dg-final { scan-tree-dump-not {vector\([^\n]*short} "vect" { target
> vect_avg_qi } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target
> vect_avg_qi } } } */
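
One note on the BIAS test matrix above: BIAS 0 exercises the floor form and BIAS 1 the ceiling form, while with BIAS 2 the expression is no longer an average of the two inputs at all, which is why vect-avg-9.c and vect-avg-10.c check that the pattern is not detected.  A tiny scalar illustration, not part of the patch:

#include <stdio.h>

int
main (void)
{
  unsigned char b = 0, c = 0;
  int bias0 = (b + c) >> 1;      /* 0: matches the floor average */
  int bias1 = (b + c + 1) >> 1;  /* 0: matches the ceiling average */
  int bias2 = (b + c + 2) >> 1;  /* 1: larger than both inputs, so not
                                    .AVG_FLOOR or .AVG_CEIL of b and c */
  printf ("%d %d %d\n", bias0, bias1, bias2);
  return 0;
}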