Hi Prathamesh,

I've just committed a patch that fixes a large number of SVE
reduction-related failures.  Could you rebase and retest on top of
that?  Sorry for messing you around, but regression testing based on
the state before the patch wouldn't have been that meaningful.
In particular...
Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index a70d52eb2ca..82814e2c2af 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -6428,6 +6428,7 @@ vectorizable_reduction (stmt_vec_info stmt_info, slp_tree slp_node,
>    if (loop_vinfo && LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo))
>      {
>        if (reduction_type != FOLD_LEFT_REDUCTION
> +          && reduction_type != EXTRACT_LAST_REDUCTION
>            && !mask_by_cond_expr
>            && (cond_fn == IFN_LAST
>                || !direct_internal_fn_supported_p (cond_fn, vectype_in,

...after today's patch, it's instead necessary to remove:

  if (loop_vinfo && LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo)
      && reduction_type == EXTRACT_LAST_REDUCTION)
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "can't yet use a fully-masked loop for"
                         " EXTRACT_LAST_REDUCTION.\n");
      LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo) = false;
    }

from vectorizable_condition.  We no longer need any changes to
vectorizable_reduction itself.

> @@ -10180,18 +10181,29 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
>              vec != { 0, ... } (masked in the MASK_LOAD,
>              unmasked in the VEC_COND_EXPR).  */
>
> -          if (loop_mask)
> +          if (masks)
>             {
> -              if (COMPARISON_CLASS_P (vec_compare))
> +              unsigned vec_num = vec_oprnds0.length ();
> +              loop_mask = vect_get_loop_mask (gsi, masks, vec_num * ncopies,
> +                                              vectype, vec_num * j + i);

Ah... now that the two cases are merged (good!), just "if (masks)"
isn't right after all, sorry for the misleading comment.  I think
this should instead be:

  /* Force vec_compare to be an SSA_NAME rather than a comparison,
     in cases where that's necessary.  */
  if (masks || reduction_type == EXTRACT_LAST_REDUCTION)
    {

Not doing that would break unmasked EXTRACT_LAST_REDUCTIONs.

Then make the existing:

  tree tmp2 = make_ssa_name (vec_cmp_type);
  gassign *g = gimple_build_assign (tmp2, BIT_AND_EXPR,
                                    vec_compare, loop_mask);
  vect_finish_stmt_generation (stmt_info, g, gsi);
  vec_compare = tmp2;

conditional on "if (masks)" only, and defer the calculation of
loop_mask to this point too.

[ It would be good to spot-check that aarch64-sve.exp passes after
  making the changes to the stmt-generation part of
  vectorizable_condition, but before removing the:

    LOOP_VINFO_CAN_FULLY_MASK_P (loop_vinfo) = false;

  quoted above.  That would show that unmasked EXTRACT_LAST_REDUCTIONs
  still work after the changes.

  There are still some lingering cases in which we can test unmasked
  SVE loops directly, but they're becoming rarer and should eventually
  go away altogether.  So I don't think it's worth trying to construct
  an unmasked test for the testsuite. ]

> +
> +              if (!is_gimple_val (vec_compare))
> +                {
> +                  tree vec_compare_name = make_ssa_name (vec_cmp_type);
> +                  gassign *new_stmt = gimple_build_assign (vec_compare_name,
> +                                                           vec_compare);
> +                  vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> +                  vec_compare = vec_compare_name;
> +                }

The new code should use tab-based indentation, to match the
surrounding GCC sources.

Thanks,
Richard
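
P.S. For what it's worth, here's roughly how I'd expect the
stmt-generation part to look once the pieces above are put together.
This is only an untested sketch built from the hunks quoted above
(same variable names, nothing new), not something I've actually tried:

  /* Force vec_compare to be an SSA_NAME rather than a comparison,
     in cases where that's necessary.  */
  if (masks || reduction_type == EXTRACT_LAST_REDUCTION)
    {
      if (!is_gimple_val (vec_compare))
        {
          tree vec_compare_name = make_ssa_name (vec_cmp_type);
          gassign *new_stmt = gimple_build_assign (vec_compare_name,
                                                   vec_compare);
          vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
          vec_compare = vec_compare_name;
        }

      /* Only fully-masked loops need the loop mask and the AND;
         unmasked EXTRACT_LAST_REDUCTIONs just need the SSA_NAME
         forced above.  */
      if (masks)
        {
          unsigned vec_num = vec_oprnds0.length ();
          tree loop_mask = vect_get_loop_mask (gsi, masks,
                                               vec_num * ncopies, vectype,
                                               vec_num * j + i);
          tree tmp2 = make_ssa_name (vec_cmp_type);
          gassign *g = gimple_build_assign (tmp2, BIT_AND_EXPR,
                                            vec_compare, loop_mask);
          vect_finish_stmt_generation (stmt_info, g, gsi);
          vec_compare = tmp2;
        }
    }

The point of the split is that the BIT_AND_EXPR with loop_mask (and
the vect_get_loop_mask call itself) only happens for fully-masked
loops, while the SSA_NAME requirement covers both the masked and the
unmasked EXTRACT_LAST_REDUCTION cases.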