Hi,

This patch is a fix following Richi's comments here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542232.html

I noticed that the current half-vector support for the
no-peeling-for-gaps case handles some cases that never actually
check for half-size vector support.  On further investigation,
those cases are safe without peeling for gaps because of their
known ideal alignment: a full-vector load stays within one aligned
block and cannot cross into an inaccessible page.  It means they
don't require half-vector handling, and we should avoid using half
vectors for them.

The fix adds an alignment check to the conditions for half-vector
support, avoiding redundant half-vector code.
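
To make the reasoning concrete, here is a small standalone sketch
(not part of the patch; all numbers are made up for illustration).
Assume 4-byte elements, group_size 4, gap 2, and a data reference
known to be 16-byte aligned, so vect_align / scalar_dr_size = 4
elements.  Since the gap (2) is below that, a full-vector load stays
within one aligned 16-byte block and cannot fault, so no half vector
is needed:

#include <stdio.h>

int
main (void)
{
  /* Hypothetical values for illustration only.  */
  unsigned int gap = 2;             /* trailing elements not accessed */
  unsigned int vect_align = 16;     /* known alignment in bytes */
  unsigned int scalar_dr_size = 4;  /* element size in bytes */

  /* Mirrors the new condition: fall back to a half vector only when
     the gap reaches the known alignment measured in elements;
     otherwise the full-vector over-read stays inside one aligned
     block and is safe.  */
  if (gap >= vect_align / scalar_dr_size)
    printf ("use half vector\n");
  else
    printf ("full vector load is safe\n");

  return 0;
}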

Bootstrapped/regtested on powerpc64le-linux-gnu P8, while
aarch64-linux-gnu testing is ongoing.

Is it OK for trunk if all testing passes?

BR,
Kewen
----------------

gcc/ChangeLog

2020-MM-DD  Kewen Lin  <li...@gcc.gnu.org>

        * tree-vect-stmts.c (vectorizable_load): Check alignment to avoid
        redundant half-vector handling when not peeling for gaps.
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 7f3a9fb5fb3..bfd2fabaa81 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9582,6 +9582,12 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
                      {
                        tree ltype = vectype;
                        tree new_vtype = NULL_TREE;
+                       unsigned HOST_WIDE_INT gap
+                         = DR_GROUP_GAP (first_stmt_info);
+                       unsigned int vect_align
+                         = vect_known_alignment_in_bytes (first_dr_info);
+                       unsigned int scalar_dr_size
+                         = vect_get_scalar_dr_size (first_dr_info);
                        /* If there's no peeling for gaps but we have a gap
                           with slp loads then load the lower half of the
                           vector only.  See get_group_load_store_type for
@@ -9589,11 +9595,10 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
                        if (slp
                            && loop_vinfo
                            && !LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
-                           && DR_GROUP_GAP (first_stmt_info) != 0
-                           && known_eq (nunits,
-                                        (group_size
-                                         - DR_GROUP_GAP (first_stmt_info)) * 2)
-                           && known_eq (nunits, group_size))
+                           && gap != 0
+                           && known_eq (nunits, (group_size - gap) * 2)
+                           && known_eq (nunits, group_size)
+                           && gap >= (vect_align / scalar_dr_size))
                          {
                            tree half_vtype;
                            new_vtype
@@ -9608,10 +9613,9 @@ vectorizable_load (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
                        if (ltype != vectype
                            && memory_access_type == VMAT_CONTIGUOUS_REVERSE)
                          {
-                           unsigned HOST_WIDE_INT gap
-                             = DR_GROUP_GAP (first_stmt_info);
-                           gap *= tree_to_uhwi (TYPE_SIZE_UNIT (elem_type));
-                           tree gapcst = build_int_cst (ref_type, gap);
+                           unsigned HOST_WIDE_INT gap_offset
+                             = gap * tree_to_uhwi (TYPE_SIZE_UNIT (elem_type));
+                           tree gapcst = build_int_cst (ref_type, gap_offset);
                            offset = size_binop (PLUS_EXPR, offset, gapcst);
                          }
                        data_ref
