On September 26, 2017 5:20:25 PM GMT+02:00, Martin Jambor <mjam...@suse.cz> 
wrote:
>Hi,
>
>On Mon, Sep 25, 2017 at 04:22:06PM +0300, Alexander Monakov wrote:
>> 
>> Thanks!  If this is resolved, haifa-sched autoprefetch ranking will
>become the
>> last remaining (among discovered so far) inconsistent qsort
>comparator in GCC.
>> 
>
>So the following has passed bootstrap and testing on x86_64-linux
>(including Ada).  I have failed to create a C/C++ testcase but when I
>was trying to come up with one I realized there is code in
>analyze_access_subtree that replaces integer types with smaller
>precision with full-precision ones for all cases but bit-fields.
>Therefore I adjusted the disqualification to only trigger for
>bit-fields (that happen to be of equal size as a non-integer register
>type), which I think is very unlikely to ever happen.
>
>Richi, is it OK for trunk?

OK. 

Richard. 

>Thanks,
>
>Martin
>
>
>2017-09-26  Martin Jambor  <mjam...@suse.cz>
>
>       * tree-sra.c (compare_access_positions): Put integral types first,
>       stabilize sorting of integral types, remove conditions putting
>       non-full-precision integers last.
>       (sort_and_splice_var_accesses): Disable scalarization if a
>       non-integert would be represented by a non-full-precision integer.
>---
> gcc/tree-sra.c | 42 ++++++++++++++++++++++++++++++++----------
> 1 file changed, 32 insertions(+), 10 deletions(-)
>
>diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>index 163b7a2d03b..f5675edc7f1 100644
>--- a/gcc/tree-sra.c
>+++ b/gcc/tree-sra.c
>@@ -1542,19 +1542,20 @@ compare_access_positions (const void *a, const
>void *b)
>              && TREE_CODE (f2->type) != COMPLEX_TYPE
>              && TREE_CODE (f2->type) != VECTOR_TYPE)
>       return -1;
>-      /* Put the integral type with the bigger precision first.  */
>+      /* Put any integral type before any non-integral type.  When
>splicing, we
>+       make sure that those with insufficient precision and occupying the
>+       same space are not scalarized.  */
>       else if (INTEGRAL_TYPE_P (f1->type)
>+             && !INTEGRAL_TYPE_P (f2->type))
>+      return -1;
>+      else if (!INTEGRAL_TYPE_P (f1->type)
>              && INTEGRAL_TYPE_P (f2->type))
>-      return TYPE_PRECISION (f2->type) - TYPE_PRECISION (f1->type);
>-      /* Put any integral type with non-full precision last.  */
>-      else if (INTEGRAL_TYPE_P (f1->type)
>-             && (TREE_INT_CST_LOW (TYPE_SIZE (f1->type))
>-                 != TYPE_PRECISION (f1->type)))
>       return 1;
>-      else if (INTEGRAL_TYPE_P (f2->type)
>-             && (TREE_INT_CST_LOW (TYPE_SIZE (f2->type))
>-                 != TYPE_PRECISION (f2->type)))
>-      return -1;
>+      /* Put the integral type with the bigger precision first.  */
>+      else if (INTEGRAL_TYPE_P (f1->type)
>+             && INTEGRAL_TYPE_P (f2->type)
>+             && (TYPE_PRECISION (f2->type) != TYPE_PRECISION (f1->type)))
>+      return TYPE_PRECISION (f2->type) - TYPE_PRECISION (f1->type);
>       /* Stabilize the sort.  */
>       return TYPE_UID (f1->type) - TYPE_UID (f2->type);
>     }
>@@ -2055,6 +2056,11 @@ sort_and_splice_var_accesses (tree var)
>       bool grp_partial_lhs = access->grp_partial_lhs;
>       bool first_scalar = is_gimple_reg_type (access->type);
>       bool unscalarizable_region = access->grp_unscalarizable_region;
>+      bool bf_non_full_precision
>+      = (INTEGRAL_TYPE_P (access->type)
>+         && TYPE_PRECISION (access->type) != access->size
>+         && TREE_CODE (access->expr) == COMPONENT_REF
>+         && DECL_BIT_FIELD (TREE_OPERAND (access->expr, 1)));
> 
>       if (first || access->offset >= high)
>       {
>@@ -2102,6 +2108,22 @@ sort_and_splice_var_accesses (tree var)
>            this combination of size and offset, the comparison function
>            should have put the scalars first.  */
>         gcc_assert (first_scalar || !is_gimple_reg_type (ac2->type));
>+        /* It also prefers integral types to non-integral.  However, when
>the
>+           precision of the selected type does not span the entire area and
>+           should also be used for a non-integer (i.e. float), we must not
>+           let that happen.  Normally analyze_access_subtree expands the
>type
>+           to cover the entire area but for bit-fields it doesn't.  */
>+        if (bf_non_full_precision && !INTEGRAL_TYPE_P (ac2->type))
>+          {
>+            if (dump_file && (dump_flags & TDF_DETAILS))
>+              {
>+                fprintf (dump_file, "Cannot scalarize the following access "
>+                         "because insufficient precision integer type was "
>+                         "selected.\n  ");
>+                dump_access (dump_file, access, false);
>+              }
>+            unscalarizable_region = true;
>+          }
>         ac2->group_representative = access;
>         j++;
>       }

Reply via email to