On Tue, May 26, 2015 at 5:04 PM, Richard Biener
<[email protected]> wrote:
> On Sun, May 24, 2015 at 8:47 AM, Bin.Cheng <[email protected]> wrote:
>> On Fri, May 22, 2015 at 7:45 PM, Richard Biener
>> <[email protected]> wrote:
>>> On Wed, May 20, 2015 at 11:41 AM, Bin Cheng <[email protected]> wrote:
>>>> Hi,
>>>> As we know, GCC is too conservative when checking overflow behavior in SCEV
>>>> and loop related optimizers. Result is some variable can't be recognized
>>>> as
>>>> scalar evolution and thus optimizations are missed. To be specific,
>>>> optimizers like ivopts and vectorizer are affected.
>>>> This issue is more severe on 64 bit platforms, for example, PR62173 is
>>>> failed on aarch64; scev-3.c and scev-4.c were marked as XFAIL on lp64
>>>> platforms.
>>>>
>>>> As the first part to improve overflow checking in GCC, this patch does
>>>> below
>>>> improvements:
>>>> 1) Ideally, chrec_convert should be responsible to convert scev like
>>>> "(type){base, step}" to scev like "{(type)base, (type)step} when the result
>>>> scev doesn't overflow; chrec_convert_aggressive should do the conversion if
>>>> the result scev could overflow/wrap. Unfortunately, current implementation
>>>> may use chrec_convert_aggressive to return a scev that won't overflow.
>>>> This
>>>> is because of a) the static parameter "fold_conversions" for
>>>> instantiate_scev_convert can only tracks whether chrec_convert_aggressive
>>>> may be called, rather than if it does some overflow conversion or not; b)
>>>> the implementation of instantiate_scev_convert sometimes shortcuts the call
>>>> to chrec_convert and misses conversion opportunities. This patch improves
>>>> this.
>>>> 2) iv->no_overflow computed in simple_iv is too conservative. With 1)
>>>> fixed, iv->no_overflow should reflects whether chrec_convert_aggressive
>>>> does
>>>> return an overflow scev. This patch improves this.
>>>> 3) chrec_convert should be able to prove the resulting scev won't
>>>> overflow
>>>> with loop niter information. This patch doesn't finish this, but it
>>>> factored a new interface out of scev_probably_wraps_p for future
>>>> improvement. And that will be the part II patch.
>>>>
>>>> With the improvements in SCEV, this patch also improves optimizer(IVOPT)
>>>> that uses scev information like below:
>>>> For array reference in the form of arr[IV], GCC tries to derive new
>>>> address iv {arr+iv.base, iv.step*elem_size} from IV. If IV overflow wrto a
>>>> type that is narrower than address space, this derivation is not true
>>>> because &arr[IV] isn't a scev. Root cause why scev-*.c are failed now is
>>>> the overflow information of IV is too conservative. IVOPT has to be
>>>> conservative to reject &arr[IV] as a scev. With more accurate overflow
>>>> information, IVOPT can be improved too. So this patch fixes the mentioned
>>>> long standing issues.
>>>>
>>>> Bootstrap and test on x86_64, x86 and aarch64.
>>>> BTW, test gcc.target/i386/pr49781-1.c failed on x86_64, but I can confirmed
>>>> it's not this patch's fault.
>>>>
>>>> So what's your opinion on this?.
>>>
>>> I maybe mixing things up but does
>>>
>>> +chrec_convert_aggressive (tree type, tree chrec, bool *fold_conversions)
>>> {
>>> ...
>>> + if (evolution_function_is_affine_p (chrec))
>>> + {
>>> + tree base, step;
>>> + struct loop *loop;
>>> +
>>> + loop = get_chrec_loop (chrec);
>>> + base = CHREC_LEFT (chrec);
>>> + step = CHREC_RIGHT (chrec);
>>> + if (convert_affine_scev (loop, type, &base, &step, NULL, true))
>>> + return build_polynomial_chrec (loop->num, base, step);
>>>
>>> ^^^ not forget to set *fold_conversions to true? Or we need to use
>>> convert_affine_scev (..., false)?
>>
>> Nice catch. It's supposed to be called only if source scev has no
>> overflow behavior introduced by previous call to
>> chrec_convert_aggressive. In other words, it should be guarded by
>> "!*fold_conversions" like below:
>>
>> +
>> + if (!*fold_conversions && evolution_function_is_affine_p (chrec))
>> + {
>> + tree base, step;
>> + struct loop *loop;
>> +
>> + loop = get_chrec_loop (chrec);
>> + base = CHREC_LEFT (chrec);
>> + step = CHREC_RIGHT (chrec);
>> + if (convert_affine_scev (loop, type, &base, &step, NULL, true))
>> + return build_polynomial_chrec (loop->num, base, step);
>> + }
>>
>> The scenario is rare that didn't exposed in either bootstrap or reg-test.
>>
>> Here is the updated patch without any other difference. Bootstrap and
>> test on x86_64 & AArch64.
>
> Ok.
>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>>>
>>> + }
>>>
>>> (bah, and the diff somehow messes up -p context :/ which is why I like
>>> context diffs more)
>>>
>>> Other from the above the patch looks good to me.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> 2015-05-20 Bin Cheng <[email protected]>
>>>>
>>>> PR tree-optimization/62173
>>>> * tree-ssa-loop-ivopts.c (struct iv): New field. Reorder fields.
>>>> (alloc_iv, set_iv): New parameter.
>>>> (determine_biv_step): Delete.
>>>> (find_bivs): Inline original determine_biv_step. Pass new
>>>> argument to set_iv.
>>>> (idx_find_step): Use no_overflow information for conversion.
>>>> * tree-scalar-evolution.c (analyze_scalar_evolution_in_loop): Let
>>>> resolve_mixers handle folded_casts.
>>>> (instantiate_scev_name): Change bool parameter to bool pointer.
>>>> (instantiate_scev_poly, instantiate_scev_binary): Ditto.
>>>> (instantiate_array_ref, instantiate_scev_not): Ditto.
>>>> (instantiate_scev_3, instantiate_scev_2): Ditto.
>>>> (instantiate_scev_1, instantiate_scev_r): Ditto.
>>>> (instantiate_scev_convert, ): Change parameter. Pass argument
>>>> to chrec_convert_aggressive.
>>>> (instantiate_scev): Change argument.
>>>> (resolve_mixers): New parameter and set it.
>>>> (scev_const_prop): New argument.
>>>> * tree-scalar-evolution.h (resolve_mixers): New parameter.
>>>> * tree-chrec.c (convert_affine_scev): Call chrec_convert instead
>>>> of chrec_conert_1.
>>>> (chrec_convert): New parameter. Move definition below.
>>>> (chrec_convert_aggressive): New parameter and set it. Call
>>>> convert_affine_scev.
>>>> * tree-chrec.h (chrec_convert): New parameter.
>>>> (chrec_convert_aggressive): Ditto.
>>>> * tree-ssa-loop-niter.c (loop_exits_before_overflow): New function.
>>>> (scev_probably_wraps_p): Factor loop niter related code into
>>>> loop_exits_before_overflow.
>>>>
>>>> gcc/testsuite/ChangeLog
>>>> 2015-05-20 Bin Cheng <[email protected]>
>>>>
>>>> PR tree-optimization/62173
>>>> * gcc.dg/tree-ssa/scev-3.c: Remove xfail.
>>>> * gcc.dg/tree-ssa/scev-4.c: Ditto.
>>>> * gcc.dg/tree-ssa/scev-8.c: New.
Attached patch applied as revision 224009. I removed changes in
tree-ssa-loop-niter.c (also the new test) in this patch because it's
just code refactoring and will be covered by the second patch.
Thanks,
bin
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/scev-3.c (revision 224008)
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-3.c (working copy)
@@ -15,4 +15,4 @@ f(int k)
}
}
-/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" { xfail { lp64 ||
llp64 } } } } */
+/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/scev-4.c (revision 224008)
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-4.c (working copy)
@@ -20,4 +20,4 @@ f(int k)
}
}
-/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" { xfail { lp64 ||
llp64 } } } } */
+/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c (revision 224008)
+++ gcc/tree-ssa-loop-ivopts.c (working copy)
@@ -171,9 +171,10 @@ struct iv
tree base_object; /* A memory object to that the induction variable
points. */
tree step; /* Step of the iv (constant only). */
tree ssa_name; /* The ssa name with the value. */
+ unsigned use_id; /* The identifier in the use if it is the case. */
bool biv_p; /* Is it a biv? */
bool have_use_for; /* Do we already have a use for it? */
- unsigned use_id; /* The identifier in the use if it is the case. */
+ bool no_overflow; /* True if the iv doesn't overflow. */
};
/* Per-ssa version information (induction variable descriptions, etc.). */
@@ -1005,10 +1006,10 @@ contain_complex_addr_expr (tree expr)
}
/* Allocates an induction variable with given initial value BASE and step STEP
- for loop LOOP. */
+ for loop LOOP. NO_OVERFLOW implies the iv doesn't overflow. */
static struct iv *
-alloc_iv (tree base, tree step)
+alloc_iv (tree base, tree step, bool no_overflow = false)
{
tree expr = base;
struct iv *iv = XCNEW (struct iv);
@@ -1035,21 +1036,24 @@ static struct iv *
iv->have_use_for = false;
iv->use_id = 0;
iv->ssa_name = NULL_TREE;
+ iv->no_overflow = no_overflow;
return iv;
}
-/* Sets STEP and BASE for induction variable IV. */
+/* Sets STEP and BASE for induction variable IV. NO_OVERFLOW implies the IV
+ doesn't overflow. */
static void
-set_iv (struct ivopts_data *data, tree iv, tree base, tree step)
+set_iv (struct ivopts_data *data, tree iv, tree base, tree step,
+ bool no_overflow)
{
struct version_info *info = name_info (data, iv);
gcc_assert (!info->iv);
bitmap_set_bit (data->relevant, SSA_NAME_VERSION (iv));
- info->iv = alloc_iv (base, step);
+ info->iv = alloc_iv (base, step, no_overflow);
info->iv->ssa_name = iv;
}
@@ -1071,31 +1075,12 @@ get_iv (struct ivopts_data *data, tree var)
if (!bb
|| !flow_bb_inside_loop_p (data->current_loop, bb))
- set_iv (data, var, var, build_int_cst (type, 0));
+ set_iv (data, var, var, build_int_cst (type, 0), true);
}
return name_info (data, var)->iv;
}
-/* Determines the step of a biv defined in PHI. Returns NULL if PHI does
- not define a simple affine biv with nonzero step. */
-
-static tree
-determine_biv_step (gphi *phi)
-{
- struct loop *loop = gimple_bb (phi)->loop_father;
- tree name = PHI_RESULT (phi);
- affine_iv iv;
-
- if (virtual_operand_p (name))
- return NULL_TREE;
-
- if (!simple_iv (loop, loop, name, &iv, true))
- return NULL_TREE;
-
- return integer_zerop (iv.step) ? NULL_TREE : iv.step;
-}
-
/* Return the first non-invariant ssa var found in EXPR. */
static tree
@@ -1129,6 +1114,7 @@ static bool
find_bivs (struct ivopts_data *data)
{
gphi *phi;
+ affine_iv iv;
tree step, type, base, stop;
bool found = false;
struct loop *loop = data->current_loop;
@@ -1141,10 +1127,16 @@ find_bivs (struct ivopts_data *data)
if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (PHI_RESULT (phi)))
continue;
- step = determine_biv_step (phi);
- if (!step)
+ if (virtual_operand_p (PHI_RESULT (phi)))
continue;
+ if (!simple_iv (loop, loop, PHI_RESULT (phi), &iv, true))
+ continue;
+
+ if (integer_zerop (iv.step))
+ continue;
+
+ step = iv.step;
base = PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
/* Stop expanding iv base at the first ssa var referred by iv step.
Ideally we should stop at any ssa var, because that's expensive
@@ -1167,7 +1159,7 @@ find_bivs (struct ivopts_data *data)
step = fold_convert (type, step);
}
- set_iv (data, PHI_RESULT (phi), base, step);
+ set_iv (data, PHI_RESULT (phi), base, step, iv.no_overflow);
found = true;
}
@@ -1270,7 +1262,7 @@ find_givs_in_stmt (struct ivopts_data *data, gimpl
if (!find_givs_in_stmt_scev (data, stmt, &iv))
return;
- set_iv (data, gimple_assign_lhs (stmt), iv.base, iv.step);
+ set_iv (data, gimple_assign_lhs (stmt), iv.base, iv.step, iv.no_overflow);
}
/* Finds general ivs in basic block BB. */
@@ -1683,6 +1675,7 @@ idx_find_step (tree base, tree *idx, void *data)
{
struct ifs_ivopts_data *dta = (struct ifs_ivopts_data *) data;
struct iv *iv;
+ bool use_overflow_semantics = false;
tree step, iv_base, iv_step, lbound, off;
struct loop *loop = dta->ivopts_data->current_loop;
@@ -1742,9 +1735,12 @@ idx_find_step (tree base, tree *idx, void *data)
iv_base = iv->base;
iv_step = iv->step;
+ if (iv->no_overflow && nowrap_type_p (TREE_TYPE (iv_step)))
+ use_overflow_semantics = true;
+
if (!convert_affine_scev (dta->ivopts_data->current_loop,
sizetype, &iv_base, &iv_step, dta->stmt,
- false))
+ use_overflow_semantics))
{
/* The index might wrap. */
return false;
Index: gcc/tree-scalar-evolution.c
===================================================================
--- gcc/tree-scalar-evolution.c (revision 224008)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -2145,7 +2145,7 @@ analyze_scalar_evolution_in_loop (struct loop *wrt
/* We cannot just do
tmp = analyze_scalar_evolution (use_loop, version);
- ev = resolve_mixers (wrto_loop, tmp);
+ ev = resolve_mixers (wrto_loop, tmp, folded_casts);
as resolve_mixers would query the scalar evolution with respect to
wrto_loop. For example, in the situation described in the function
@@ -2154,9 +2154,9 @@ analyze_scalar_evolution_in_loop (struct loop *wrt
analyze_scalar_evolution (use_loop, version) = k2
- and resolve_mixers (loop1, k2) finds that the value of k2 in loop 1
- is 100, which is a wrong result, since we are interested in the
- value in loop 3.
+ and resolve_mixers (loop1, k2, folded_casts) finds that the value of
+ k2 in loop 1 is 100, which is a wrong result, since we are interested
+ in the value in loop 3.
Instead, we need to proceed from use_loop to wrto_loop loop by loop,
each time checking that there is no evolution in the inner loop. */
@@ -2166,11 +2166,8 @@ analyze_scalar_evolution_in_loop (struct loop *wrt
while (1)
{
tmp = analyze_scalar_evolution (use_loop, ev);
- ev = resolve_mixers (use_loop, tmp);
+ ev = resolve_mixers (use_loop, tmp, folded_casts);
- if (folded_casts && tmp != ev)
- *folded_casts = true;
-
if (use_loop == wrto_loop)
return ev;
@@ -2292,7 +2289,7 @@ loop_closed_phi_def (tree var)
}
static tree instantiate_scev_r (basic_block, struct loop *, struct loop *,
- tree, bool, int);
+ tree, bool *, int);
/* Analyze all the parameters of the chrec, between INSTANTIATE_BELOW
and EVOLUTION_LOOP, that were left under a symbolic form.
@@ -2301,9 +2298,10 @@ static tree instantiate_scev_r (basic_block, struc
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2312,7 +2310,7 @@ static tree
instantiate_scev_name (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec,
- bool fold_conversions,
+ bool *fold_conversions,
int size_expr)
{
tree res;
@@ -2406,9 +2404,10 @@ instantiate_scev_name (basic_block instantiate_bel
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2416,7 +2415,7 @@ instantiate_scev_name (basic_block instantiate_bel
static tree
instantiate_scev_poly (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *,
- tree chrec, bool fold_conversions, int size_expr)
+ tree chrec, bool *fold_conversions, int size_expr)
{
tree op1;
tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
@@ -2450,9 +2449,10 @@ instantiate_scev_poly (basic_block instantiate_bel
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2462,7 +2462,7 @@ instantiate_scev_binary (basic_block instantiate_b
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec, enum tree_code code,
tree type, tree c0, tree c1,
- bool fold_conversions, int size_expr)
+ bool *fold_conversions, int size_expr)
{
tree op1;
tree op0 = instantiate_scev_r (instantiate_below, evolution_loop, inner_loop,
@@ -2508,9 +2508,10 @@ instantiate_scev_binary (basic_block instantiate_b
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2518,7 +2519,7 @@ instantiate_scev_binary (basic_block instantiate_b
static tree
instantiate_array_ref (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *inner_loop,
- tree chrec, bool fold_conversions, int size_expr)
+ tree chrec, bool *fold_conversions, int size_expr)
{
tree res;
tree index = TREE_OPERAND (chrec, 1);
@@ -2545,9 +2546,10 @@ instantiate_array_ref (basic_block instantiate_bel
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2556,7 +2558,7 @@ static tree
instantiate_scev_convert (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec, tree type, tree op,
- bool fold_conversions, int size_expr)
+ bool *fold_conversions, int size_expr)
{
tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
inner_loop, op,
@@ -2567,19 +2569,21 @@ instantiate_scev_convert (basic_block instantiate_
if (fold_conversions)
{
- tree tmp = chrec_convert_aggressive (type, op0);
+ tree tmp = chrec_convert_aggressive (type, op0, fold_conversions);
if (tmp)
return tmp;
- }
- if (chrec && op0 == op)
- return chrec;
+ /* If we used chrec_convert_aggressive, we can no longer assume that
+ signed chrecs do not overflow, as chrec_convert does, so avoid
+ calling it in that case. */
+ if (*fold_conversions)
+ {
+ if (chrec && op0 == op)
+ return chrec;
- /* If we used chrec_convert_aggressive, we can no longer assume that
- signed chrecs do not overflow, as chrec_convert does, so avoid
- calling it in that case. */
- if (fold_conversions)
- return fold_convert (type, op0);
+ return fold_convert (type, op0);
+ }
+ }
return chrec_convert (type, op0, NULL);
}
@@ -2593,9 +2597,10 @@ instantiate_scev_convert (basic_block instantiate_
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2605,7 +2610,7 @@ instantiate_scev_not (basic_block instantiate_belo
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec,
enum tree_code code, tree type, tree op,
- bool fold_conversions, int size_expr)
+ bool *fold_conversions, int size_expr)
{
tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
inner_loop, op,
@@ -2643,9 +2648,10 @@ instantiate_scev_not (basic_block instantiate_belo
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2654,7 +2660,7 @@ static tree
instantiate_scev_3 (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec,
- bool fold_conversions, int size_expr)
+ bool *fold_conversions, int size_expr)
{
tree op1, op2;
tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
@@ -2691,9 +2697,10 @@ instantiate_scev_3 (basic_block instantiate_below,
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2702,7 +2709,7 @@ static tree
instantiate_scev_2 (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec,
- bool fold_conversions, int size_expr)
+ bool *fold_conversions, int size_expr)
{
tree op1;
tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
@@ -2731,9 +2738,10 @@ instantiate_scev_2 (basic_block instantiate_below,
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2742,7 +2750,7 @@ static tree
instantiate_scev_1 (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec,
- bool fold_conversions, int size_expr)
+ bool *fold_conversions, int size_expr)
{
tree op0 = instantiate_scev_r (instantiate_below, evolution_loop,
inner_loop, TREE_OPERAND (chrec, 0),
@@ -2764,9 +2772,10 @@ instantiate_scev_1 (basic_block instantiate_below,
CACHE is the cache of already instantiated values.
- FOLD_CONVERSIONS should be set to true when the conversions that
- may wrap in signed/pointer type are folded, as long as the value of
- the chrec is preserved.
+ Variable pointed by FOLD_CONVERSIONS is set to TRUE when the
+ conversions that may wrap in signed/pointer type are folded, as long
+ as the value of the chrec is preserved. If FOLD_CONVERSIONS is NULL
+ then we don't do such fold.
SIZE_EXPR is used for computing the size of the expression to be
instantiated, and to stop if it exceeds some limit. */
@@ -2775,7 +2784,7 @@ static tree
instantiate_scev_r (basic_block instantiate_below,
struct loop *evolution_loop, struct loop *inner_loop,
tree chrec,
- bool fold_conversions, int size_expr)
+ bool *fold_conversions, int size_expr)
{
/* Give up if the expression is larger than the MAX that we allow. */
if (size_expr++ > PARAM_VALUE (PARAM_SCEV_MAX_EXPR_SIZE))
@@ -2900,7 +2909,7 @@ instantiate_scev (basic_block instantiate_below, s
}
res = instantiate_scev_r (instantiate_below, evolution_loop,
- NULL, chrec, false, 0);
+ NULL, chrec, NULL, 0);
if (destr)
{
@@ -2924,9 +2933,10 @@ instantiate_scev (basic_block instantiate_below, s
of an expression. */
tree
-resolve_mixers (struct loop *loop, tree chrec)
+resolve_mixers (struct loop *loop, tree chrec, bool *folded_casts)
{
bool destr = false;
+ bool fold_conversions = false;
if (!global_cache)
{
global_cache = new instantiate_cache_type;
@@ -2934,8 +2944,11 @@ tree
}
tree ret = instantiate_scev_r (block_before_loop (loop), loop, NULL,
- chrec, true, 0);
+ chrec, &fold_conversions, 0);
+ if (folded_casts && !*folded_casts)
+ *folded_casts = fold_conversions;
+
if (destr)
{
delete global_cache;
@@ -3387,7 +3400,8 @@ scev_const_prop (void)
&& !INTEGRAL_TYPE_P (type))
continue;
- ev = resolve_mixers (loop, analyze_scalar_evolution (loop, name));
+ ev = resolve_mixers (loop, analyze_scalar_evolution (loop, name),
+ NULL);
if (!is_gimple_min_invariant (ev)
|| !may_propagate_copy (name, ev))
continue;
Index: gcc/tree-scalar-evolution.h
===================================================================
--- gcc/tree-scalar-evolution.h (revision 224008)
+++ gcc/tree-scalar-evolution.h (working copy)
@@ -31,7 +31,7 @@ extern void scev_reset_htab (void);
extern void scev_finalize (void);
extern tree analyze_scalar_evolution (struct loop *, tree);
extern tree instantiate_scev (basic_block, struct loop *, tree);
-extern tree resolve_mixers (struct loop *, tree);
+extern tree resolve_mixers (struct loop *, tree, bool *);
extern void gather_stats_on_scev_database (void);
extern unsigned int scev_const_prop (void);
extern bool expression_expensive_p (tree);
Index: gcc/tree-chrec.c
===================================================================
--- gcc/tree-chrec.c (revision 224008)
+++ gcc/tree-chrec.c (working copy)
@@ -1178,8 +1178,6 @@ nb_vars_in_chrec (tree chrec)
}
}
-static tree chrec_convert_1 (tree, tree, gimple, bool);
-
/* Converts BASE and STEP of affine scev to TYPE. LOOP is the loop whose iv
the scev corresponds to. AT_STMT is the statement at that the scev is
evaluated. USE_OVERFLOW_SEMANTICS is true if this function should assume
that
@@ -1254,8 +1252,7 @@ convert_affine_scev (struct loop *loop, tree type,
use_overflow_semantics))
return false;
- new_base = chrec_convert_1 (type, *base, at_stmt,
- use_overflow_semantics);
+ new_base = chrec_convert (type, *base, at_stmt, use_overflow_semantics);
/* The step must be sign extended, regardless of the signedness
of CT and TYPE. This only needs to be handled specially when
CT is unsigned -- to avoid e.g. unsigned char [100, +, 255]
@@ -1266,10 +1263,11 @@ convert_affine_scev (struct loop *loop, tree type,
if (TYPE_PRECISION (step_type) > TYPE_PRECISION (ct) && TYPE_UNSIGNED (ct))
{
tree signed_ct = build_nonstandard_integer_type (TYPE_PRECISION (ct), 0);
- new_step = chrec_convert_1 (signed_ct, new_step, at_stmt,
- use_overflow_semantics);
+ new_step = chrec_convert (signed_ct, new_step, at_stmt,
+ use_overflow_semantics);
}
- new_step = chrec_convert_1 (step_type, new_step, at_stmt,
use_overflow_semantics);
+ new_step = chrec_convert (step_type, new_step, at_stmt,
+ use_overflow_semantics);
if (automatically_generated_chrec_p (new_base)
|| automatically_generated_chrec_p (new_step))
@@ -1306,36 +1304,6 @@ chrec_convert_rhs (tree type, tree chrec, gimple a
determining a more accurate estimation of the number of iterations.
By default AT_STMT could be safely set to NULL_TREE.
- The following rule is always true: TREE_TYPE (chrec) ==
- TREE_TYPE (CHREC_LEFT (chrec)) == TREE_TYPE (CHREC_RIGHT (chrec)).
- An example of what could happen when adding two chrecs and the type
- of the CHREC_RIGHT is different than CHREC_LEFT is:
-
- {(uint) 0, +, (uchar) 10} +
- {(uint) 0, +, (uchar) 250}
-
- that would produce a wrong result if CHREC_RIGHT is not (uint):
-
- {(uint) 0, +, (uchar) 4}
-
- instead of
-
- {(uint) 0, +, (uint) 260}
-*/
-
-tree
-chrec_convert (tree type, tree chrec, gimple at_stmt)
-{
- return chrec_convert_1 (type, chrec, at_stmt, true);
-}
-
-/* Convert CHREC to TYPE. When the analyzer knows the context in
- which the CHREC is built, it sets AT_STMT to the statement that
- contains the definition of the analyzed variable, otherwise the
- conversion is less accurate: the information is used for
- determining a more accurate estimation of the number of iterations.
- By default AT_STMT could be safely set to NULL_TREE.
-
USE_OVERFLOW_SEMANTICS is true if this function should assume that
the rules for overflow of the given language apply (e.g., that signed
arithmetics in C does not overflow) -- i.e., to use them to avoid
unnecessary
@@ -1420,15 +1388,53 @@ keep_cast:
return res;
}
+/* Convert CHREC to TYPE. When the analyzer knows the context in
+ which the CHREC is built, it sets AT_STMT to the statement that
+ contains the definition of the analyzed variable, otherwise the
+ conversion is less accurate: the information is used for
+ determining a more accurate estimation of the number of iterations.
+ By default AT_STMT could be safely set to NULL_TREE.
+
+ The following rule is always true: TREE_TYPE (chrec) ==
+ TREE_TYPE (CHREC_LEFT (chrec)) == TREE_TYPE (CHREC_RIGHT (chrec)).
+ An example of what could happen when adding two chrecs and the type
+ of the CHREC_RIGHT is different than CHREC_LEFT is:
+
+ {(uint) 0, +, (uchar) 10} +
+ {(uint) 0, +, (uchar) 250}
+
+ that would produce a wrong result if CHREC_RIGHT is not (uint):
+
+ {(uint) 0, +, (uchar) 4}
+
+ instead of
+
+ {(uint) 0, +, (uint) 260}
+
+ USE_OVERFLOW_SEMANTICS is true if this function should assume that
+ the rules for overflow of the given language apply (e.g., that signed
+ arithmetics in C does not overflow) -- i.e., to use them to avoid
unnecessary
+ tests, but also to enforce that the result follows them. */
+
+tree
+chrec_convert (tree type, tree chrec, gimple at_stmt,
+ bool use_overflow_semantics)
+{
+ return chrec_convert_1 (type, chrec, at_stmt, use_overflow_semantics);
+}
+
/* Convert CHREC to TYPE, without regard to signed overflows. Returns the new
chrec if something else than what chrec_convert would do happens, NULL_TREE
- otherwise. */
+ otherwise. This function set TRUE to variable pointed by FOLD_CONVERSIONS
+ if the result chrec may overflow. */
tree
-chrec_convert_aggressive (tree type, tree chrec)
+chrec_convert_aggressive (tree type, tree chrec, bool *fold_conversions)
{
tree inner_type, left, right, lc, rc, rtype;
+ gcc_assert (fold_conversions != NULL);
+
if (automatically_generated_chrec_p (chrec)
|| TREE_CODE (chrec) != POLYNOMIAL_CHREC)
return NULL_TREE;
@@ -1437,17 +1443,33 @@ tree
if (TYPE_PRECISION (type) > TYPE_PRECISION (inner_type))
return NULL_TREE;
+ if (useless_type_conversion_p (type, inner_type))
+ return NULL_TREE;
+
+ if (!*fold_conversions && evolution_function_is_affine_p (chrec))
+ {
+ tree base, step;
+ struct loop *loop;
+
+ loop = get_chrec_loop (chrec);
+ base = CHREC_LEFT (chrec);
+ step = CHREC_RIGHT (chrec);
+ if (convert_affine_scev (loop, type, &base, &step, NULL, true))
+ return build_polynomial_chrec (loop->num, base, step);
+ }
rtype = POINTER_TYPE_P (type) ? sizetype : type;
left = CHREC_LEFT (chrec);
right = CHREC_RIGHT (chrec);
- lc = chrec_convert_aggressive (type, left);
+ lc = chrec_convert_aggressive (type, left, fold_conversions);
if (!lc)
lc = chrec_convert (type, left, NULL);
- rc = chrec_convert_aggressive (rtype, right);
+ rc = chrec_convert_aggressive (rtype, right, fold_conversions);
if (!rc)
rc = chrec_convert (rtype, right, NULL);
+ *fold_conversions = true;
+
return build_polynomial_chrec (CHREC_VARIABLE (chrec), lc, rc);
}
Index: gcc/tree-chrec.h
===================================================================
--- gcc/tree-chrec.h (revision 224008)
+++ gcc/tree-chrec.h (working copy)
@@ -59,9 +59,9 @@ enum ev_direction scev_direction (const_tree);
extern tree chrec_fold_plus (tree, tree, tree);
extern tree chrec_fold_minus (tree, tree, tree);
extern tree chrec_fold_multiply (tree, tree, tree);
-extern tree chrec_convert (tree, tree, gimple);
+extern tree chrec_convert (tree, tree, gimple, bool = true);
extern tree chrec_convert_rhs (tree, tree, gimple);
-extern tree chrec_convert_aggressive (tree, tree);
+extern tree chrec_convert_aggressive (tree, tree, bool *);
/* Operations. */
extern tree chrec_apply (unsigned, tree, tree);
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog (revision 224008)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,34 @@
+2015-06-02 Bin Cheng <[email protected]>
+
+ PR tree-optimization/52563
+ PR tree-optimization/62173
+ * tree-ssa-loop-ivopts.c (struct iv): New field. Reorder fields.
+ (alloc_iv, set_iv): New parameter.
+ (determine_biv_step): Delete.
+ (find_bivs): Inline original determine_biv_step. Pass new
+ argument to set_iv.
+ (idx_find_step): Use no_overflow information for conversion.
+ * tree-scalar-evolution.c (analyze_scalar_evolution_in_loop): Let
+ resolve_mixers handle folded_casts.
+ (instantiate_scev_name): Change bool parameter to bool pointer.
+ (instantiate_scev_poly, instantiate_scev_binary): Ditto.
+ (instantiate_array_ref, instantiate_scev_not): Ditto.
+ (instantiate_scev_3, instantiate_scev_2): Ditto.
+ (instantiate_scev_1, instantiate_scev_r): Ditto.
+ (instantiate_scev_convert, ): Change parameter. Pass argument
+ to chrec_convert_aggressive.
+ (instantiate_scev): Change argument.
+ (resolve_mixers): New parameter and set it.
+ (scev_const_prop): New argument.
+ * tree-scalar-evolution.h (resolve_mixers): New parameter.
+ * tree-chrec.c (convert_affine_scev): Call chrec_convert instead
+ of chrec_conert_1.
+ (chrec_convert): New parameter. Move definition below.
+ (chrec_convert_aggressive): New parameter and set it. Call
+ convert_affine_scev.
+ * tree-chrec.h (chrec_convert): New parameter.
+ (chrec_convert_aggressive): Ditto.
+
2015-06-01 Eric Botcazou <[email protected]>
* gimplify.c (gimplify_modify_expr_rhs): Use simple test on the size.
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog (revision 224008)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,10 @@
+2015-06-02 Bin Cheng <[email protected]>
+
+ PR tree-optimization/52563
+ PR tree-optimization/62173
+ * gcc.dg/tree-ssa/scev-3.c: Remove xfail.
+ * gcc.dg/tree-ssa/scev-4.c: Ditto.
+
2015-06-01 Eric Botcazou <[email protected]>
* gnat.dg/specs/varsize_return2.ads: New test.