Time to hopefully earn some goodwill from the team; this patch fixes a P1 wrong-code-on-valid regression in ivopts. Many thanks to Andrew Pinski for help with the analysis.
Consider the code fragment below: int i; for (j=0; j<10; j++) i++; This results in a loop containing two induction variables, i and j, where j is initialized, but i isn't (typically indicated in tree dumps by qualified ssa names like i(D). In PR 100810, the loop optimizers end up selecting i as the "best" candidate (perhaps because having no initialization it's cheaper) which leads to problems in later passes when (the equivalent of) j is considered not to have the value 10 after the loop, as its definition is now computed from an undefined/uninitialized value. The fix below is to add a field to track whether any "iv_group" contains a use of an iv based on a undefined value, and then prohibit IVs that are based on undefined values from being candidates for groups that don't use undefined values. This may seem lenient, but it allows an IV with an undefined base to be a candidate for itself, and is a sufficient condition to avoid the above bug/regression. A stricter condition might be to only allow "undefined_value iv"s as candidates for iv_groups where *all* uses are to "undefined_value ivs"? My concern was that this might lead to cases/loops that no longer have suitable candidates (i.e. a possible performance regression). Hopefully, the tree-loop optimization experts agree with my analysis/fix. This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-08-31 Roger Sayle <ro...@nextmovesoftware.com> Andrew Pinski <apin...@marvell.com> gcc/ChangeLog PR middle-end/100810 * tree-ssa-loop-ivopts.c (struct iv_group): Add a new uses_undefined_value_p field, indicating this group has uses of iv's whose base is ssa_undefined_value_p. (record_use): Update uses_undefined_value_p as required. (record_group): Initialize uses_undefined_value_p to false. (determine_group_iv_cost_generic): Consider a candidate with a ssa_undefined_value_p base to have infinite_cost for a group where uses_undefined_value_p is false. gcc/testsuite/ChangeLog PR middle-end/100810 * gcc.dg/pr100810.c: New test case. Roger --
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c index 4a498ab..ca8f526 100644 --- a/gcc/tree-ssa-loop-ivopts.c +++ b/gcc/tree-ssa-loop-ivopts.c @@ -432,6 +432,8 @@ struct iv_group struct iv_cand *selected; /* To indicate this is a doloop use group. */ bool doloop_p; + /* This group uses undefined values. */ + bool uses_undefined_value_p; /* Uses in the group. */ vec<struct iv_use *> vuses; }; @@ -1540,6 +1542,12 @@ record_use (struct iv_group *group, tree *use_p, struct iv *iv, use->addr_offset = addr_offset; group->vuses.safe_push (use); + + /* Record uses of undefined values. */ + if (TREE_CODE (iv->base) == SSA_NAME + && ssa_undefined_value_p (iv->base)) + group->uses_undefined_value_p = true; + return use; } @@ -1582,6 +1590,7 @@ record_group (struct ivopts_data *data, enum use_type type) group->related_cands = BITMAP_ALLOC (NULL); group->vuses.create (1); group->doloop_p = false; + group->uses_undefined_value_p = false; data->vgroups.safe_push (group); return group; @@ -4960,6 +4969,12 @@ determine_group_iv_cost_generic (struct ivopts_data *data, the candidate. */ if (cand->pos == IP_ORIGINAL && cand->incremented_at == use->stmt) cost = no_cost; + /* Disallow using an iv based on an undefined value as a candidate + replacement for a group that uses only defined values. */ + else if (!group->uses_undefined_value_p + && TREE_CODE (cand->iv->base) == SSA_NAME + && ssa_undefined_value_p (cand->iv->base)) + cost = infinite_cost; else cost = get_computation_cost (data, use, cand, false, &inv_vars, NULL, &inv_expr);
/* { dg-do run } */ /* { dg-options "-O2" } */ int a, b = 1, c = 1, e, f = 1, g, h, j; volatile int d; static void k() { int i; h = b; if (c && a >= 0) { while (a) { i++; h--; } if (g) for (h = 0; h < 2; h++) ; if (!b) i &&d; } } static void l() { for (; j < 1; j++) if (!e && c && f) k(); } int main() { if (f) l(); if (h != 1) __builtin_abort(); return 0; }