Ping~
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03360.html
Thanks,
Yufeng
On 11/26/13 15:02, Yufeng Zhang wrote:
On 11/26/13 12:45, Richard Biener wrote:
On Thu, Nov 14, 2013 at 12:25 AM, Yufeng Zhang<yufeng.zh...@arm.com> wrote:
On 11/13/13 20:54, Bill Schmidt wrote:
The second version of your original patch is ok with me with the
following changes. Sorry for the little side adventure into the
next-interp logic; in the end that's going to hurt more than it helps in
this case. Thanks for having a look at it, anyway. Thanks also for
cleaning up this version to be less intrusive to common interfaces; I
appreciate it.
Thanks a lot for the review. I've attached an updated patch with the
suggested changes incorporated.
For the next-interp adventure, I was quite happy to do the experiment; it's
a good chance of gaining insight into the pass. Many thanks for your prompt
replies and patience in guiding!
Everything else looks OK to me. Please ask Richard for final approval,
as I'm not a maintainer.
Hi Richard, would you be happy to OK the patch?
Hmm,
+static tree
+get_alternative_base (tree base)
+{
+ tree *result = (tree *) pointer_map_contains (alt_base_map, base);
+
+ if (result == NULL)
+ {
+ tree expr;
+ aff_tree aff;
+
+ tree_to_aff_combination_expand (base, TREE_TYPE (base),
+&aff,&name_expansions);
+ aff.offset = tree_to_double_int (integer_zero_node);
+ expr = aff_combination_to_tree (&aff);
+
+ result = (tree *) pointer_map_insert (alt_base_map, base);
+ gcc_assert (!*result);
I believe this cache will never hit (unless you repeatedly ask for
the exact same statement?) - any non-trivial 'base' trees are
not shared and thus not pointer equivalent.
Yes, you are right about the non-trivial 'base' tree are rarely shared.
The cached is introduced mainly because get_alternative_base () may be
called twice on the same 'base' tree, once in the
find_basis_for_candidate () for look-up and the other time in
alloc_cand_and_find_basis () for record_potential_basis (). I'm happy
to leave out the cache if you think the benefit is trivial.
Also using tree_to_aff_combination_expand to get at - what
exactly? The address with any constant offset stripped?
Where do you re-construct that offset? That is, aff.offset,
which you definitely need to get at a candidate?
As explained in the previous RFC emails, the expanded and
constant-offset-stripped base expr is only used for the purpose of basis
look-up. The corresponding candidate still has the unexpanded base expr
as its 'base_expr', therefore the info on the constant offset is not
lost and doesn't need to be re-constructed.
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+ int i, j;
+
+ i = v1 + 5;
+ j = i;
+ a2 [i-10] [j] = 2;
+ a2 [i] [j++] = i;
+ a2 [i+20] [j++] = i;
+ a2 [i-3] [i-1] += 1;
+ return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */
scanning for 5 MEMs looks non-sensical. What transform do
you expect? I see other slsr testcases do similar non-sensical
checking which is bad, too.
As the slsr optimizes CAND_REF candidates by simply lowering them to
MEM_REF from e.g. ARRAY_REF, I think scanning for the number of MEM_REFs
is an effective check. Alternatively, I can add a follow-up patch to
add some dumping facility in replace_ref () to print out the replacing
actions when -fdump-tree-slsr-details is on.
I hope these can address your concerns.
Regards,
Yufeng
Richard.
Regards,
Yufeng
gcc/
* gimple-ssa-strength-reduction.c: Include tree-affine.h.
(name_expansions): New static variable.
(alt_base_map): Ditto.
(get_alternative_base): New function.
(find_basis_for_candidate): For CAND_REF, optionally call
find_basis_for_base_expr with the returned value from
get_alternative_base.
(record_potential_basis): Add new parameter 'base' of type 'tree';
add an assertion of non-NULL base; use base to set node->base_expr.
(alloc_cand_and_find_basis): Update; call record_potential_basis
for CAND_REF with the returned value from get_alternative_base.
(execute_strength_reduction): Call pointer_map_create for
alt_base_map; call free_affine_expand_cache with&name_expansions.
gcc/testsuite/
* gcc.dg/tree-ssa/slsr-41.c: New test.
diff --git a/gcc/gimple-ssa-strength-reduction.c
b/gcc/gimple-ssa-strength-reduction.c
index 88afc91..26502c3 100644
--- a/gcc/gimple-ssa-strength-reduction.c
+++ b/gcc/gimple-ssa-strength-reduction.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3. If not see
#include "params.h"
#include "hash-table.h"
#include "tree-ssa-address.h"
+#include "tree-affine.h"
/* Information about a strength reduction candidate. Each statement
in the candidate table represents an expression of one of the
@@ -420,6 +421,42 @@ cand_chain_hasher::equal (const value_type *chain1, const
compare_type *chain2)
/* Hash table embodying a mapping from base exprs to chains of candidates. */
static hash_table <cand_chain_hasher> base_cand_map;
+/* Pointer map used by tree_to_aff_combination_expand. */
+static struct pointer_map_t *name_expansions;
+/* Pointer map embodying a mapping from bases to alternative bases. */
+static struct pointer_map_t *alt_base_map;
+
+/* Given BASE, use the tree affine combiniation facilities to
+ find the underlying tree expression for BASE, with any
+ immediate offset excluded. */
+
+static tree
+get_alternative_base (tree base)
+{
+ tree *result = (tree *) pointer_map_contains (alt_base_map, base);
+
+ if (result == NULL)
+ {
+ tree expr;
+ aff_tree aff;
+
+ tree_to_aff_combination_expand (base, TREE_TYPE (base),
+ &aff, &name_expansions);
+ aff.offset = tree_to_double_int (integer_zero_node);
+ expr = aff_combination_to_tree (&aff);
+
+ result = (tree *) pointer_map_insert (alt_base_map, base);
+ gcc_assert (!*result);
+
+ if (expr == base)
+ *result = NULL;
+ else
+ *result = expr;
+ }
+
+ return *result;
+}
+
/* Look in the candidate table for a CAND_PHI that defines BASE and
return it if found; otherwise return NULL. */
@@ -440,8 +477,9 @@ find_phi_def (tree base)
}
/* Helper routine for find_basis_for_candidate. May be called twice:
- once for the candidate's base expr, and optionally again for the
- candidate's phi definition. */
+ once for the candidate's base expr, and optionally again either for
+ the candidate's phi definition or for a CAND_REF's alternative base
+ expression. */
static slsr_cand_t
find_basis_for_base_expr (slsr_cand_t c, tree base_expr)
@@ -518,6 +556,13 @@ find_basis_for_candidate (slsr_cand_t c)
}
}
+ if (!basis && c->kind == CAND_REF)
+ {
+ tree alt_base_expr = get_alternative_base (c->base_expr);
+ if (alt_base_expr)
+ basis = find_basis_for_base_expr (c, alt_base_expr);
+ }
+
if (basis)
{
c->sibling = basis->dependent;
@@ -528,17 +573,21 @@ find_basis_for_candidate (slsr_cand_t c)
return 0;
}
-/* Record a mapping from the base expression of C to C itself, indicating that
- C may potentially serve as a basis using that base expression. */
+/* Record a mapping from BASE to C, indicating that C may potentially serve
+ as a basis using that base expression. BASE may be the same as
+ C->BASE_EXPR; alternatively BASE can be a different tree that share the
+ underlining expression of C->BASE_EXPR. */
static void
-record_potential_basis (slsr_cand_t c)
+record_potential_basis (slsr_cand_t c, tree base)
{
cand_chain_t node;
cand_chain **slot;
+ gcc_assert (base);
+
node = (cand_chain_t) obstack_alloc (&chain_obstack, sizeof (cand_chain));
- node->base_expr = c->base_expr;
+ node->base_expr = base;
node->cand = c;
node->next = NULL;
slot = base_cand_map.find_slot (node, INSERT);
@@ -554,10 +603,18 @@ record_potential_basis (slsr_cand_t c)
}
/* Allocate storage for a new candidate and initialize its fields.
- Attempt to find a basis for the candidate. */
+ Attempt to find a basis for the candidate.
+
+ For CAND_REF, an alternative base may also be recorded and used
+ to find a basis. This helps cases where the expression hidden
+ behind BASE (which is usually an SSA_NAME) has immediate offset,
+ e.g.
+
+ a2[i][j] = 1;
+ a2[i + 20][j] = 2; */
static slsr_cand_t
-alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base,
+alloc_cand_and_find_basis (enum cand_kind kind, gimple gs, tree base,
double_int index, tree stride, tree ctype,
unsigned savings)
{
@@ -583,7 +640,13 @@ alloc_cand_and_find_basis (enum cand_kind kind, gimple gs,
tree base,
else
c->basis = find_basis_for_candidate (c);
- record_potential_basis (c);
+ record_potential_basis (c, base);
+ if (kind == CAND_REF)
+ {
+ tree alt_base = get_alternative_base (base);
+ if (alt_base)
+ record_potential_basis (c, alt_base);
+ }
return c;
}
@@ -3524,6 +3587,9 @@ execute_strength_reduction (void)
/* Allocate the mapping from base expressions to candidate chains. */
base_cand_map.create (500);
+ /* Allocate the mapping from bases to alternative bases. */
+ alt_base_map = pointer_map_create ();
+
/* Initialize the loop optimizer. We need to detect flow across
back edges, and this gives us dominator information as well. */
loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
@@ -3539,6 +3605,9 @@ execute_strength_reduction (void)
dump_cand_chains ();
}
+ pointer_map_destroy (alt_base_map);
+ free_affine_expand_cache (&name_expansions);
+
/* Analyze costs and make appropriate replacements. */
analyze_candidates_and_replace ();
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c
b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c
new file mode 100644
index 0000000..870d714
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/slsr-41.c
@@ -0,0 +1,24 @@
+/* Verify straight-line strength reduction in using
+ alternative base expr to record and look for the
+ potential candidate. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-slsr" } */
+
+typedef int arr_2[50][50];
+
+void foo (arr_2 a2, int v1)
+{
+ int i, j;
+
+ i = v1 + 5;
+ j = i;
+ a2 [i-10] [j] = 2;
+ a2 [i] [j++] = i;
+ a2 [i+20] [j++] = i;
+ a2 [i-3] [i-1] += 1;
+ return;
+}
+
+/* { dg-final { scan-tree-dump-times "MEM" 5 "slsr" } } */
+/* { dg-final { cleanup-tree-dump "slsr" } } */