Ok, so with the recent activity in the mgrid bug (PR55334) I tried to remember what solution we thought of after determining that ADD_RESTRICT is a no-go.
The following very prototypish patch implements the idea of computing known non-dependences and maintaining them over the compilation (mainly across inlining / cloning, for PR55334).  The patch piggy-backs on PTA (bad - PTA doesn't compute must-aliases, so it will only work for a very limited set of testcases right now - but at least it might work for non-register restrict stuff).

The representation of the "non-dependences" is the most interesting bit, of course.  We partition memory references into cliques in which all references are independent, and we split each clique again into sets of references based on the same pointer.  That allows us to disambiguate memory references with an equal clique but a distinct pointer.

For restrict you'd put all(!) memory references in the BLOCK where the restrict pointer is live into the same clique and assign a distinct ptr value based on which restrict pointer the reference is based on.  You make all references not based on any restrict pointer have ptr == 0 (not implemented in the prototype patch - they don't get a clique assigned).

The patch simplifies this by taking function scope as the only BLOCK to consider, so inside a function the clique will be unique (before inlining).  I can see issues arising with assigning numbers in the frontend based on real BLOCKs, as in

  { int * restrict q;
    { int * restrict p; *p = *q; }
    { int * restrict r; *r = *q; } }

that is, non-dependences based on pointers coming from different nested scopes.  The FE would in this case need to "duplicate" the ptr value for 'q' so as not to make *p and *r falsely a non-dependence.  But I think we're not planning to assign this in the C frontend family (maybe in the Fortran FE though).

To preserve correctness, cliques from inlined functions need to be remapped to an unused clique number, so struct function gets a max_clique number (the remapping is not implemented in the prototype).

Any correctness / missed-optimization holes to punch?  That is, given a set of non-dependent reference pairs, can you assign clique / ptr values in a correct and optimal way?  (It somehow assumes transitivity: if the set is *a, *b; *b, *c; then *a, *c are also non-dependent.)

It all depends on a conservative but aggressive must-be-based-on analysis, of course.  You'd meet that with the conservative may-be-based-on analysis from PTA and compute the proper non-dependences from that.

Comments welcome.

Patch tested on

static inline void foo (int * __restrict__ p, int * __restrict__ q)
{
  int i;
  for (i = 0; i < 1024; ++i)
    p[i] = q[i];
}

void bar (int *p, int *q)
{
  foo (p, q);
}

where it still vectorizes the loop in bar without an alias check.
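To make the encoding concrete, here is a minimal sketch (not part of the patch; the helper names are made up for illustration) of how the clique / ptr pair is packed into the spare 16+16 bits the prototype re-uses, and how the disambiguation rule in refs_may_alias_p_1 / dr_may_alias_p reads:

/* Clique in the upper 16 bits, based-on-pointer id in the lower 16 bits
   (illustration only - the patch open-codes this).  */
static inline unsigned int
encode_clique_ptr (unsigned short clique, unsigned short ptr)
{
  return ((unsigned int) clique << 16) | ptr;
}

/* Two annotated references are known independent iff both carry a
   non-zero annotation, are in the same clique, but are based on
   different pointers.  Everything else is "don't know" and falls back
   to the regular oracles.  */
static inline int
known_independent_p (unsigned int v1, unsigned int v2)
{
  return (v1 != 0 && v2 != 0
          && (v1 >> 16) == (v2 >> 16)
          && (v1 & 0xffff) != (v2 & 0xffff));
}

In the dumps the annotation shows up inside the MEM_REF via the tree-pretty-print.c change, roughly as MEM[(int *)p_2(D) clq 1 ptr 1] (exact names depend on the testcase).  Note the encoding also spells out the transitivity assumption mentioned above: putting *a, *b, *c into one clique with ptr ids 1, 2 and 3 to express the pairs *a,*b and *b,*c necessarily makes *a,*c non-dependent as well.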
Richard.

Index: trunk/gcc/function.h
===================================================================
*** trunk.orig/gcc/function.h	2014-08-29 14:18:27.221364973 +0200
--- trunk/gcc/function.h	2014-09-03 13:15:11.433883630 +0200
*************** struct GTY(()) function {
*** 592,597 ****
--- 592,600 ----
       a string describing the reason for failure.  */
    const char * GTY((skip)) cannot_be_copied_reason;
  
+   /* Last assigned restrict UID.  */
+   unsigned short last_restrict_uid;
+ 
    /* Collected bit flags.  */
  
    /* Number of units of general registers that need saving in stdarg
Index: trunk/gcc/tree-inline.c
===================================================================
*** trunk.orig/gcc/tree-inline.c	2014-08-26 13:40:18.811368135 +0200
--- trunk/gcc/tree-inline.c	2014-09-03 14:15:14.792635543 +0200
*************** remap_gimple_op_r (tree *tp, int *walk_s
*** 929,934 ****
--- 929,937 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ??? Properly remap the clique to a new one.  */
+ 	  if (old->base.u.version != 0)
+ 	    (*tp)->base.u.version = old->base.u.version;
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
*************** copy_tree_body_r (tree *tp, int *walk_su
*** 1180,1185 ****
--- 1183,1191 ----
  	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
  	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
  	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
+ 	  /* ??? Properly remap the clique to a new one.  */
+ 	  if (old->base.u.version != 0)
+ 	    (*tp)->base.u.version = old->base.u.version;
  	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
  	     remapped a parameter as the property might be valid only
  	     for the parameter itself.  */
Index: trunk/gcc/tree-pretty-print.c
===================================================================
*** trunk.orig/gcc/tree-pretty-print.c	2014-08-04 11:50:13.421690694 +0200
--- trunk/gcc/tree-pretty-print.c	2014-09-03 14:04:41.816679122 +0200
*************** dump_generic_node (pretty_printer *buffe
*** 1071,1077 ****
  	      /* Same value types ignoring qualifiers.  */
  	      && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		  == TYPE_MAIN_VARIANT
! 		  (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1))))))
  	{
  	  if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	    {
--- 1071,1078 ----
  	      /* Same value types ignoring qualifiers.  */
  	      && (TYPE_MAIN_VARIANT (TREE_TYPE (node))
  		  == TYPE_MAIN_VARIANT
! 		  (TREE_TYPE (TREE_TYPE (TREE_OPERAND (node, 1)))))
! 	  && node->base.u.version == 0)
  	{
  	  if (TREE_CODE (TREE_OPERAND (node, 0)) != ADDR_EXPR)
  	    {
*************** dump_generic_node (pretty_printer *buffe
*** 1102,1107 ****
--- 1103,1115 ----
  	      dump_generic_node (buffer, TREE_OPERAND (node, 1),
  				 spc, flags, false);
  	    }
+ 	  if (node->base.u.version != 0)
+ 	    {
+ 	      pp_string (buffer, " clq ");
+ 	      pp_unsigned_wide_integer (buffer, node->base.u.version >> 16);
+ 	      pp_string (buffer, " ptr ");
+ 	      pp_unsigned_wide_integer (buffer, node->base.u.version & 0xffff);
+ 	    }
  	  pp_right_bracket (buffer);
  	}
        break;
*************** dump_generic_node (pretty_printer *buffe
*** 1440,1446 ****
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		      (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1))))))))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
--- 1448,1455 ----
  		  /* Same value types ignoring qualifiers.  */
  		  && (TYPE_MAIN_VARIANT (TREE_TYPE (op0))
  		      == TYPE_MAIN_VARIANT
! 		      (TREE_TYPE (TREE_TYPE (TREE_OPERAND (op0, 1)))))
! 		  && op0->base.u.version == 0)))
  	{
  	  op0 = TREE_OPERAND (op0, 0);
  	  str = "->";
Index: trunk/gcc/tree-ssa-alias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-alias.c	2014-08-26 13:40:16.710368279 +0200
--- trunk/gcc/tree-ssa-alias.c	2014-09-03 14:25:17.324594059 +0200
*************** refs_may_alias_p_1 (ao_ref *ref1, ao_ref
*** 1371,1376 ****
--- 1371,1406 ----
      return decl_refs_may_alias_p (ref1->ref, base1, offset1, max_size1,
  				  ref2->ref, base2, offset2, max_size2);
  
+   /* Handle restrict based accesses.
+      ??? ao_ref_base strips inner MEM_REF [&decl], recover from that
+      here.  */
+   tree rbase1 = base1;
+   tree rbase2 = base2;
+   if (var1_p)
+     {
+       rbase1 = ref1->ref;
+       if (rbase1)
+ 	while (handled_component_p (rbase1))
+ 	  rbase1 = TREE_OPERAND (rbase1, 0);
+     }
+   if (var2_p)
+     {
+       rbase2 = ref2->ref;
+       if (rbase2)
+ 	while (handled_component_p (rbase2))
+ 	  rbase2 = TREE_OPERAND (rbase2, 0);
+     }
+   if (rbase1 && rbase2
+       && TREE_CODE (base1) == MEM_REF
+       && TREE_CODE (base2) == MEM_REF
+       && base1->base.u.version != 0
+       && base2->base.u.version != 0
+       /* If the accesses are in the same restrict clique... */
+       && (base1->base.u.version >> 16) == (base2->base.u.version >> 16)
+       /* But based on different pointers they do not alias.  */
+       && (base1->base.u.version & 0xffff) != (base2->base.u.version & 0xffff))
+     return false;
+ 
    ind1_p = (TREE_CODE (base1) == MEM_REF
  	    || TREE_CODE (base1) == TARGET_MEM_REF);
    ind2_p = (TREE_CODE (base2) == MEM_REF
Index: trunk/gcc/tree-ssa-structalias.c
===================================================================
*** trunk.orig/gcc/tree-ssa-structalias.c	2014-09-02 10:13:58.150580778 +0200
--- trunk/gcc/tree-ssa-structalias.c	2014-09-03 14:23:10.540602788 +0200
***************
*** 52,57 ****
--- 52,59 ----
  #include "splay-tree.h"
  #include "params.h"
  #include "alias.h"
+ #include "tree-phinodes.h"
+ #include "ssa-iterators.h"
  
  /* The idea behind this analyzer is to generate set constraints from the
     program, then solve the resulting constraints in order to generate the
*************** struct variable_info
*** 272,283 ****
--- 274,292 ----
    /* True if this field has only restrict qualified pointers.  */
    unsigned int only_restrict_pointers : 1;
  
+   /* True if this represents a heap var created for a restrict qualified
+      pointer.  */
+   unsigned int is_restrict_var : 1;
+ 
    /* True if this represents a global variable.  */
    unsigned int is_global_var : 1;
  
    /* True if this represents a IPA function info.  */
    unsigned int is_fn_info : 1;
  
+   /* ??? Store somewhere better.  */
+   unsigned short ruid;
+ 
    /* The ID of the variable for the next field in this structure
       or zero for the last field in this structure.  */
    unsigned next;
*************** new_var_info (tree t, const char *name)
*** 369,374 ****
--- 378,384 ----
    ret->is_heap_var = false;
    ret->may_have_pointers = true;
    ret->only_restrict_pointers = false;
+   ret->is_restrict_var = false;
    ret->is_global_var = (t == NULL_TREE);
    ret->is_fn_info = false;
    if (t && DECL_P (t))
*************** static varinfo_t
*** 3773,3778 ****
--- 3783,3789 ----
  make_constraint_from_restrict (varinfo_t lhs, const char *name)
  {
    varinfo_t vi = make_heapvar (name);
+   vi->is_restrict_var = 1;
    vi->is_global_var = 1;
    vi->may_have_pointers = 1;
    make_constraint_from (lhs, vi->id);
*************** compute_may_aliases (void)
*** 6948,6953 ****
--- 6959,7046 ----
    if (dump_file)
      dump_alias_info (dump_file);
  
+   {
+     unsigned short clique = 0;
+     unsigned short last_ruid = 0;
+     for (unsigned i = 0; i < num_ssa_names; ++i)
+       {
+ 	tree ptr = ssa_name (i);
+ 	if (!ptr || !POINTER_TYPE_P (TREE_TYPE (ptr)))
+ 	  continue;
+ 
+ 	/* Avoid all this when ptr is not dereferenced?  */
+ 	tree p = ptr;
+ 	if (SSA_NAME_IS_DEFAULT_DEF (ptr)
+ 	    && (TREE_CODE (SSA_NAME_VAR (ptr)) == PARM_DECL
+ 		|| TREE_CODE (SSA_NAME_VAR (ptr)) == RESULT_DECL))
+ 	  p = SSA_NAME_VAR (ptr);
+ 
+ 	varinfo_t vi = get_varinfo (find (lookup_vi_for_tree (p)->id));
+ 	bitmap_iterator bi;
+ 	unsigned j;
+ 	varinfo_t restrict_var = NULL;
+ 	EXECUTE_IF_SET_IN_BITMAP (vi->solution, 0, j, bi)
+ 	  {
+ 	    varinfo_t oi = get_varinfo (j);
+ 	    if (oi->is_restrict_var)
+ 	      {
+ 		if (restrict_var)
+ 		  {
+ 		    restrict_var = NULL;
+ 		    break;
+ 		  }
+ 		restrict_var = oi;
+ 	      }
+ 	    /* NULL is the only other valid points-to entry.  */
+ 	    else if (oi->id != nothing_id)
+ 	      {
+ 		restrict_var = NULL;
+ 		break;
+ 	      }
+ 	  }
+ 	/* Ok, found that ptr must(!) point to a single(!) restrict
+ 	   variable.  */
+ 	/* ??? PTA isn't really a proper propagation engine to compute
+ 	   this property.
+ 	   ??? We can handle merging of two restricts by unifying them.  */
+ 	if (restrict_var)
+ 	  {
+ 	    if (clique == 0)
+ 	      clique = ++cfun->last_restrict_uid;
+ 	    if (restrict_var->ruid == 0)
+ 	      restrict_var->ruid = ++last_ruid;
+ 
+ 	    /* Now look at possible dereferences of ptr.  */
+ 	    imm_use_iterator ui;
+ 	    gimple use_stmt;
+ 	    FOR_EACH_IMM_USE_STMT (use_stmt, ui, ptr)
+ 	      {
+ 		/* ??? Calls and asms.  */
+ 		if (!gimple_assign_single_p (use_stmt))
+ 		  continue;
+ 		tree *ref = gimple_assign_lhs_ptr (use_stmt);
+ 		while (handled_component_p (*ref))
+ 		  ref = &TREE_OPERAND (*ref, 0);
+ 		if (TREE_CODE (*ref) == MEM_REF
+ 		    && TREE_OPERAND (*ref, 0) == ptr)
+ 		  (*ref)->base.u.version = clique << 16 | restrict_var->ruid;
+ 		ref = gimple_assign_rhs1_ptr (use_stmt);
+ 		while (handled_component_p (*ref))
+ 		  ref = &TREE_OPERAND (*ref, 0);
+ 		if (TREE_CODE (*ref) == MEM_REF
+ 		    && TREE_OPERAND (*ref, 0) == ptr)
+ 		  (*ref)->base.u.version = clique << 16 | restrict_var->ruid;
+ 	      }
+ 
+ 	    /* ??? Accesses not based on a restrict pointer should
+ 	       all get into the clique but share a single ptr ID.
+ 	       That way they get disambiguated against restrict
+ 	       accesses but not against each other.  */
+ 	  }
+       }
+   }
+ 
    /* Deallocate memory used by aliasing data structures and the internal
       points-to solution.  */
    delete_points_to_sets ();
Index: trunk/gcc/tree-data-ref.c
===================================================================
*** trunk.orig/gcc/tree-data-ref.c	2014-08-14 14:38:53.735508565 +0200
--- trunk/gcc/tree-data-ref.c	2014-09-03 14:55:51.995467744 +0200
*************** dr_analyze_indices (struct data_referenc
*** 979,987 ****
--- 979,989 ----
  	     guaranteed.
  	     As a band-aid, mark the access so we can special-case
  	     it in dr_may_alias_p.  */
+ 	  tree old = ref;
  	  ref = fold_build2_loc (EXPR_LOCATION (ref),
  				 MEM_REF, TREE_TYPE (ref),
  				 base, memoff);
+ 	  ref->base.u.version = old->base.u.version;
  	  access_fns.safe_push (access_fn);
  	}
      }
*************** dr_may_alias_p (const struct data_refere
*** 1388,1393 ****
--- 1390,1403 ----
        return false;
      }
  
+   if (TREE_CODE (addr_a) == MEM_REF
+       && TREE_CODE (addr_b) == MEM_REF
+       && addr_a->base.u.version != 0
+       && addr_b->base.u.version != 0
+       && (addr_a->base.u.version >> 16) == (addr_b->base.u.version >> 16)
+       && (addr_a->base.u.version & 0xffff) != (addr_b->base.u.version & 0xffff))
+     return false;
+ 
    /* If we had an evolution in a pointer-based MEM_REF BASE_OBJECT we
       do not know the size of the base-object.  So we cannot do any
       offset/overlap based analysis but have to rely on points-to