Jan Hubicka <hubi...@ucw.cz> writes: >> >> >> >> Well, in tree-ssa code we do assume these to be either disjoint objects >> >> or equal (in decl_refs_may_alias_p that continues in case >> >> compare_base_decls is -1). I am not sure if we win much by threating >> >> them differently on RTL level. I would preffer staying consistent here. >> >> Yeah, I see your point. My concern here was that the fallback case >> applies to SYMBOL_REFs without decls, which might not have been visible >> at the tree-ssa level. E.g. they might be ABI-defined symbols that have >> no known relation to source-level constructs. >> >> E.g. the small-data base symbol _gp on MIPS points at a fixed offset >> from the start of the small-data area (0x7ff0 IIRC). If the target >> generated rtl code that used _gp directly, we could wrongly assume >> that _gp+X can't alias BASE+Y when X != Y, even though the real test >> for small-data BASEs would be whether X + 0x7ff0 != Y. >> >> I don't think that could occur in tree-ssa. No valid C code would >> be able to refer directly to _gp in this way. >> >> On the other hand, I don't have a specific example of where this does >> go wrong, it's just a feeling that it might. I can drop it if you >> think that's better. > > I would lean towards not disabling optimization when we have no good > reason for that - we already did it bit too many times in aliasing code > and it is hard to figure out what optimizations are missed purposefully > and what are missed just as omission. > > We already comitted to a very conservative assumption that every > external symbol can be alias of another. I think we should have > originally required units that reffers to same memory location via > different symbols to declare it explicitly (i.e. make external alias to > external symbol), but we do not even allow external aliases (symtab > supports that though) and also it may depend on use of the module what > symbols are aliased. > > We also decided to disable TBAA for direct accesses to decls to allow > type punning using unions. > > This keeps the offset+range check to be only means of disambiguation. > While for modern programs global arrays are not common, for Fortran > stuff they are, so I would preffer to not cripple them even more. > (I am not sure how often the arrays are external though)
OK, the version below drops the new -2 return value and tries to clarify the comments in compare_base_symbol_refs. Lightly tested on aarch64-linux-gnu so far. Does it look OK if full tests pass? Thanks, Richard ---- memrefs_conflict_p assumes that: [XB + XO, XB + XO + XS) does not alias [YB + YO, YB + YO + YS) whenever: [XO, XO + XS) does not intersect [YO, YO + YS) In other words, the accesses can alias only if XB == YB at runtime. However, this doesn't cope correctly with section anchors. For example, if XB is an anchor symbol and YB is at offset XO from the anchor, then: [XB + XO, XB + XO + XS) overlaps [YB, YB + YS) whatever the value of XO is. In other words, when doing the alias check for two symbols whose local definitions are in the same block, we should apply the known difference between their block offsets to the intersection test above. gcc/ PR rtl-optimization/92294 * alias.c (compare_base_symbol_refs): Take an extra parameter and add the distance between two symbols to it. Enshrine in comments that -1 means "either 0 or 1, but we can't tell which at compile time". (memrefs_conflict_p): Update call accordingly. (rtx_equal_for_memref_p): Likewise. Take the distance between symbols into account. --- gcc/alias.c | 47 +++++++++++++++++++++++++++++++---------------- 1 file changed, 31 insertions(+), 16 deletions(-) diff --git a/gcc/alias.c b/gcc/alias.c index 8d3575e4e27..69e1eb89ac6 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -159,7 +159,8 @@ static tree decl_for_component_ref (tree); static int write_dependence_p (const_rtx, const_rtx, machine_mode, rtx, bool, bool, bool); -static int compare_base_symbol_refs (const_rtx, const_rtx); +static int compare_base_symbol_refs (const_rtx, const_rtx, + HOST_WIDE_INT * = NULL); static void memory_modified_1 (rtx, const_rtx, void *); @@ -1837,7 +1838,11 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y) return label_ref_label (x) == label_ref_label (y); case SYMBOL_REF: - return compare_base_symbol_refs (x, y) == 1; + { + HOST_WIDE_INT distance = 0; + return (compare_base_symbol_refs (x, y, &distance) == 1 + && distance == 0); + } case ENTRY_VALUE: /* This is magic, don't go through canonicalization et al. */ @@ -2172,10 +2177,20 @@ compare_base_decls (tree base1, tree base2) return ret; } -/* Same as compare_base_decls but for SYMBOL_REF. */ +/* Compare SYMBOL_REFs X_BASE and Y_BASE. + + - Return 1 if Y_BASE - X_BASE is constant, adding that constant + to *DISTANCE if DISTANCE is nonnull. + + - Return 0 if no accesses based on X_BASE can alias Y_BASE. + + - Return -1 if one of the two results applies, but we can't tell + which at compile time. Update DISTANCE in the same way as + for a return value of 1, for the case in which that holds. */ static int -compare_base_symbol_refs (const_rtx x_base, const_rtx y_base) +compare_base_symbol_refs (const_rtx x_base, const_rtx y_base, + HOST_WIDE_INT *distance) { tree x_decl = SYMBOL_REF_DECL (x_base); tree y_decl = SYMBOL_REF_DECL (y_base); @@ -2192,8 +2207,8 @@ compare_base_symbol_refs (const_rtx x_base, const_rtx y_base) std::swap (x_decl, y_decl); std::swap (x_base, y_base); } - /* We handle specially only section anchors and assume that other - labels may overlap with user variables in an arbitrary way. */ + /* We handle specially only section anchors. Other symbols are + either equal (via aliasing) or refer to different objects. */ if (!SYMBOL_REF_HAS_BLOCK_INFO_P (y_base)) return -1; /* Anchors contains static VAR_DECLs and CONST_DECLs. We are safe @@ -2222,14 +2237,13 @@ compare_base_symbol_refs (const_rtx x_base, const_rtx y_base) { if (SYMBOL_REF_BLOCK (x_base) != SYMBOL_REF_BLOCK (y_base)) return 0; - if (SYMBOL_REF_BLOCK_OFFSET (x_base) == SYMBOL_REF_BLOCK_OFFSET (y_base)) - return binds_def ? 1 : -1; - if (SYMBOL_REF_ANCHOR_P (x_base) != SYMBOL_REF_ANCHOR_P (y_base)) - return -1; - return 0; + if (distance) + *distance += (SYMBOL_REF_BLOCK_OFFSET (y_base) + - SYMBOL_REF_BLOCK_OFFSET (x_base)); + return binds_def ? 1 : -1; } - /* In general we assume that memory locations pointed to by different labels - may overlap in undefined ways. */ + /* Either the symbols are equal (via aliasing) or they refer to + different objects. */ return -1; } @@ -2513,11 +2527,12 @@ memrefs_conflict_p (poly_int64 xsize, rtx x, poly_int64 ysize, rtx y, if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF) { - int cmp = compare_base_symbol_refs (x,y); + HOST_WIDE_INT distance = 0; + int cmp = compare_base_symbol_refs (x, y, &distance); /* If both decls are the same, decide by offsets. */ if (cmp == 1) - return offset_overlap_p (c, xsize, ysize); + return offset_overlap_p (c + distance, xsize, ysize); /* Assume a potential overlap for symbolic addresses that went through alignment adjustments (i.e., that have negative sizes), because we can't know how far they are from each @@ -2526,7 +2541,7 @@ memrefs_conflict_p (poly_int64 xsize, rtx x, poly_int64 ysize, rtx y, return -1; /* If decls are different or we know by offsets that there is no overlap, we win. */ - if (!cmp || !offset_overlap_p (c, xsize, ysize)) + if (!cmp || !offset_overlap_p (c + distance, xsize, ysize)) return 0; /* Decls may or may not be different and offsets overlap....*/ return -1;