What should we do about this?  The PR is a wrong-code regression from
GCC 9 and it doesn't look like we can remove the second offset_overlap_p
check, given how many cases currently rely on it.

Thanks,
Richard

Richard Sandiford <richard.sandif...@arm.com> writes:
> Richard Biener <richard.guent...@gmail.com> writes:
>> On Tue, Feb 4, 2020 at 6:44 PM Richard Sandiford
>> <richard.sandif...@arm.com> wrote:
>>>
>>> Richard Sandiford <richard.sandif...@arm.com> writes:
>>> > [...]
>>> >> I'm not sure given the issues you've introduced if I could actually
>>> >> fill out the matrix of answers without more underlying information.
>>> >> ie, when can we get symbols without source level decls,
>>> >> anchors+interposition issues, etc.
>>> >
>>> > OK.  In that case, I wonder whether it would be safer to have a
>>> > fourth state on top of the three above:
>>> >
>>> >   - known distance apart
>>> >   - independent
>>> >   - known distance apart or independent
>>> >   - don't know
>>> >
>>> > with "don't know" being anything that involves bare symbols?
>>> >
>>> > Richard
>>>
>>> How does this look?  Tested on aarch64-linux-gnu and
>>> x86_64-linux-gnu.
>>>
>>> Full description from scratch:
>>>
>>> memrefs_conflict_p has a slightly odd structure.  It first checks
>>> whether two addresses based on SYMBOL_REFs refer to the same object,
>>> with a tristate result:
>>>
>>>       int cmp = compare_base_symbol_refs (x,y);
>>>
>>> If the addresses do refer to the same object, we can use offset-based 
>>> checks:
>>>
>>>       /* If both decls are the same, decide by offsets.  */
>>>       if (cmp == 1)
>>>         return offset_overlap_p (c, xsize, ysize);
>>>
>>> But then, apart from the special case of forced address alignment,
>>> we use an offset-based check even if we don't know whether the
>>> addresses refer to the same object:
>>>
>>>       /* Assume a potential overlap for symbolic addresses that went
>>>          through alignment adjustments (i.e., that have negative
>>>          sizes), because we can't know how far they are from each
>>>          other.  */
>>>       if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>>>         return -1;
>>>       /* If decls are different or we know by offsets that there is no 
>>> overlap,
>>>          we win.  */
>>>       if (!cmp || !offset_overlap_p (c, xsize, ysize))
>>>         return 0;
>>>
>>> This somewhat contradicts:
>>>
>>>   /* In general we assume that memory locations pointed to by different 
>>> labels
>>>      may overlap in undefined ways.  */
>>
>> Sorry for not chiming in earlier but isn't the bug that
>>
>>
>>
>>> at the end of compare_base_symbol_refs.  In other words, we're taking -1
>>> to mean that either (a) the symbols are equal (via aliasing) or (b) the
>>> references access non-overlapping objects.
>>>
>>> But even assuming that's true for normal symbols, it doesn't cope
>>> correctly with section anchors.  If a symbol X at ANCHOR+OFFSET
>>> is preemptible, either (a) X = ANDHOR+OFFSET or (b) X and ANCHOR
>>> reference non-overlapping objects.
>>>
>>> And an offset-based comparison makes no sense for an anchor symbol
>>> vs. a bare symbol with no decl.  If the bare symbol is allowed to
>>> alias other symbols then it can surely alias any symbol in the
>>> anchor's block, so there are multiple anchor offsets that might
>>> induce an alias.
>>
>> But then isn't the simple fix to honor the -1 and do
>>
>> diff --git a/gcc/alias.c b/gcc/alias.c
>> index 3794f9b6a9e..bf13d37c0f7 100644
>> --- a/gcc/alias.c
>> +++ b/gcc/alias.c
>> @@ -2490,9 +2490,8 @@ memrefs_conflict_p (poly_int64 xsize, rtx x,
>> poly_int64 ysize, rtx y,
>>          other.  */
>>        if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>>         return -1;
>> -      /* If decls are different or we know by offsets that there is no 
>> overlap,
>> -        we win.  */
>> -      if (!cmp || !offset_overlap_p (c, xsize, ysize))
>> +      /* If decls are different, we win.  */
>> +      if (cmp == 0)
>>         return 0;
>>        /* Decls may or may not be different and offsets overlap....*/
>>        return -1;
>>
>> ?
>
> The code was deliberately written this way from the ouset though
> (rather than it ending up like this through many cuts).  It was
> added in g:54363f8a92920f5559c83ddd53e480a27205e6b7:
>
> 2015-12-08  Jan Hubicka  <hubi...@ucw.cz>
>
>       PR ipa/61886
>       PR middle-end/25140
>       * tree-ssa-alias.c (ptr_deref_may_alias_decl_p): Use compare_base_decls
>       (nonoverlapping_component_refs_of_decl_p): Update sanity check.
>       (decl_refs_may_alias_p): Use compare_base_decls.
>       * alias.c: Include cgraph.h
>       (get_alias_set): Add cut-off for recursion.
>       (rtx_equal_for_memref_p): Use rtx_equal_for_memref_p.
>       (compare_base_decls): New function.
>       (base_alias_check): Likewise.
>       (memrefs_conflict_p): Likewise.
>       (nonoverlapping_memrefs_p): Likewise.
>       * alias.h (compare_base_decls): Declare.
>
> which included:
>
> -  if (rtx_equal_for_memref_p (x, y))
> +  if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
> +    {
> +      tree x_decl = SYMBOL_REF_DECL (x);
> +      tree y_decl = SYMBOL_REF_DECL (y);
> +      int cmp;
> +
> +      if (!x_decl || !y_decl)
> +     {
> +       /* Label and normal symbol are never the same. */
> +       if (x_decl != y_decl)
> +         return 0;
> +       return offset_overlap_p (c, xsize, ysize);
> +     }
> +      else
> +        cmp = compare_base_decls (x_decl, y_decl);
> +
> +      /* If both decls are the same, decide by offsets.  */
> +      if (cmp == 1)
> +        return offset_overlap_p (c, xsize, ysize);
> +      /* If decls are different or we know by offsets that there is no 
> overlap,
> +      we win.  */
> +      if (!cmp || !offset_overlap_p (c, xsize, ysize))
> +     return 0;
> +      /* Decls may or may not be different and offsets overlap....*/
> +      return -1;
> +    }
> +  else if (rtx_equal_for_memref_p (x, y))
>
> The only significant change since them was to add compare_base_symbol_refs
> (g:73e48cb322152bf504ced8694fa748544ecaa6eb):
>
>    if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
>      {
> -      tree x_decl = SYMBOL_REF_DECL (x);
> -      tree y_decl = SYMBOL_REF_DECL (y);
> -      int cmp;
> -
> -      if (!x_decl || !y_decl)
> -     {
> -       /* Label and normal symbol are never the same. */
> -       if (x_decl != y_decl)
> -         return 0;
> -       return offset_overlap_p (c, xsize, ysize);
> -     }
> -      else
> -        cmp = compare_base_decls (x_decl, y_decl);
> +      int cmp = compare_base_symbol_refs (x,y);
>
>>> This patch therefore replaces the current tristate:
>>>
>>>   - known equal
>>>   - known independent (two accesses can't alias)
>>>   - equal or independent
>>>
>>> with:
>>>
>>>   - known distance apart
>>>   - known independent (two accesses can't alias)
>>>   - known distance apart or independent
>>>   - don't know
>>>
>>> For safety, the patch puts all bare symbols in the "don't know"
>>> category.  If that turns out to be too conservative, we at least
>>> need that behaviour for combinations involving a bare symbol
>>> and a section anchor.
>>
>> This sounds complicated (for this stage).  Do you have any statistics as
>> to how it affects actual alias queries (thus outcome in 
>> {true,...}_dependence)
>> when you do the "simple" fix?
>
> For a stage3 build of gcc on aarch64-linux-gnu I get:
>
>    ("positive" == conflict, "negative" == no conflict)
>
>    16191  cmp ==  1,    disjoint offsets :  true negative
>    19698  cmp ==  1, overlapping offsets :  true positive
>    79363  cmp == -1,    disjoint offsets :  true negative
>     6965  cmp == -1,    disjoint offsets : false negative <== bug
>       48  cmp == -1, overlapping offsets :  true positive
>      928  cmp == -1, overlapping offsets : false positive <== missed opt
>   123193  total queries
>
> where the cmp == -1 cases are divided into true and false results
> according to whether we get the same answer when taking the (hopefully)
> correct offset into account.  cmp == 0 and what would be cmp == -2 never
> occured.
>
> So it looks like we're relying on the cmp == -1 offset_overlap_p
> check to get true "no conflict" results for ~64% of all queries
> (or ~83% of all true "no conflict" results).  I expect this is
> very dependent on the fact that the port uses section anchors.
>
> The number of buggy cases seems surprisingly high, but I've tried
> to re-check it a couple of times and it looks to be accurate.
> They come from cases where we compare things like:
>
> x: (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
> y: (symbol_ref:DI ("_ZL23yy_last_accepting_state") [flags 0x82] <var_decl ... 
> yy_last_accepting_state>)
> c: -48
>
> where yy_last_accepting_state is at offset 48 from the anchor.
> I haven't checked where the buggy queries are coming from though;
> might be debug-related and so hard to spot.
>
> Thanks,
> Richard

Reply via email to