[Sorry, somehow missed this till now]

Richard Biener <richard.guent...@gmail.com> writes:
> On Mon, Jul 23, 2018 at 5:05 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Marc Glisse <marc.gli...@inria.fr> writes:
>> > On Fri, 20 Jul 2018, Richard Sandiford wrote:
>> >
>> >> --- gcc/match.pd     2018-07-18 18:44:22.565914281 +0100
>> >> +++ gcc/match.pd     2018-07-20 11:24:33.692045585 +0100
>> >> @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> >>    (if (inverse_conditions_p (@0, @2)
>> >>         && element_precision (type) == element_precision (op_type))
>> >>     (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1)))))))
>> >> +
>> >> +/* For pointers @0 and @2 and unsigned constant offset @1, look for
>> >> +   expressions like:
>> >> +
>> >> +   A: (@0 + @1 < @2) | (@2 + @1 < @0)
>> >> +   B: (@0 + @1 <= @2) | (@2 + @1 <= @0)
>> >> +
>> >> +   If pointers are known not to wrap, B checks whether @1 bytes starting 
>> >> at
>> >> +   @0 and @2 do not overlap, while A tests the same thing for @1 + 1 
>> >> bytes.
>> >> +   A is more efficiently tested as:
>> >> +
>> >> +   ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2)
>> >> +
>> >> +   as long as @1 * 2 doesn't overflow.  B is the same with @1 replaced
>> >> +   with @1 - 1.  */
>> >> +(for ior (truth_orif truth_or bit_ior)
>> >> + (for cmp (le lt)
>> >> +  (simplify
>> >> +   (ior (cmp (pointer_plus:s @0 INTEGER_CST@1) @2)
>> >> +    (cmp (pointer_plus:s @2 @1) @0))
>> >
>> > Do you want :c on cmp, in case it appears as @2 > @0 + @1 ? (may need some
>> > care with "cmp == LE_EXPR" below)
>> > Do you want :s on cmp as well?
>> >
>> >> +   (if (!flag_trapv && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
>> >
>> > Don't you want TYPE_OVERFLOW_UNDEFINED?
>>
>> Thanks, fixed below.  Think the cmp == LE_EXPR stuff is still ok with :c,
>> since the generated code sets cmp to LE_EXPR when matching GE_EXPR.
>>
>> Tested as before.  OK to install?
>>
>> Richard
>>
>>
>> 2018-07-23  Richard Sandiford  <richard.sandif...@arm.com>
>>
>> gcc/
>>         * match.pd: Optimise pointer range checks.
>>
>> gcc/testsuite/
>>         * gcc.dg/pointer-range-check-1.c: New test.
>>         * gcc.dg/pointer-range-check-2.c: Likewise.
>>
>> Index: gcc/match.pd
>> ===================================================================
>> --- gcc/match.pd        2018-07-23 15:56:47.000000000 +0100
>> +++ gcc/match.pd        2018-07-23 15:58:33.480269844 +0100
>> @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>     (if (inverse_conditions_p (@0, @2)
>>          && element_precision (type) == element_precision (op_type))
>>      (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type @1)))))))
>> +
>> +/* For pointers @0 and @2 and unsigned constant offset @1, look for
>> +   expressions like:
>> +
>> +   A: (@0 + @1 < @2) | (@2 + @1 < @0)
>> +   B: (@0 + @1 <= @2) | (@2 + @1 <= @0)
>> +
>> +   If pointers are known not to wrap, B checks whether @1 bytes starting at
>> +   @0 and @2 do not overlap, while A tests the same thing for @1 + 1 bytes.
>> +   A is more efficiently tested as:
>> +
>> +   ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2)
>> +
>> +   as long as @1 * 2 doesn't overflow.  B is the same with @1 replaced
>> +   with @1 - 1.  */
>> +(for ior (truth_orif truth_or bit_ior)
>> + (for cmp (le lt)
>> +  (simplify
>> +   (ior (cmp:cs (pointer_plus:s @0 INTEGER_CST@1) @2)
>> +       (cmp:cs (pointer_plus:s @2 @1) @0))
>> +   (if (!flag_trapv && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>
> no need to check flag_trapv, pointer arithmetic is not covered by -ftrapv.

This was there because we're converting pointer arithmetic into integer
arithmetic, and -ftrapv could cause that new integer code to trap.
TYPE_OVERFLOW_UNDEFINED says that pointer overflow is undefined,
but it seemed bad form to make it actually trap, especially when the
above does some reassociation too.

>> +    /* Convert the B form to the A form.  */
>> + (with { offset_int off = wi::to_offset (@1) - (cmp == LE_EXPR ? 1 :
>> 0); }
>> +     /* Always fails for negative values.  */
>> + (if (wi::min_precision (off, UNSIGNED) * 2 <= TYPE_PRECISION
>> (sizetype))
>> +      /* It doesn't matter whether we use @2 - @0 or @0 - @2, so let
>> +        tree_swap_operands_p pick a canonical order.  */
>
> gimple_resimplify takes care of that - well, it doesn't since minus isn't
> commutative...  I guess you get better CSE later when doing this thus ok,
> but it does look a bit off here ;)
>
> I think you shouldn't use 'sizetype' without guarding this whole thing
> with TYPE_PRECISION (sizetype) == TYPE_PRECISION (TREE_TYPE (@0)).

OK, hadn't realised that could be false.  Would building the appropriate
unsigned type be OK without the condition, or does it need to be sizetype?

> Since the original IL performs an ordered compare of two eventually unrelated
> pointers (or pointers adjusted in a way that crosses object
> boundaries) (uh... ;))
> I wonder if you can use POINTER_DIFF_EXPR here to avoid the sizetype
> conversion?  Since POINTER_PLUS_EXPR offsets are supposed to be
> interpreted "signed" that should also handle negative offsets correctly then?

The transform isn't valid when @1 is negative though, so I think we
need that check either way.

I could use POINTER_DIFF_EXPR and convert to unsigned though, if that
seems better.  That might also allow us to CSE the retained
POINTER_PLUS_EXPRs.

> Also, when @2 == @0 + (@1+1) then the original condition is true but
> ((sizetype) @0 - (sizetype) @2 + @1) > (@1 * 2) is not?
>    (sizetype) @0 - (sizetype) (@0 + @1 + 1) + @1 > @1 * 2
> -> -1 > @1 * 2
>
> which is false.  So I can't really see how you can apply this transform in
> general (it would be fine for generating alias checks but a bit more
> pessimistic).
>
> But maybe I am missing something?

It relies on sizetype being unsigned: (sizetype)-1 > @1 * 2 is true.

Thanks,
Richard

> Richard.
>
>> +      (with { tree ptr1 = @0, ptr2 = @2;
>> +             if (tree_swap_operands_p (ptr1, ptr2))
>> +               std::swap (ptr1, ptr2); }
>> +       (gt (plus (minus (convert:sizetype { ptr1; })
>> +                       (convert:sizetype { ptr2; }))
>> +                { wide_int_to_tree (sizetype, off); })
>> +          { wide_int_to_tree (sizetype, off * 2); }))))))))
>> Index: gcc/testsuite/gcc.dg/pointer-range-check-1.c
>> ===================================================================
>> --- /dev/null   2018-07-09 14:52:09.234750850 +0100
>> +++ gcc/testsuite/gcc.dg/pointer-range-check-1.c 2018-07-23
>> 15:58:33.480269844 +0100
>> @@ -0,0 +1,37 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fno-ipa-icf -fdump-tree-optimized" } */
>> +
>> +/* All four functions should be folded to:
>> +
>> +   ((sizetype) a - (sizetype) b + 15) < 30.  */
>> +
>> +_Bool
>> +f1 (char *a, char *b)
>> +{
>> +  return (a + 16 <= b) || (b + 16 <= a);
>> +}
>> +
>> +_Bool
>> +f2 (char *a, char *b)
>> +{
>> +  return (a + 15 < b) || (b + 15 < a);
>> +}
>> +
>> +_Bool
>> +f3 (char *a, char *b)
>> +{
>> +  return (a + 16 <= b) || (b + 16 <= a);
>> +}
>> +
>> +_Bool
>> +f4 (char *a, char *b)
>> +{
>> +  return (a + 15 < b) || (b + 15 < a);
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times { = [^\n]* - [^\n]*;} 4
>> "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { = [^\n]* \+ [^\n]*;} 4
>> "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { = [^\n]*\ > [^\n]*;} 4
>> "optimized" } } */
>> +/* { dg-final { scan-tree-dump-not {=[^\n]*\ < [^\n]*;} "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { \+ 15} 4 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { > 30} 4 "optimized" } } */
>> Index: gcc/testsuite/gcc.dg/pointer-range-check-2.c
>> ===================================================================
>> --- /dev/null   2018-07-09 14:52:09.234750850 +0100
>> +++ gcc/testsuite/gcc.dg/pointer-range-check-2.c 2018-07-23
>> 15:58:33.480269844 +0100
>> @@ -0,0 +1,31 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fno-ipa-icf -fwrapv-pointer
>> -fdump-tree-optimized" } */
>> +
>> +_Bool
>> +f1 (char *a, char *b)
>> +{
>> +  return (a + 16 <= b) || (b + 16 <= a);
>> +}
>> +
>> +_Bool
>> +f2 (char *a, char *b)
>> +{
>> +  return (a + 15 < b) || (b + 15 < a);
>> +}
>> +
>> +_Bool
>> +f3 (char *a, char *b)
>> +{
>> +  return (a + 16 <= b) || (b + 16 <= a);
>> +}
>> +
>> +_Bool
>> +f4 (char *a, char *b)
>> +{
>> +  return (a + 15 < b) || (b + 15 < a);
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-not { = [^\n]* - [^\n]*;} "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { = [^\n]* \+ [^\n]*;} 8
>> "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { \+ 15} 4 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { \+ 16} 4 "optimized" } } */

Reply via email to