Re: [RFTesting] New POINTER_DIFF_EXPR

Richard Biener Sat, 18 Nov 2017 03:49:18 -0800

On November 18, 2017 11:20:42 AM GMT+01:00, Marc Glisse <marc.gli...@inria.fr> 
wrote:
>On Fri, 17 Nov 2017, Richard Biener wrote:
>
>> On Sat, Nov 11, 2017 at 12:44 AM, Marc Glisse <marc.gli...@inria.fr>
>wrote:
>>
>>> The exact undefined-behavior status should probably be clarified
>more.
>>> First, I'd like to say that POINTER_DIFF_EXPR may only take 2
>pointers into
>>> the same "object" (like in C/C++), so they differ by at most half
>the size
>>> of the address space, and a-b is not allowed to be the minimum
>negative
>>> value (so I can safely use b-a), and if we compute a-b and b-c, then
>a and c
>>> are in the same object so I can safely compute a-c, etc. Second, we
>probably
>>> need modes without undefined behavior, wrapping with either -fwrapv
>or a new
>>> -fwrapp, or sanitized. For the sanitized version, we could keep
>using
>>> POINTER_DIFF_EXPR and check TYPE_OVERFLOW_SANITIZED on the pointer
>type as
>>> we currently do for integers. For wrapping, either we say that
>>> POINTER_DIFF_EXPR has wrapping semantics when that option is in
>effect, or
>>> we do not use POINTER_DIFF_EXPR and instead cast to unsigned before
>applying
>>> a MINUS_EXPR (my preference).
>>
>> CCing C/C++ FE folks as well.
>>
>> +      /* It is better if the caller provides the return type.  */
>> +      if (code == POINTER_DIFF_EXPR)
>> +       {
>> +         offset_int res = wi::sub (wi::to_offset (arg1),
>> +                                   wi::to_offset (arg2));
>> +         return force_fit_type (signed_type_for (TREE_TYPE (arg1)),
>res, 1,
>> +                                TREE_OVERFLOW (arg1) | TREE_OVERFLOW
>(arg2));
>> +       }
>> +
>>
>> there's an overload that provides the type... (we should probably go
>> over all callers
>> and use that).  There is already a folding that is only done when the
>type is
>> available.  So I'd simply remove the above from the non-type overload
>handling.
>> What breaks then?
>
>I haven't checked yet. On the other hand, now that I require the same 
>precision for pointers and their difference, using signed_type_for
>should 
>be safe enough (only issue is if a front-end is unhappy that we changed
>
>the exact type for another 'equivalent' one). But I'll try removing the
>
>version with signed_type_for.
>
>> With the patch it's of course somewhat ugly in that we need to deal
>with both
>> MINUS_EXPR on pointers and POINTER_DIFF_EXPR - at least that's what
>>
>> +    case POINTER_DIFF_EXPR:
>>     case MINUS_EXPR:
>> +      /* Fold &a[i] - &a[j] to i-j.  */
>> +      if (TREE_CODE (arg0) == ADDR_EXPR
>> +         && TREE_CODE (TREE_OPERAND (arg0, 0)) == ARRAY_REF
>> +         && TREE_CODE (arg1) == ADDR_EXPR
>> +         && TREE_CODE (TREE_OPERAND (arg1, 0)) == ARRAY_REF)
>> +        {
>> +         tree tem = fold_addr_of_array_ref_difference (loc, type,
>> +                                                       TREE_OPERAND
>(arg0, 0),
>> +                                                       TREE_OPERAND
>(arg1, 0),
>> +                                                       code
>> +                                                       ==
>POINTER_DIFF_EXPR);
>>
>> suggests.  But is that really so?  The FEs today should never build
>> a MINUS_EXPR with pointer operands and your patch also doesn't.
>> I suppose we only arrive above because STRIP_NOPS strips the
>> pointer to integer conversion?
>
>I think so. And since I only patched 2 front-ends, others probably
>still 
>generate casts to integers and MINUS_EXPR and can benefit from this 
>transformation. Also, if we implement -fwrapp by casting to unsigned
>and 
>doing MINUS_EXPR, that will remain useful.
>
>> +    case POINTER_DIFF_EXPR:
>> +      if (!POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 0)))
>> +         || !POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (t, 1))))
>> +       {
>> +         error ("invalid operand to pointer diff, operand is not a
>pointer");
>> +         return t;
>> +       }
>> +      CHECK_OP (0, "invalid operand to pointer diff");
>> +      CHECK_OP (1, "invalid operand to pointer diff");
>> +      break;
>>
>> can you add
>>  || TREE_CODE (TREE_TYPE (t)) != INTEGER_TYPE
>>  || TYPE_UNSIGNED (TREE_TYPE (t))
>
>I wasn't sure how much to duplicate between here and below. Added.
>
>> ?  Note that if the precision of the pointer type arguments are not
>equal to the
>> precision of the return value then foldings like
>>
>>   (simplify
>> +   (plus:c (pointer_diff @0 @1) (pointer_diff @2 @0))
>> +   (if (TYPE_OVERFLOW_UNDEFINED (type)
>> +       && !TYPE_OVERFLOW_SANITIZED (TREE_TYPE (@0)))
>> +    (pointer_diff @2 @1)))
>>
>> might not be correct as we drop a possible truncation here.  Shall we
>
>I simplified the transformations after deciding to require equal 
>precision.
>
>> require (and document) that the result of the pointer difference in
>> a POINTER_DIFF_EXPR has to be representable in the type of the
>> expression, otherwise
>> the behavior is undefined?  Shall we allow an unsigned return type
>for
>> differences known to be representable?  Probably not...  Does the
>behavior
>> depend on TYPE_OVERFLOW_UNDEFINED?  It would be nice to amend
>> the generic.texi docs somewhat here.
>
>I was more precise in tree.def, copying details over to generic.texi
>now.
>But the overflow behavior is something I was asking about above (see
>citation at the beginning of this email). I am tempted to say that
>overflow
>is always undefined for POINTER_DIFF_EXPR and we should use MINUS_EXPR
>on
>unsigned integers when we want wrapping, but if people prefer to use
>POINTER_DIFF_EXPR in all cases and change its behavior depending on
>TYPE_OVERFLOW_UNDEFINED that would be ok with me. I probably wasn't
>consistent in match.pd and will need to tweak a few details once this
>is
>decided.


Ah. Indeed using unsigned MINUS_EXPR sounds like a good idea - also for 
canonicalization. Likewise for POUNTER_PLUS_EXPR vs. PLUS_EXPR. 

>> Ah, the gimple checking adds some more bits here that clarify things:
>>
>> +    case POINTER_DIFF_EXPR:
>> +      {
>> +       if (!POINTER_TYPE_P (rhs1_type)
>> +           || !POINTER_TYPE_P (rhs2_type)
>> +           || !types_compatible_p (rhs1_type, rhs2_type)
>> +           || TREE_CODE (lhs_type) != INTEGER_TYPE
>> +           || TYPE_UNSIGNED (lhs_type)
>> +           || TYPE_PRECISION (lhs_type) != TYPE_PRECISION
>(rhs1_type))
>> +         {
>> +           error ("type mismatch in pointer diff expression");
>> +           debug_generic_stmt (lhs_type);
>> +           debug_generic_stmt (rhs1_type);
>> +           debug_generic_stmt (rhs2_type);
>> +           return true;
>> +         }
>>
>>
>> I think the patch is ok for trunk, the FE bits should get approval
>> from their maintainers.
>> Joseph may have an idea about the address-space issue.
>>
>> With this in place we should be able to fix some of the issues Jakub
>ran into
>> with pointer sanitizing?
>
>Some of them should be fixed without doing anything: pointer
>differences
>are not using MINUS_EXPR anymore in C/C++, so they are not sanitized
>and
>don't produce spurious errors when crossing the middle of the address
>space. On the other hand, we may want to sanitize POINTER_DIFF_EXPR
>specifically.
>
>> Also with this in place it would be nice to do the corresponding
>adjustment to
>> POINTER_PLUS_EXPR, that is, require a _signed_ offset type that
>matches
>> the precision of the other argument.
>
>Maybe next stage1... Too late for this year.

Agreed. Bin had patches to do this IIRC. 

Richard.

Re: [RFTesting] New POINTER_DIFF_EXPR

Reply via email to