On Tue, May 3, 2016 at 12:01 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Mon, May 2, 2016 at 10:00 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener
>>> <richard.guent...@gmail.com> wrote:
>>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>>>>> Hi,
>>>>> Tree if-conversion sometimes cannot convert conditional array reference 
>>>>> into unconditional one.  Root cause is GCC conservatively assumes newly 
>>>>> introduced array reference could be out of array bound and thus trapping. 
>>>>>  This patch improves the situation by proving the converted unconditional 
>>>>> array reference is within array bound using loop niter information.  To 
>>>>> be specific, it checks every index of array reference to see if it's 
>>>>> within bound in ifcvt_memrefs_wont_trap.  This patch also factors out 
>>>>> base_object_writable checking if the base object is writable or not.
>>>>> Bootstrap and test on x86_64 and aarch64, is it OK?
>>>>
>>>> I think you miss to handle the case optimally where the only
>>>> non-ARRAY_REF idx is the dereference of the
>>>> base-pointer for, say, p->a[i].  In this case we can use
>>>> base_master_dr to see if p is unconditionally dereferenced
>>> Yes, will pick up this case.
>>>
>>>> in the loop.  You also fail to handle the case where we have
>>>> MEM_REF[&x].a[i] that is, you see a decl base.
>>> I am having difficulty in creating this case for ifcvt, any advices?  
>>> Thanks.
>>
>> Sth like
>>
>> float a[128];
>> float foo (int n, int i)
>> {
>>   return (*((float(*)[n])a))[i];
>> }
>>
>> should do the trick (w/o the component-ref).  Any other type-punning
>> would do it, too.
>>
>>>> I suppose for_each_index should be fixed for this particular case (to
>>>> return true), same for TARGET_MEM_REF TMR_BASE.
>>>>
>>>> +  /* The case of nonconstant bounds could be handled, but it would be
>>>> +     complicated.  */
>>>> +  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
>>>> +      || !high || TREE_CODE (high) != INTEGER_CST)
>>>> +    return false;
>>>> +
>>>>
>>>> handling of a non-zero but constant low bound is important - otherwise
>>>> all this is a no-op for Fortran.  It
>>>> shouldn't be too difficult to handle after all.  In fact I think your
>>>> code does handle it correctly already.
>>>>
>>>> +  if (!init || TREE_CODE (init) != INTEGER_CST
>>>> +      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
>>>> +    return false;
>>>>
>>>> step == 0 should be easy to handle as well, no?  The index will simply
>>>> always be 'init' ...
>>>>
>>>> +  /* In case the relevant bound of the array does not fit in type, or
>>>> +     it does, but bound + step (in type) still belongs into the range of 
>>>> the
>>>> +     array, the index may wrap and still stay within the range of the 
>>>> array
>>>> +     (consider e.g. if the array is indexed by the full range of
>>>> +     unsigned char).
>>>> +
>>>> +     To make things simpler, we require both bounds to fit into type, 
>>>> although
>>>> +     there are cases where this would not be strictly necessary.  */
>>>> +  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
>>>> +    return false;
>>>> +
>>>> +  low = fold_convert (type, low);
>>>>
>>>> please use wide_int for all of this.
>>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not
>>> sure what's the meaning by "handle "low = fold_convert (type, low);"
>>> related code in wide_int".   Do you mean to use tree_int_cst_compare
>>> instead of tree_int_cst_compare in the following code?
>>
>> I don't think you need any kind of fits-to-type check here.  You'd simply
>> use to_widest () when operating on / comparing with high/low.
> But what would happen if low/high and init/step are different in type
> sign-ness?  Anything special I need to do before using wi::ltu_p or
> wi::lts_p directly?

You want to use to_widest (min) which extends according to sign to
an "infinite" precision signed integer.  So you can then use the new
operator< overloads as well.

Richard.

> Thanks,
> bin
>>
>> And no, I mean to do it all with widest_ints.
>>
>>>>
>>>> I wonder if we can do sth for wrapping IVs like
>>>>
>>>> int a[2048];
>>>>
>>>> for (int i = 0; i < 4096; ++i)
>>>>   ... a[(unsigned char)i];
>>>>
>>>> as well.  Like if the IVs type max and min value are within the array 
>>>> bounds
>>>> simply return true?
>>> I think we can only do this for read.  For write this is not safe.
>>> From vectorizer's point of view, is this worth handling?  Could
>>> vectorizer handles wrapping IV in a smaller range than loop IV?
>>
>> Possibly, but dependence analysis might get confused.
>>
>> Richard.
>>
>>> Thanks,
>>> bin
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> bin
>>>>>
>>>>> 2016-04-28  Bin Cheng  <bin.ch...@arm.com>
>>>>>
>>>>>         * tree-if-conv.c (tree-ssa-loop.h): Include header file.
>>>>>         (tree-ssa-loop-niter.h): Ditto.
>>>>>         (idx_within_array_bound, ref_within_array_bound): New functions.
>>>>>         (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
>>>>>         Factor out check on writable base object to ...
>>>>>         (base_object_writable): ... here.

Reply via email to