On Tue, May 3, 2016 at 12:01 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Mon, May 2, 2016 at 10:00 AM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener >>> <richard.guent...@gmail.com> wrote: >>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <bin.ch...@arm.com> wrote: >>>>> Hi, >>>>> Tree if-conversion sometimes cannot convert conditional array reference >>>>> into unconditional one. Root cause is GCC conservatively assumes newly >>>>> introduced array reference could be out of array bound and thus trapping. >>>>> This patch improves the situation by proving the converted unconditional >>>>> array reference is within array bound using loop niter information. To >>>>> be specific, it checks every index of array reference to see if it's >>>>> within bound in ifcvt_memrefs_wont_trap. This patch also factors out >>>>> base_object_writable checking if the base object is writable or not. >>>>> Bootstrap and test on x86_64 and aarch64, is it OK? >>>> >>>> I think you miss to handle the case optimally where the only >>>> non-ARRAY_REF idx is the dereference of the >>>> base-pointer for, say, p->a[i]. In this case we can use >>>> base_master_dr to see if p is unconditionally dereferenced >>> Yes, will pick up this case. >>> >>>> in the loop. You also fail to handle the case where we have >>>> MEM_REF[&x].a[i] that is, you see a decl base. >>> I am having difficulty in creating this case for ifcvt, any advices? >>> Thanks. >> >> Sth like >> >> float a[128]; >> float foo (int n, int i) >> { >> return (*((float(*)[n])a))[i]; >> } >> >> should do the trick (w/o the component-ref). Any other type-punning >> would do it, too. >> >>>> I suppose for_each_index should be fixed for this particular case (to >>>> return true), same for TARGET_MEM_REF TMR_BASE. >>>> >>>> + /* The case of nonconstant bounds could be handled, but it would be >>>> + complicated. */ >>>> + if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low) >>>> + || !high || TREE_CODE (high) != INTEGER_CST) >>>> + return false; >>>> + >>>> >>>> handling of a non-zero but constant low bound is important - otherwise >>>> all this is a no-op for Fortran. It >>>> shouldn't be too difficult to handle after all. In fact I think your >>>> code does handle it correctly already. >>>> >>>> + if (!init || TREE_CODE (init) != INTEGER_CST >>>> + || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step)) >>>> + return false; >>>> >>>> step == 0 should be easy to handle as well, no? The index will simply >>>> always be 'init' ... >>>> >>>> + /* In case the relevant bound of the array does not fit in type, or >>>> + it does, but bound + step (in type) still belongs into the range of >>>> the >>>> + array, the index may wrap and still stay within the range of the >>>> array >>>> + (consider e.g. if the array is indexed by the full range of >>>> + unsigned char). >>>> + >>>> + To make things simpler, we require both bounds to fit into type, >>>> although >>>> + there are cases where this would not be strictly necessary. */ >>>> + if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type)) >>>> + return false; >>>> + >>>> + low = fold_convert (type, low); >>>> >>>> please use wide_int for all of this. >>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not >>> sure what's the meaning by "handle "low = fold_convert (type, low);" >>> related code in wide_int". Do you mean to use tree_int_cst_compare >>> instead of tree_int_cst_compare in the following code? >> >> I don't think you need any kind of fits-to-type check here. You'd simply >> use to_widest () when operating on / comparing with high/low. > But what would happen if low/high and init/step are different in type > sign-ness? Anything special I need to do before using wi::ltu_p or > wi::lts_p directly?
You want to use to_widest (min) which extends according to sign to an "infinite" precision signed integer. So you can then use the new operator< overloads as well. Richard. > Thanks, > bin >> >> And no, I mean to do it all with widest_ints. >> >>>> >>>> I wonder if we can do sth for wrapping IVs like >>>> >>>> int a[2048]; >>>> >>>> for (int i = 0; i < 4096; ++i) >>>> ... a[(unsigned char)i]; >>>> >>>> as well. Like if the IVs type max and min value are within the array >>>> bounds >>>> simply return true? >>> I think we can only do this for read. For write this is not safe. >>> From vectorizer's point of view, is this worth handling? Could >>> vectorizer handles wrapping IV in a smaller range than loop IV? >> >> Possibly, but dependence analysis might get confused. >> >> Richard. >> >>> Thanks, >>> bin >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> Thanks, >>>>> bin >>>>> >>>>> 2016-04-28 Bin Cheng <bin.ch...@arm.com> >>>>> >>>>> * tree-if-conv.c (tree-ssa-loop.h): Include header file. >>>>> (tree-ssa-loop-niter.h): Ditto. >>>>> (idx_within_array_bound, ref_within_array_bound): New functions. >>>>> (ifcvt_memrefs_wont_trap): Check if array ref is within bound. >>>>> Factor out check on writable base object to ... >>>>> (base_object_writable): ... here.