On Mon, May 2, 2016 at 10:00 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <bin.ch...@arm.com> wrote: >>>> Hi, >>>> Tree if-conversion sometimes cannot convert conditional array reference >>>> into unconditional one. Root cause is GCC conservatively assumes newly >>>> introduced array reference could be out of array bound and thus trapping. >>>> This patch improves the situation by proving the converted unconditional >>>> array reference is within array bound using loop niter information. To be >>>> specific, it checks every index of array reference to see if it's within >>>> bound in ifcvt_memrefs_wont_trap. This patch also factors out >>>> base_object_writable checking if the base object is writable or not. >>>> Bootstrap and test on x86_64 and aarch64, is it OK? >>> >>> I think you miss to handle the case optimally where the only >>> non-ARRAY_REF idx is the dereference of the >>> base-pointer for, say, p->a[i]. In this case we can use >>> base_master_dr to see if p is unconditionally dereferenced >> Yes, will pick up this case. >> >>> in the loop. You also fail to handle the case where we have >>> MEM_REF[&x].a[i] that is, you see a decl base. >> I am having difficulty in creating this case for ifcvt, any advices? Thanks. > > Sth like > > float a[128]; > float foo (int n, int i) > { > return (*((float(*)[n])a))[i]; > } > > should do the trick (w/o the component-ref). Any other type-punning > would do it, too. > >>> I suppose for_each_index should be fixed for this particular case (to >>> return true), same for TARGET_MEM_REF TMR_BASE. >>> >>> + /* The case of nonconstant bounds could be handled, but it would be >>> + complicated. */ >>> + if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low) >>> + || !high || TREE_CODE (high) != INTEGER_CST) >>> + return false; >>> + >>> >>> handling of a non-zero but constant low bound is important - otherwise >>> all this is a no-op for Fortran. It >>> shouldn't be too difficult to handle after all. In fact I think your >>> code does handle it correctly already. >>> >>> + if (!init || TREE_CODE (init) != INTEGER_CST >>> + || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step)) >>> + return false; >>> >>> step == 0 should be easy to handle as well, no? The index will simply >>> always be 'init' ... >>> >>> + /* In case the relevant bound of the array does not fit in type, or >>> + it does, but bound + step (in type) still belongs into the range of >>> the >>> + array, the index may wrap and still stay within the range of the array >>> + (consider e.g. if the array is indexed by the full range of >>> + unsigned char). >>> + >>> + To make things simpler, we require both bounds to fit into type, >>> although >>> + there are cases where this would not be strictly necessary. */ >>> + if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type)) >>> + return false; >>> + >>> + low = fold_convert (type, low); >>> >>> please use wide_int for all of this. >> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not >> sure what's the meaning by "handle "low = fold_convert (type, low);" >> related code in wide_int". Do you mean to use tree_int_cst_compare >> instead of tree_int_cst_compare in the following code? > > I don't think you need any kind of fits-to-type check here. You'd simply > use to_widest () when operating on / comparing with high/low. But what would happen if low/high and init/step are different in type sign-ness? Anything special I need to do before using wi::ltu_p or wi::lts_p directly?
Thanks, bin > > And no, I mean to do it all with widest_ints. > >>> >>> I wonder if we can do sth for wrapping IVs like >>> >>> int a[2048]; >>> >>> for (int i = 0; i < 4096; ++i) >>> ... a[(unsigned char)i]; >>> >>> as well. Like if the IVs type max and min value are within the array bounds >>> simply return true? >> I think we can only do this for read. For write this is not safe. >> From vectorizer's point of view, is this worth handling? Could >> vectorizer handles wrapping IV in a smaller range than loop IV? > > Possibly, but dependence analysis might get confused. > > Richard. > >> Thanks, >> bin >>> >>> Thanks, >>> Richard. >>> >>>> Thanks, >>>> bin >>>> >>>> 2016-04-28 Bin Cheng <bin.ch...@arm.com> >>>> >>>> * tree-if-conv.c (tree-ssa-loop.h): Include header file. >>>> (tree-ssa-loop-niter.h): Ditto. >>>> (idx_within_array_bound, ref_within_array_bound): New functions. >>>> (ifcvt_memrefs_wont_trap): Check if array ref is within bound. >>>> Factor out check on writable base object to ... >>>> (base_object_writable): ... here.