On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <bin.ch...@arm.com> wrote: >>> Hi, >>> Tree if-conversion sometimes cannot convert conditional array reference >>> into unconditional one. Root cause is GCC conservatively assumes newly >>> introduced array reference could be out of array bound and thus trapping. >>> This patch improves the situation by proving the converted unconditional >>> array reference is within array bound using loop niter information. To be >>> specific, it checks every index of array reference to see if it's within >>> bound in ifcvt_memrefs_wont_trap. This patch also factors out >>> base_object_writable checking if the base object is writable or not. >>> Bootstrap and test on x86_64 and aarch64, is it OK? >> >> I think you miss to handle the case optimally where the only >> non-ARRAY_REF idx is the dereference of the >> base-pointer for, say, p->a[i]. In this case we can use >> base_master_dr to see if p is unconditionally dereferenced > Yes, will pick up this case. > >> in the loop. You also fail to handle the case where we have >> MEM_REF[&x].a[i] that is, you see a decl base. > I am having difficulty in creating this case for ifcvt, any advices? Thanks.
Sth like float a[128]; float foo (int n, int i) { return (*((float(*)[n])a))[i]; } should do the trick (w/o the component-ref). Any other type-punning would do it, too. >> I suppose for_each_index should be fixed for this particular case (to >> return true), same for TARGET_MEM_REF TMR_BASE. >> >> + /* The case of nonconstant bounds could be handled, but it would be >> + complicated. */ >> + if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low) >> + || !high || TREE_CODE (high) != INTEGER_CST) >> + return false; >> + >> >> handling of a non-zero but constant low bound is important - otherwise >> all this is a no-op for Fortran. It >> shouldn't be too difficult to handle after all. In fact I think your >> code does handle it correctly already. >> >> + if (!init || TREE_CODE (init) != INTEGER_CST >> + || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step)) >> + return false; >> >> step == 0 should be easy to handle as well, no? The index will simply >> always be 'init' ... >> >> + /* In case the relevant bound of the array does not fit in type, or >> + it does, but bound + step (in type) still belongs into the range of the >> + array, the index may wrap and still stay within the range of the array >> + (consider e.g. if the array is indexed by the full range of >> + unsigned char). >> + >> + To make things simpler, we require both bounds to fit into type, >> although >> + there are cases where this would not be strictly necessary. */ >> + if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type)) >> + return false; >> + >> + low = fold_convert (type, low); >> >> please use wide_int for all of this. > Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not > sure what's the meaning by "handle "low = fold_convert (type, low);" > related code in wide_int". Do you mean to use tree_int_cst_compare > instead of tree_int_cst_compare in the following code? I don't think you need any kind of fits-to-type check here. You'd simply use to_widest () when operating on / comparing with high/low. And no, I mean to do it all with widest_ints. >> >> I wonder if we can do sth for wrapping IVs like >> >> int a[2048]; >> >> for (int i = 0; i < 4096; ++i) >> ... a[(unsigned char)i]; >> >> as well. Like if the IVs type max and min value are within the array bounds >> simply return true? > I think we can only do this for read. For write this is not safe. > From vectorizer's point of view, is this worth handling? Could > vectorizer handles wrapping IV in a smaller range than loop IV? Possibly, but dependence analysis might get confused. Richard. > Thanks, > bin >> >> Thanks, >> Richard. >> >>> Thanks, >>> bin >>> >>> 2016-04-28 Bin Cheng <bin.ch...@arm.com> >>> >>> * tree-if-conv.c (tree-ssa-loop.h): Include header file. >>> (tree-ssa-loop-niter.h): Ditto. >>> (idx_within_array_bound, ref_within_array_bound): New functions. >>> (ifcvt_memrefs_wont_trap): Check if array ref is within bound. >>> Factor out check on writable base object to ... >>> (base_object_writable): ... here.