On Fri, May 6, 2016 at 10:40 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Tue, May 3, 2016 at 11:08 AM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Tue, May 3, 2016 at 12:01 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>> On Mon, May 2, 2016 at 10:00 AM, Richard Biener >>> <richard.guent...@gmail.com> wrote: >>>> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>>>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener >>>>> <richard.guent...@gmail.com> wrote: >>>>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <bin.ch...@arm.com> wrote: >>>>>>> Hi, >>>>>>> Tree if-conversion sometimes cannot convert conditional array reference >>>>>>> into unconditional one. Root cause is GCC conservatively assumes newly >>>>>>> introduced array reference could be out of array bound and thus >>>>>>> trapping. This patch improves the situation by proving the converted >>>>>>> unconditional array reference is within array bound using loop niter >>>>>>> information. To be specific, it checks every index of array reference >>>>>>> to see if it's within bound in ifcvt_memrefs_wont_trap. This patch >>>>>>> also factors out base_object_writable checking if the base object is >>>>>>> writable or not. >>>>>>> Bootstrap and test on x86_64 and aarch64, is it OK? >>>>>> >>>>>> I think you miss to handle the case optimally where the only >>>>>> non-ARRAY_REF idx is the dereference of the >>>>>> base-pointer for, say, p->a[i]. In this case we can use >>>>>> base_master_dr to see if p is unconditionally dereferenced >>>>> Yes, will pick up this case. >>>>> >>>>>> in the loop. You also fail to handle the case where we have >>>>>> MEM_REF[&x].a[i] that is, you see a decl base. >>>>> I am having difficulty in creating this case for ifcvt, any advices? >>>>> Thanks. >>>> >>>> Sth like >>>> >>>> float a[128]; >>>> float foo (int n, int i) >>>> { >>>> return (*((float(*)[n])a))[i]; >>>> } >>>> >>>> should do the trick (w/o the component-ref). Any other type-punning >>>> would do it, too. >>>> >>>>>> I suppose for_each_index should be fixed for this particular case (to >>>>>> return true), same for TARGET_MEM_REF TMR_BASE. >>>>>> >>>>>> + /* The case of nonconstant bounds could be handled, but it would be >>>>>> + complicated. */ >>>>>> + if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low) >>>>>> + || !high || TREE_CODE (high) != INTEGER_CST) >>>>>> + return false; >>>>>> + >>>>>> >>>>>> handling of a non-zero but constant low bound is important - otherwise >>>>>> all this is a no-op for Fortran. It >>>>>> shouldn't be too difficult to handle after all. In fact I think your >>>>>> code does handle it correctly already. >>>>>> >>>>>> + if (!init || TREE_CODE (init) != INTEGER_CST >>>>>> + || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop >>>>>> (step)) >>>>>> + return false; >>>>>> >>>>>> step == 0 should be easy to handle as well, no? The index will simply >>>>>> always be 'init' ... >>>>>> >>>>>> + /* In case the relevant bound of the array does not fit in type, or >>>>>> + it does, but bound + step (in type) still belongs into the range >>>>>> of the >>>>>> + array, the index may wrap and still stay within the range of the >>>>>> array >>>>>> + (consider e.g. if the array is indexed by the full range of >>>>>> + unsigned char). >>>>>> + >>>>>> + To make things simpler, we require both bounds to fit into type, >>>>>> although >>>>>> + there are cases where this would not be strictly necessary. */ >>>>>> + if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type)) >>>>>> + return false; >>>>>> + >>>>>> + low = fold_convert (type, low); >>>>>> >>>>>> please use wide_int for all of this. >>>>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not >>>>> sure what's the meaning by "handle "low = fold_convert (type, low);" >>>>> related code in wide_int". Do you mean to use tree_int_cst_compare >>>>> instead of tree_int_cst_compare in the following code? >>>> >>>> I don't think you need any kind of fits-to-type check here. You'd simply >>>> use to_widest () when operating on / comparing with high/low. >>> But what would happen if low/high and init/step are different in type >>> sign-ness? Anything special I need to do before using wi::ltu_p or >>> wi::lts_p directly? >> >> You want to use to_widest (min) which extends according to sign to >> an "infinite" precision signed integer. So you can then use the new >> operator< overloads as well. >> > Hi, > Here is the updated patch. It includes below changes according to > review comments: > > 1) It uses widest_int for all INTEGER_CST tree computations, which > simplifies the patch a lot. > 2) It covers array with non-zero low bound, which is important for Fortran. > 3) It picks up a boundary case so that ifc-11.c/vect-23.c/vect-24.c > can be handled. > 4) It also checks within bound array reference inside a structure like > p->a[i] by using base_master_dr in tree-if-conv.c so that ifc-12.c can > be handled. > > It leaves two other review comments not addressed: > 1) It doesn't handle array reference whose idx is a wrapping SCEV. > Because only read is safe and vectorizer itself may be confused by it > now. > 2) It doesn't handle array reference in the form of > "MEM[(float[0:(sizetype) ((long int) SAVE_EXPR <m.2> + -1)] > *)&b][i_1];". To handle this case, existing code in > array_at_struct_end_p as well as this patch need to be improved. I > tend to handle it in an independent patch. > > With these changes, now cases pr61194.c and vect-23.c can be > vectorized, I removed XFAIL from them. Also vect-mask-store-move-1.c > is affected and not vectorized now, this is tricky and it exposes a > known bug PR65206 in vectorizer's dependence analyzer. This should be > handled independently too. Also gcc.dg/vect/vect-24.c now is XPASS on > AArch64, but not x86_64. Root cause is dom2 jump threads first > iteration of loop thus idx_within_array_bound is failed. I didn't > check if jump threading is good in this case, either I remove the > XFAIL mark. I tend to improve idx_within_array_bound by using VRP > information in a following patch.
Hmm, I accidentally included removal of XFAIL for gcc.dg/vect/vect-24.c in the patch. Anyway, it can be included or excluded wrto reviewers opinion. Thanks, bin > Otherwise bootstrap and test on x86_64 and AArch64 are fine. Is this > version OK? > > Thanks, > bin > > 2016-05-04 Bin Cheng <bin.ch...@arm.com> > > * tree-if-conv.c (tree-ssa-loop.h): Include header file. > (tree-ssa-loop-niter.h): Ditto. > (idx_within_array_bound, ref_within_array_bound): New functions. > (ifcvt_memrefs_wont_trap): Check if array ref is within bound. > Factor out check on writable base object to ... > (base_object_writable): ... here. > > gcc/testsuite/ChangeLog > 2016-05-04 Bin Cheng <bin.ch...@arm.com> > > * gcc.dg/tree-ssa/ifc-9.c: New test. > * gcc.dg/tree-ssa/ifc-10.c: New test. > * gcc.dg/tree-ssa/ifc-11.c: New test. > * gcc.dg/tree-ssa/ifc-12.c: New test. > * gcc.dg/vect/pr61194.c: Remove XFAIL. > * gcc.dg/vect/vect-23.c: Remove XFAIL. > * gcc.dg/vect/vect-mask-store-move-1.c: Revise test check.