On Fri, May 6, 2016 at 11:42 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Fri, May 6, 2016 at 10:40 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: >> On Tue, May 3, 2016 at 11:08 AM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Tue, May 3, 2016 at 12:01 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>>> On Mon, May 2, 2016 at 10:00 AM, Richard Biener >>>> <richard.guent...@gmail.com> wrote: >>>>> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.ch...@gmail.com> wrote: >>>>>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener >>>>>> <richard.guent...@gmail.com> wrote: >>>>>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <bin.ch...@arm.com> wrote: >>>>>>>> Hi, >>>>>>>> Tree if-conversion sometimes cannot convert conditional array >>>>>>>> reference into unconditional one. Root cause is GCC conservatively >>>>>>>> assumes newly introduced array reference could be out of array bound >>>>>>>> and thus trapping. This patch improves the situation by proving the >>>>>>>> converted unconditional array reference is within array bound using >>>>>>>> loop niter information. To be specific, it checks every index of >>>>>>>> array reference to see if it's within bound in >>>>>>>> ifcvt_memrefs_wont_trap. This patch also factors out >>>>>>>> base_object_writable checking if the base object is writable or not. >>>>>>>> Bootstrap and test on x86_64 and aarch64, is it OK? >>>>>>> >>>>>>> I think you miss to handle the case optimally where the only >>>>>>> non-ARRAY_REF idx is the dereference of the >>>>>>> base-pointer for, say, p->a[i]. In this case we can use >>>>>>> base_master_dr to see if p is unconditionally dereferenced >>>>>> Yes, will pick up this case. >>>>>> >>>>>>> in the loop. You also fail to handle the case where we have >>>>>>> MEM_REF[&x].a[i] that is, you see a decl base. >>>>>> I am having difficulty in creating this case for ifcvt, any advices? >>>>>> Thanks. >>>>> >>>>> Sth like >>>>> >>>>> float a[128]; >>>>> float foo (int n, int i) >>>>> { >>>>> return (*((float(*)[n])a))[i]; >>>>> } >>>>> >>>>> should do the trick (w/o the component-ref). Any other type-punning >>>>> would do it, too. >>>>> >>>>>>> I suppose for_each_index should be fixed for this particular case (to >>>>>>> return true), same for TARGET_MEM_REF TMR_BASE. >>>>>>> >>>>>>> + /* The case of nonconstant bounds could be handled, but it would be >>>>>>> + complicated. */ >>>>>>> + if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low) >>>>>>> + || !high || TREE_CODE (high) != INTEGER_CST) >>>>>>> + return false; >>>>>>> + >>>>>>> >>>>>>> handling of a non-zero but constant low bound is important - otherwise >>>>>>> all this is a no-op for Fortran. It >>>>>>> shouldn't be too difficult to handle after all. In fact I think your >>>>>>> code does handle it correctly already. >>>>>>> >>>>>>> + if (!init || TREE_CODE (init) != INTEGER_CST >>>>>>> + || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop >>>>>>> (step)) >>>>>>> + return false; >>>>>>> >>>>>>> step == 0 should be easy to handle as well, no? The index will simply >>>>>>> always be 'init' ... >>>>>>> >>>>>>> + /* In case the relevant bound of the array does not fit in type, or >>>>>>> + it does, but bound + step (in type) still belongs into the range >>>>>>> of the >>>>>>> + array, the index may wrap and still stay within the range of the >>>>>>> array >>>>>>> + (consider e.g. if the array is indexed by the full range of >>>>>>> + unsigned char). >>>>>>> + >>>>>>> + To make things simpler, we require both bounds to fit into type, >>>>>>> although >>>>>>> + there are cases where this would not be strictly necessary. */ >>>>>>> + if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type)) >>>>>>> + return false; >>>>>>> + >>>>>>> + low = fold_convert (type, low); >>>>>>> >>>>>>> please use wide_int for all of this. >>>>>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not >>>>>> sure what's the meaning by "handle "low = fold_convert (type, low);" >>>>>> related code in wide_int". Do you mean to use tree_int_cst_compare >>>>>> instead of tree_int_cst_compare in the following code? >>>>> >>>>> I don't think you need any kind of fits-to-type check here. You'd simply >>>>> use to_widest () when operating on / comparing with high/low. >>>> But what would happen if low/high and init/step are different in type >>>> sign-ness? Anything special I need to do before using wi::ltu_p or >>>> wi::lts_p directly? >>> >>> You want to use to_widest (min) which extends according to sign to >>> an "infinite" precision signed integer. So you can then use the new >>> operator< overloads as well. >>> >> Hi, >> Here is the updated patch. It includes below changes according to >> review comments: >> >> 1) It uses widest_int for all INTEGER_CST tree computations, which >> simplifies the patch a lot. >> 2) It covers array with non-zero low bound, which is important for Fortran. >> 3) It picks up a boundary case so that ifc-11.c/vect-23.c/vect-24.c >> can be handled. >> 4) It also checks within bound array reference inside a structure like >> p->a[i] by using base_master_dr in tree-if-conv.c so that ifc-12.c can >> be handled. >> >> It leaves two other review comments not addressed: >> 1) It doesn't handle array reference whose idx is a wrapping SCEV. >> Because only read is safe and vectorizer itself may be confused by it >> now. >> 2) It doesn't handle array reference in the form of >> "MEM[(float[0:(sizetype) ((long int) SAVE_EXPR <m.2> + -1)] >> *)&b][i_1];". To handle this case, existing code in >> array_at_struct_end_p as well as this patch need to be improved. I >> tend to handle it in an independent patch. >> >> With these changes, now cases pr61194.c and vect-23.c can be >> vectorized, I removed XFAIL from them. Also vect-mask-store-move-1.c >> is affected and not vectorized now, this is tricky and it exposes a >> known bug PR65206 in vectorizer's dependence analyzer. This should be >> handled independently too. Also gcc.dg/vect/vect-24.c now is XPASS on >> AArch64, but not x86_64. Root cause is dom2 jump threads first >> iteration of loop thus idx_within_array_bound is failed. I didn't >> check if jump threading is good in this case, either I remove the >> XFAIL mark. I tend to improve idx_within_array_bound by using VRP >> information in a following patch. > > Hmm, I accidentally included removal of XFAIL for > gcc.dg/vect/vect-24.c in the patch. Anyway, it can be included or > excluded wrto reviewers opinion. > > Thanks, > bin >> Otherwise bootstrap and test on x86_64 and AArch64 are fine. Is this >> version OK?
Ok. Adjusting gcc.dg/vect/vect-24.c is your choice, if things change again with a followup you might want to defer the change to that. Thanks, Richard. >> Thanks, >> bin >> >> 2016-05-04 Bin Cheng <bin.ch...@arm.com> >> >> * tree-if-conv.c (tree-ssa-loop.h): Include header file. >> (tree-ssa-loop-niter.h): Ditto. >> (idx_within_array_bound, ref_within_array_bound): New functions. >> (ifcvt_memrefs_wont_trap): Check if array ref is within bound. >> Factor out check on writable base object to ... >> (base_object_writable): ... here. >> >> gcc/testsuite/ChangeLog >> 2016-05-04 Bin Cheng <bin.ch...@arm.com> >> >> * gcc.dg/tree-ssa/ifc-9.c: New test. >> * gcc.dg/tree-ssa/ifc-10.c: New test. >> * gcc.dg/tree-ssa/ifc-11.c: New test. >> * gcc.dg/tree-ssa/ifc-12.c: New test. >> * gcc.dg/vect/pr61194.c: Remove XFAIL. >> * gcc.dg/vect/vect-23.c: Remove XFAIL. >> * gcc.dg/vect/vect-mask-store-move-1.c: Revise test check.