On Sun, May 24, 2015 at 8:47 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Fri, May 22, 2015 at 7:45 PM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Wed, May 20, 2015 at 11:41 AM, Bin Cheng <bin.ch...@arm.com> wrote: >>> Hi, >>> As we know, GCC is too conservative when checking overflow behavior in SCEV >>> and loop related optimizers. Result is some variable can't be recognized as >>> scalar evolution and thus optimizations are missed. To be specific, >>> optimizers like ivopts and vectorizer are affected. >>> This issue is more severe on 64 bit platforms, for example, PR62173 is >>> failed on aarch64; scev-3.c and scev-4.c were marked as XFAIL on lp64 >>> platforms. >>> >>> As the first part to improve overflow checking in GCC, this patch does below >>> improvements: >>> 1) Ideally, chrec_convert should be responsible to convert scev like >>> "(type){base, step}" to scev like "{(type)base, (type)step} when the result >>> scev doesn't overflow; chrec_convert_aggressive should do the conversion if >>> the result scev could overflow/wrap. Unfortunately, current implementation >>> may use chrec_convert_aggressive to return a scev that won't overflow. This >>> is because of a) the static parameter "fold_conversions" for >>> instantiate_scev_convert can only tracks whether chrec_convert_aggressive >>> may be called, rather than if it does some overflow conversion or not; b) >>> the implementation of instantiate_scev_convert sometimes shortcuts the call >>> to chrec_convert and misses conversion opportunities. This patch improves >>> this. >>> 2) iv->no_overflow computed in simple_iv is too conservative. With 1) >>> fixed, iv->no_overflow should reflects whether chrec_convert_aggressive does >>> return an overflow scev. This patch improves this. >>> 3) chrec_convert should be able to prove the resulting scev won't overflow >>> with loop niter information. This patch doesn't finish this, but it >>> factored a new interface out of scev_probably_wraps_p for future >>> improvement. And that will be the part II patch. >>> >>> With the improvements in SCEV, this patch also improves optimizer(IVOPT) >>> that uses scev information like below: >>> For array reference in the form of arr[IV], GCC tries to derive new >>> address iv {arr+iv.base, iv.step*elem_size} from IV. If IV overflow wrto a >>> type that is narrower than address space, this derivation is not true >>> because &arr[IV] isn't a scev. Root cause why scev-*.c are failed now is >>> the overflow information of IV is too conservative. IVOPT has to be >>> conservative to reject &arr[IV] as a scev. With more accurate overflow >>> information, IVOPT can be improved too. So this patch fixes the mentioned >>> long standing issues. >>> >>> Bootstrap and test on x86_64, x86 and aarch64. >>> BTW, test gcc.target/i386/pr49781-1.c failed on x86_64, but I can confirmed >>> it's not this patch's fault. >>> >>> So what's your opinion on this?. >> >> I maybe mixing things up but does >> >> +chrec_convert_aggressive (tree type, tree chrec, bool *fold_conversions) >> { >> ... >> + if (evolution_function_is_affine_p (chrec)) >> + { >> + tree base, step; >> + struct loop *loop; >> + >> + loop = get_chrec_loop (chrec); >> + base = CHREC_LEFT (chrec); >> + step = CHREC_RIGHT (chrec); >> + if (convert_affine_scev (loop, type, &base, &step, NULL, true)) >> + return build_polynomial_chrec (loop->num, base, step); >> >> ^^^ not forget to set *fold_conversions to true? Or we need to use >> convert_affine_scev (..., false)? > > Nice catch. It's supposed to be called only if source scev has no > overflow behavior introduced by previous call to > chrec_convert_aggressive. In other words, it should be guarded by > "!*fold_conversions" like below: > > + > + if (!*fold_conversions && evolution_function_is_affine_p (chrec)) > + { > + tree base, step; > + struct loop *loop; > + > + loop = get_chrec_loop (chrec); > + base = CHREC_LEFT (chrec); > + step = CHREC_RIGHT (chrec); > + if (convert_affine_scev (loop, type, &base, &step, NULL, true)) > + return build_polynomial_chrec (loop->num, base, step); > + } > > The scenario is rare that didn't exposed in either bootstrap or reg-test. > > Here is the updated patch without any other difference. Bootstrap and > test on x86_64 & AArch64.
Ok. Thanks, Richard. > Thanks, > bin >> >> + } >> >> (bah, and the diff somehow messes up -p context :/ which is why I like >> context diffs more) >> >> Other from the above the patch looks good to me. >> >> Thanks, >> Richard. >> >>> Thanks, >>> bin >>> >>> 2015-05-20 Bin Cheng <bin.ch...@arm.com> >>> >>> PR tree-optimization/62173 >>> * tree-ssa-loop-ivopts.c (struct iv): New field. Reorder fields. >>> (alloc_iv, set_iv): New parameter. >>> (determine_biv_step): Delete. >>> (find_bivs): Inline original determine_biv_step. Pass new >>> argument to set_iv. >>> (idx_find_step): Use no_overflow information for conversion. >>> * tree-scalar-evolution.c (analyze_scalar_evolution_in_loop): Let >>> resolve_mixers handle folded_casts. >>> (instantiate_scev_name): Change bool parameter to bool pointer. >>> (instantiate_scev_poly, instantiate_scev_binary): Ditto. >>> (instantiate_array_ref, instantiate_scev_not): Ditto. >>> (instantiate_scev_3, instantiate_scev_2): Ditto. >>> (instantiate_scev_1, instantiate_scev_r): Ditto. >>> (instantiate_scev_convert, ): Change parameter. Pass argument >>> to chrec_convert_aggressive. >>> (instantiate_scev): Change argument. >>> (resolve_mixers): New parameter and set it. >>> (scev_const_prop): New argument. >>> * tree-scalar-evolution.h (resolve_mixers): New parameter. >>> * tree-chrec.c (convert_affine_scev): Call chrec_convert instead >>> of chrec_conert_1. >>> (chrec_convert): New parameter. Move definition below. >>> (chrec_convert_aggressive): New parameter and set it. Call >>> convert_affine_scev. >>> * tree-chrec.h (chrec_convert): New parameter. >>> (chrec_convert_aggressive): Ditto. >>> * tree-ssa-loop-niter.c (loop_exits_before_overflow): New function. >>> (scev_probably_wraps_p): Factor loop niter related code into >>> loop_exits_before_overflow. >>> >>> gcc/testsuite/ChangeLog >>> 2015-05-20 Bin Cheng <bin.ch...@arm.com> >>> >>> PR tree-optimization/62173 >>> * gcc.dg/tree-ssa/scev-3.c: Remove xfail. >>> * gcc.dg/tree-ssa/scev-4.c: Ditto. >>> * gcc.dg/tree-ssa/scev-8.c: New.