On Tue, Dec 8, 2015 at 7:56 PM, Steve Ellcey <sell...@imgtec.com> wrote:
> I have an ivopts optimization question/proposal.  When compiling the
> attached program the ivopts pass prefers the original ivs over new ivs
> and that causes us to generate less efficient code on MIPS.  It may
> affect other platforms too.
>
> The Source code is a C strcmp:
>
> int strcmp (const char *p1, const char *p2)
> {
>   const unsigned char *s1 = (const unsigned char *) p1;
>   const unsigned char *s2 = (const unsigned char *) p2;
>   unsigned char c1, c2;
>   do {
>     c1 = (unsigned char) *s1++;
>     c2 = (unsigned char) *s2++;
>     if (c1 == '\0') return c1 - c2;
>   } while (c1 == c2);
>   return c1 - c2;
> }
>
> Currently the code prefers the original ivs and so it generates
> code that increments s1 and s2 before doing the loads (and uses
> a -1 offset):
>
>   <bb 3>:
>   # s1_1 = PHI <p1_4(D)(2), s1_6(6)>
>   # s2_2 = PHI <p2_5(D)(2), s2_9(6)>
>   s1_6 = s1_1 + 1;
>   c1_8 = MEM[base: s1_6, offset: 4294967295B];
>   s2_9 = s2_2 + 1;
>   c2_10 = MEM[base: s2_9, offset: 4294967295B];
>   if (c1_8 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
> If I remove the cost increment for non-original ivs then GCC
> does the loads before the increments:
>
>   <bb 3>:
>   # ivtmp.6_17 = PHI <ivtmp.6_24(2), ivtmp.6_14(6)>
>   # ivtmp.7_21 = PHI <ivtmp.7_22(2), ivtmp.7_23(6)>
>   _25 = (void *) ivtmp.6_17;
>   c1_8 = MEM[base: _25, offset: 0B];
>   _26 = (void *) ivtmp.7_21;
>   c2_10 = MEM[base: _26, offset: 0B];
>   if (c1_8 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>   .
>   .
>   <bb 5>:
>   ivtmp.6_14 = ivtmp.6_17 + 1;
>   ivtmp.7_23 = ivtmp.7_21 + 1;
>   if (c1_8 == c2_10)
>     goto <bb 6>;
>   else
>     goto <bb 7>;
>
>
> This second case (without the preference for the original IV)
> generates better code on MIPS because the final assembly
> has the increment instructions between the loads and the tests
> of the values being loaded and so there is no delay (or less delay)
> between the load and use.  It seems like this could easily be
> the case for other platforms too so I was wondering what people
> thought of this patch:
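
For readers who find the GIMPLE dumps hard to compare, the two loop shapes quoted above can be written back out as plain C.  This is only an illustrative sketch: the function names cmp_incr_first and cmp_load_first are hypothetical, and a compiler is free to transform either function into the other shape, so the source form does not by itself determine what ivopts emits.

```c
/* Shape 1: original ivs kept.  Each pointer is bumped first and the
   load then reads through a -1 offset, as in the first dump.  */
int cmp_incr_first (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;
  do {
    s1 = s1 + 1;
    c1 = s1[-1];
    s2 = s2 + 1;
    c2 = s2[-1];
    if (c1 == '\0') return c1 - c2;
  } while (c1 == c2);
  return c1 - c2;
}

/* Shape 2: new ivs.  The loads use offset 0 and the increments sit
   between the loads and the c1 == c2 test, as in the second dump.  */
int cmp_load_first (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;
  for (;;) {
    c1 = s1[0];
    c2 = s2[0];
    if (c1 == '\0') return c1 - c2;
    s1 = s1 + 1;
    s2 = s2 + 1;
    if (c1 != c2) return c1 - c2;
  }
}
```

Both functions read the same bytes and return the same value; they differ only in where the increments fall relative to the loads.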
You don't comment on the comment you remove ... debugging programs is
also important!  So, if anything, the cost of the two cases should be
distinguished somewhere else, for example by granting a bonus for an
increment before the exit test or so.

Richard.

> 2015-12-08  Steve Ellcey  <sell...@imgtec.com>
>
>         * tree-ssa-loop-ivopts.c (determine_iv_cost): Remove preference for
>         original ivs.
>
>
> diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
> index 98dc451..26daabc 100644
> --- a/gcc/tree-ssa-loop-ivopts.c
> +++ b/gcc/tree-ssa-loop-ivopts.c
> @@ -5818,14 +5818,6 @@ determine_iv_cost (struct ivopts_data *data, struct iv_cand *cand)
>
>    cost = cost_step + adjust_setup_cost (data, cost_base.cost);
>
> -  /* Prefer the original ivs unless we may gain something by replacing it.
> -     The reason is to make debugging simpler; so this is not relevant for
> -     artificial ivs created by other optimization passes.  */
> -  if (cand->pos != IP_ORIGINAL
> -      || !SSA_NAME_VAR (cand->var_before)
> -      || DECL_ARTIFICIAL (SSA_NAME_VAR (cand->var_before)))
> -    cost++;
> -
>    /* Prefer not to insert statements into latch unless there are some
>       already (so that we do not create unnecessary jumps).  */
>    if (cand->pos == IP_END
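
To make the effect of the removed hunk concrete, here is a toy model of the tie-break it implements.  It is a sketch under stated assumptions, not GCC code: struct toy_cand, toy_iv_cost and the prefer_original flag are hypothetical stand-ins for struct iv_cand, determine_iv_cost and the presence of the removed block, and base_cost stands for the value computed just before it.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for GCC's struct iv_cand.  */
struct toy_cand
{
  bool is_original;    /* models cand->pos == IP_ORIGINAL */
  bool has_user_var;   /* models SSA_NAME_VAR (cand->var_before) != NULL */
  bool is_artificial;  /* models DECL_ARTIFICIAL on that variable */
  unsigned base_cost;  /* models cost_step + adjusted setup cost */
};

/* With prefer_original set, every candidate that is not an original,
   user-visible iv pays one extra unit, mirroring the removed "cost++".
   With it clear (the patched behaviour), no candidate is penalised.  */
unsigned toy_iv_cost (const struct toy_cand *c, bool prefer_original)
{
  unsigned cost = c->base_cost;
  if (prefer_original
      && (!c->is_original || !c->has_user_var || c->is_artificial))
    cost++;
  return cost;
}
```

With two candidates of equal base cost, the original iv wins only through that one-unit penalty on the other candidate; once the penalty is gone the two tie, and the choice depends on whatever other cost terms separate them.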