On Fri, 2015-12-04 at 16:22 +0800, Bin.Cheng wrote:
> Dump before IVO is as below:
>
>   <bb 3>:
>   # s1_1 = PHI <p1_4(D)(2), s1_6(6)>
>   # s2_2 = PHI <p2_5(D)(2), s2_9(6)>
>   s1_6 = s1_1 + 1;
>   c1_8 = *s1_1;
>   s2_9 = s2_2 + 1;
>   c2_10 = *s2_2;
>   if (c1_8 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
> And the iv candidates are as:
> candidate 1 (important)
>   var_before ivtmp.6
>   var_after ivtmp.6
>   incremented before exit test
>   type unsigned int
>   base (unsigned int) p1_4(D)
>   step 1
>   base object (void *) p1_4(D)
> candidate 2 (important)
>   original biv
>   type const unsigned char *
>   base (const unsigned char *) p1_4(D)
>   step 1
>   base object (void *) p1_4(D)
> candidate 3 (important)
>   var_before ivtmp.7
>   var_after ivtmp.7
>   incremented before exit test
>   type unsigned int
>   base (unsigned int) p2_5(D)
>   step 1
>   base object (void *) p2_5(D)
> candidate 4 (important)
>   original biv
>   type const unsigned char *
>   base (const unsigned char *) p2_5(D)
>   step 1
>   base object (void *) p2_5(D)
>
> Generally GCC would choose normal candidates {1, 3} and insert
> increment before exit condition.  This is expected in this case.  But
> when there is applicable original candidates {2, 4}, GCC would prefer
> these in order to achieve better debugging.  Also as I suspected,
> [reg] and [reg-1] have same address cost on mips, that's why GCC makes
> current decision.
>
> Thanks,
> bin

Yes, I agree that [reg] and [reg-1] have the same address cost.  But
using [reg-1] means that the increment of reg happens before the
access, which puts the load of [reg-1] closer to the use of the loaded
value, and that causes a stall.  If we used [reg] and incremented it
after the load, we would have at least one instruction between the load
and the use, giving either no stall or a shorter one.

I don't know if ivopts has any way to do this type of analysis when
picking the IV.

Steve Ellcey
sell...@imgtec.com
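
To make the two shapes concrete, here is a rough C sketch of what I mean.
The function names are made up for illustration, and the c1 == c2 test is
my assumption about what <bb 5> does, since the dump above only shows the
c1 == 0 exit.  The statement order in C does not force the final
instruction schedule; it only shows where the increment sits relative to
the load in each form.

```c
/* Form A: load through [reg], then increment.  The increments sit
   between the loads and the first use of the loaded values. */
int cmp_load_then_inc(const unsigned char *s1, const unsigned char *s2)
{
    unsigned char c1, c2;
    do {
        c1 = *s1;       /* load [s1] */
        c2 = *s2;       /* load [s2] */
        s1++;           /* increment after the load */
        s2++;
        if (c1 == 0)    /* first use of c1 */
            break;
    } while (c1 == c2); /* assumed test in <bb 5> */
    return c1 - c2;
}

/* Form B: increment first, then load through [reg-1].  The loads are
   now immediately followed by the use of the loaded values. */
int cmp_inc_then_load(const unsigned char *s1, const unsigned char *s2)
{
    unsigned char c1, c2;
    do {
        s1++;           /* increment before the load */
        s2++;
        c1 = s1[-1];    /* load [s1-1] */
        c2 = s2[-1];    /* load [s2-1] */
        if (c1 == 0)    /* use directly after the load */
            break;
    } while (c1 == c2); /* assumed test in <bb 5> */
    return c1 - c2;
}
```

Both forms compute the same result; the only difference is how many
instructions can sit between each load and the compare that consumes it.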