On Fri, Jun 13, 2014 at 9:43 AM, Bin.Cheng <amker.ch...@gmail.com> wrote: > On Thu, Jun 12, 2014 at 7:59 PM, Zdenek Dvorak <rakd...@iuuk.mff.cuni.cz> > wrote: >> Hi, >> >>> > I noticed there is below code/comments about may_be_zero field in loop >>> > niter desc: >>> > >>> > tree may_be_zero; /* The boolean expression. If it evaluates to >>> > true, >>> > the loop will exit in the first iteration (i.e. >>> > its latch will not be executed), even if the niter >>> > field says otherwise. */ >>> > >>> > I had difficulty in understanding this because I ran into some cases >>> > in which it didn't behave as said. >> >> actually, in all the examples below, the field behaves as described, >> i.e., >> >> the number of iterations = may_be_zero ? 0 : niter; >> >> In particular, the fact that may_be_zero is false *does not imply* >> that the number of iterations as described by niter is non-zero. >> >>> > Example1, the dump of loop before sccp is like: >>> > >>> > <bb 2>: >>> > bnd_4 = len_3(D) + 1; >>> > >>> > <bb 3>: >>> > # ivtmp_1 = PHI <0(2), ivtmp_11(4)> >>> > _6 = ivtmp_1 + len_3(D); >>> > _7 = a[ivtmp_1]; >>> > _8 = b[ivtmp_1]; >>> > _9 = _7 + _8; >>> > a[_6] = _9; >>> > ivtmp_11 = ivtmp_1 + 1; >>> > if (bnd_4 > ivtmp_11) >>> > goto <bb 4>; >>> > else >>> > goto <bb 5>; >>> > >>> > <bb 4>: >>> > goto <bb 3>; >>> > >>> > The loop niter information analyzed in sccp is like: >>> > >>> > Analyzing # of iterations of loop 1 >>> > exit condition [1, + , 1] < len_3(D) + 1 >>> > bounds on difference of bases: -1 ... 4294967294 >>> > result: >>> > zero if len_3(D) == 4294967295 >>> > # of iterations len_3(D), bounded by 4294967294 >>> > >>> > Qeustion1, shouldn't it be like "len_3 +1 <= 1" because the latch >>> > won't be executed when "len_3 == 0", right? >> >> the analysis determines the number of iterations as len_3, that is >> 0 if len_3 == 0. So, the information is computed correctly here. >> >>> > But when boundary condition is the only case that latch get ZERO >>> > executed, the may_be_zero info will not be computed. See example2, >>> > with dump of loop before sccp like: >>> > >>> > foo (int M) >>> > >>> > <bb 2>: >>> > if (M_4(D) > 0) >>> > goto <bb 4>; >>> > else >>> > goto <bb 3>; >>> > >>> > <bb 3>: >>> > return; >>> > >>> > <bb 4>: >>> > >>> > <bb 5>: >>> > # i_13 = PHI <0(4), i_10(6)> >>> > _5 = i_13 + M_4(D); >>> > _6 = a[i_13]; >>> > _7 = b[i_13]; >>> > _8 = _6 + _7; >>> > a[_5] = _8; >>> > i_10 = i_13 + 1; >>> > if (M_4(D) > i_10) >>> > goto <bb 6>; >>> > else >>> > goto <bb 3>; >>> > >>> > <bb 6>: >>> > goto <bb 5>; >>> > >>> > The niter information analyzed in sccp is like: >>> > >>> > Analyzing # of iterations of loop 1 >>> > exit condition [1, + , 1](no_overflow) < M_4(D) >>> > bounds on difference of bases: 0 ... 2147483646 >>> > result: >>> > # of iterations (unsigned int) M_4(D) + 4294967295, bounded by >>> > 2147483646 >>> > >>> > So may_be_zero is always false here, but the latch may be ZERO >>> > executed when "M_4 == 1". >> >> Again, this is correct, since then ((unsigned int) M_4) + 4294967295 == 0. >> >>> > Start from Example1, we can create Example3 which makes no sense to >>> > me. Again, the dump of loop is like: >>> > >>> > <bb 2>: >>> > bnd_4 = len_3(D) + 1; >>> > >>> > <bb 3>: >>> > # ivtmp_1 = PHI <0(2), ivtmp_11(4)> >>> > _6 = ivtmp_1 + len_3(D); >>> > _7 = a[ivtmp_1]; >>> > _8 = b[ivtmp_1]; >>> > _9 = _7 + _8; >>> > a[_6] = _9; >>> > ivtmp_11 = ivtmp_1 + 4; >>> > if (bnd_4 > ivtmp_11) >>> > goto <bb 4>; >>> > else >>> > goto <bb 5>; >>> > >>> > <bb 4>: >>> > goto <bb 3>; >>> > >>> > <bb 5>: >>> > return 0; >>> > >>> > The niter info is like: >>> > >>> > Analyzing # of iterations of loop 1 >>> > exit condition [4, + , 4] < len_3(D) + 1 >>> > bounds on difference of bases: -4 ... 4294967291 >>> > result: >>> > under assumptions len_3(D) + 1 <= 4294967292 >>> > zero if len_3(D) == 4294967295 >>> > # of iterations len_3(D) / 4, bounded by 1073741823 >>> > >>> > The problem is: won't latch be ZERO executed when "len_3 == 0/1/2/3"? >> >> Again, in all these cases the number of iterations is len_3 / 4 == 0. > > Thanks, I realized that boundary condition is described by niter part > of information and it's more conveniently handled in this way for > consumer of may_be_zero like IVOPT. One problem is with below > example: > > <bb 6>: > vectp_a.8_57 = &a; > vectp_b.11_61 = &b; > _67 = _1 * 4; > vectp_a.15_66 = &a + _67; > > <bb 7>: > # vectp_a.7_58 = PHI <vectp_a.8_57(6), vectp_a.7_59(14)> > # vectp_b.10_62 = PHI <vectp_b.11_61(6), vectp_b.10_63(14)> > # vectp_a.14_68 = PHI <vectp_a.15_66(6), vectp_a.14_69(14)> > # ivtmp_9 = PHI <0(6), ivtmp_71(14)> > vect__6.9_60 = MEM[(float *)vectp_a.7_58]; > vect__7.12_64 = MEM[(float *)vectp_b.10_62]; > vect__8.13_65 = vect__6.9_60 + vect__7.12_64; > MEM[(float *)vectp_a.14_68] = vect__8.13_65; > vectp_a.7_59 = vectp_a.7_58 + 16; > vectp_b.10_63 = vectp_b.10_62 + 16; > vectp_a.14_69 = vectp_a.14_68 + 16; > ivtmp_71 = ivtmp_9 + 1; > if (ivtmp_71 < bnd.4_36) > goto <bb 14>; > else > goto <bb 9>; > > <bb 14>: > goto <bb 7>; > > After ivopt: > > <bb 6>: > vectp_a.8_57 = &a; > vectp_b.11_61 = &b; > _67 = _1 * 4; > vectp_a.15_66 = &a + _67; > > <bb 7>: > # ivtmp.24_26 = PHI <0(6), ivtmp.24_33(14)> > # ivtmp.27_88 = PHI <0(6), ivtmp.27_89(14)> > vect__6.9_60 = MEM[symbol: a, index: ivtmp.27_88, offset: 0B]; > vect__7.12_64 = MEM[symbol: b, index: ivtmp.27_88, offset: 0B]; > vect__8.13_65 = vect__6.9_60 + vect__7.12_64; > MEM[base: vectp_a.15_66, index: ivtmp.27_88, offset: 0B] = vect__8.13_65; > ivtmp.24_33 = ivtmp.24_26 + 1; > ivtmp.27_89 = ivtmp.27_88 + 16; > if (ivtmp.24_33 < bnd.4_36) > goto <bb 14>; > else > goto <bb 9>; > > <bb 14>: > goto <bb 7>; > > Ideally, "(ivtmp.24_33 < bnd.4_36)" should be eliminated by using > "ivtmp.27_89". One blocker is may_be_zero information like below: > > Analyzing # of iterations of loop 1 > exit condition [1, + , 1] < _38 + 1 > bounds on difference of bases: -1 ... 4294967294 > result: > zero if _38 == 4294967295 > # of iterations _38, bounded by 4294967294 > number of iterations _38; zero if _38 == 4294967295 > > "_38 == 4294967295" is folded from "_38 + 1 < 1", only the folded form > can't be handled by iv_elimination_compare_lt. > > It seems I should investigate how to extend iv_elimination_compare_lt > to handle more general may_be_zero information.
Probably. It's only those corner cases that can be folded from the generic range specifying unsigned compares. Richard. > Thanks, > bin > > -- > Best Regards.