On 2021/10/26 21:05, Jan Hubicka wrote:
>>>
>
>> That said, likely the profile update cannot be done uniformly
>> for all blocks of a loop?
>
> For the loop:
>
> for (i = 0; i < n; i = inc (i))
>   {
>     if (ga)
>       ga = do_something ();
>   }
>
> to:
>
> for (i = 0; i < x; i = inc (i))
>   {
>     if (true)
>       ga = do_something ();
>     if (!ga)
>       break;
>   }
> for (; i < n; i = inc (i))
>   {
>     if (false)
>       ga = do_something ();
>   }
>
> If probability of if (ga) being true is p, then you indeed can scale the
> first loop by p and second loop by 1-p.
>
> Imagine that the loop has n iterations and it takes m iterations for ga to
> become false. Then the probability of if (ga) is m/n, and you get frequencies
> m = n*(m/n) for the first loop and n-m = n*(1-m/n) for the second loop.
>
> Because the conditional becomes constant true, one needs to scale the
> basic block guarded by the if (true) up by n/m to compensate for the
> change. With that the update should be right.
> Ideally one can bypass scaling of the basic block(s) containing
> ga = do_something (), since the scaling first scales down by m/n
> and then scales up by n/m, which may not combine to a no-op.
> Perhaps one wants to have a parameter specifying basic blocks on which
> the scaling is performed while duplicating for this?
Yes. This is the patch I wrote to try to fix the issue for the
loop-split case you pasted:
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576566.html
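
To illustrate why scaling down by m/n and then back up by n/m may not
combine to a no-op, here is a minimal sketch using plain integer counts.
The helpers scale_ratio and round_trip are hypothetical names for this
example only; GCC's real profile_count arithmetic differs in detail, and
integer rounding is just one plausible source of the mismatch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): scale COUNT by NUM/DEN,
   rounding to nearest.  Assumes COUNT * NUM does not overflow.  */
static uint64_t
scale_ratio (uint64_t count, uint64_t num, uint64_t den)
{
  assert (den != 0);
  return (count * num + den / 2) / den;
}

/* Scale COUNT down by M/N, then back up by N/M.  */
static uint64_t
round_trip (uint64_t count, uint64_t m, uint64_t n)
{
  return scale_ratio (scale_ratio (count, m, n), n, m);
}
```

With m = 1 and n = 3, a count of 12 survives the round trip, but a count
of 10 comes back as 9, so bypassing the scaling for those blocks avoids
the loss.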

    5<----------------
    | \              |
    6--7             |
       | \           |
       21 11->3      |
       | \           |
       19 20----------
       |
       12<----
      / |    |
  3<-11 16----

With the patch, the blocks of loop 1, bb {5,6,7,21,20}, are scaled to
33% of the original count, and the blocks of loop 2, bb {12,16}, are
scaled to 67%. In particular, the probabilities of edges 21->20 and
21->19 are fixed to (33%, 67%) instead of (100%, INV).

-  <bb 5> [local count: 955630225]:
+  <bb 5> [local count: 315357973]:
   # i_13 = PHI <i_10(20), 0(4)>
   # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)>
   if (prephitmp_12 != 0)
     goto <bb 6>; [33.00%]
   else
     goto <bb 7>; [67.00%]

-  <bb 6> [local count: 315357972]:
+  <bb 6> [local count: 104068130]:
   _2 = do_something ();
   ga = _2;

-  <bb 7> [local count: 955630225]:
+  <bb 7> [local count: 315357973]:
   # prephitmp_5 = PHI <prephitmp_12(5), _2(6)>
   i_10 = inc (i_13);
   if (n_7(D) > i_10)
     goto <bb 21>; [89.00%]
   else
     goto <bb 11>; [11.00%]

   <bb 11> [local count: 105119324]:
   goto <bb 3>; [100.00%]

-  <bb 21> [local count: 850510901]:
+  <bb 21> [local count: 280668596]:
   if (prephitmp_12 != 0)
-    goto <bb 20>; [100.00%]
+    goto <bb 20>; [33.00%]
   else
-    goto <bb 19>; [INV]
+    goto <bb 19>; [67.00%]

-  <bb 20> [local count: 850510901]:
+  <bb 20> [local count: 280668596]:
   goto <bb 5>; [100.00%]

-  <bb 19> [count: 0]:
+  <bb 19> [local count: 70429947]:
   # i_23 = PHI <i_10(21)>
   # prephitmp_25 = PHI <prephitmp_5(21)>

-  <bb 12> [local count: 955630225]:
+  <bb 12> [local count: 640272252]:
   # i_15 = PHI <i_23(19), i_22(16)>
   # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)>
   i_22 = inc (i_15);
   if (n_7(D) > i_22)
     goto <bb 16>; [89.00%]
   else
     goto <bb 11>; [11.00%]

-  <bb 16> [local count: 850510901]:
+  <bb 16> [local count: 569842305]:
   goto <bb 12>; [100.00%]
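
As a rough cross-check of the counts in the dump above, the sketch below
scales the old counts by p = 0.33 for loop 1 and 1 - p = 0.67 for loop 2.
scale_count and close_to are hypothetical helpers written for this mail,
and the arithmetic is plain floating point, which is an assumption: GCC
stores probabilities in fixed point, so the results only need to agree
with the dump to within a count or two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): scale COUNT by a
   probability P in [0, 1], rounding to nearest.  */
static uint64_t
scale_count (uint64_t count, double p)
{
  assert (p >= 0.0 && p <= 1.0);
  return (uint64_t) ((double) count * p + 0.5);
}

/* Return nonzero if A and B differ by at most TOL.  */
static int
close_to (uint64_t a, uint64_t b, uint64_t tol)
{
  return (a > b ? a - b : b - a) <= tol;
}
```

For example, scaling the old header count 955630225 by 0.33 lands within
one of the new bb 5 count 315357973, and scaling it by 0.67 lands within
one of the new bb 12 count 640272252.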
>
> Honza
>>
>> Richard.
--
Thanks,
Xionghu