On August 16, 2015 1:21:22 PM GMT+02:00, Ajit Kumar Agarwal <ajit.kumar.agar...@xilinx.com> wrote: >All: > >Loop fusion is an important optimizations that fuses the set of Loops >if the following condition is valid.
GCC does not implement loop fusion at all. Richard. >1) Loops are conformant ( i.e. they have same iteration count). >2. Loops are control equivalent. The control equivalence of the loops >can be identified with the dominator and post dominator properties >Of the Loop. Two Loops Lj and LK are control equivalent if the Lj >dominates Lk and LK post dominates the Lj. >3. Loops are adjacent. The Loops are adjacent if there is no >intervening code between them. The intervening code can be determined >With the dominance properties. If there is an aggregate node ax where >the Loop Lj dominates ax and the Loop Lk post dominates >The ax then there is a intervening code. >4. There are forward dependence between the Loops. > >If the properties 1. Doesn't hold i.e. the loops are not conformant and >they don't have same iteration count. Then the following >Transformation can be done as given in Fig(1) and Fig(2) to make the >set of Loops as non-conformant. > >In Fig(1) the Loops i1 and i2 are non-conformant as they don't have >same iteration count. The following transformation can be made given in >Fig(2). > >The Fig(2) fuses the Loop for the Loops i1 and i2 that are >non-conformant and doesn't have same iteration count. > >If the properties 3 is not valid and the set of Loops are not adjacent. >The intervening code can be moved up the Loop or down the Loop >So that the Loop become adjacent and there is no intervening code. Also >partially the intervening code is moved up and partial intervening >Code is moved down based on the dependence relation and the dominance >properties. Such transformation can make loops adjacent and >There is no intervening code. > >Also the Loop fusion, should be done after the Loop normalization pass >and the Loop normalization pass should be present before the Loop >Fusion pass. The 4 properties to be valid for loop fusion to occur is >done with respect to traversing the Loops from outermost to inner most >Loop. First the outer loop is traversed and the loops that are at same >level of the outer loop are made into one partitions and that partition >Can be fused if the above four properties is valid and otherwise the >loops are taken out from the partitions. The same logic is done for the >Loop traversing outer to inner loops and partitions are formed and >later the partitions are fused based on the above properties. > >Each Loop is traverse in the forward order of dominance tree and also >the reverse order of dominance tree ( i.e. post dominator) and the >Forward dependence is checked. Based on the dependence the Loops are >fused. And the Loops that are non-adjacent partial code is moved >Up and partial code is moved down or the whole intervening code is >moved up or down. > >I think the above transformations and properties and enhance the Loop >fusion algorithm in GCC with the above Algorithm to have more Loop >Fusion with respect to SPEC benchmarks. > >For ( I1 = 0;I1 < n;I1 ++) >{ > If ( I1 < n-1) > S1: > Else > S2: > S3: >} >For ( I2 = 0 ; I2 < n-2 ;I2 ++) > S4; > >Fig(1). > >For( I1 = 0; i1< n ; i1++) >{ > If(i1 < n-2) > { > If(i1< n-1) > S1: > Else > S2: > S3: > S4: > } > Else > { > If ( i1 < n-1) > S1: > Else > S2: > S3: > } >}// For > >Fig(2). > >Thanks & Regards >Ajit > >