On Mon, Jun 16, 2014 at 4:14 PM, Ajit Kumar Agarwal
<ajit.kumar.agar...@xilinx.com> wrote:
> Hello All:
>
> I have worked on the Open64 compiler where the Register Pressure Guided 
> Unroll and Jam gave a good amount of performance improvement for the  C and 
> C++ Spec Benchmark and also Fortran benchmarks.
>
> The Unroll and Jam increases the register pressure in the Unrolled Loop 
> leading to increase in the Spill and Fetch degrading the performance of the 
> Unrolled Loop. The Performance of Cache locality achieved through Unroll and 
> Jam is degraded with the presence of Spilling instruction due to increases in 
> register pressure Its better to do the decision  of Unrolled Factor of the 
> Loop based on the Performance model of the register pressure.
>
> Most of the Loop Optimization Like Unroll and Jam is implemented in the High 
> Level IR. The register pressure based Unroll and Jam requires the calculation 
> of register pressure in the High Level IR  which will be similar to register 
> pressure we calculate on Register Allocation. This makes the implementation 
> complex.
>
> To overcome this, the Open64 compiler does the decision of Unrolling to both 
> High Level IR and also at the Code Generation Level. Some of the decisions 
> way at the end of the Code Generation . The advantage of using this approach 
> like Open64 helps in using the register pressure information calculated by 
> the Register Allocator. This helps the implementation much simpler and less 
> complex.
>
> Can we have this approach in GCC of the Decisions of Unroll and Jam in the 
> High Level IR  and also to defer some of the decision at the Code Generation 
> Level like Open64?
>
>  Please let me know what do you think.

Sure, you can for example compute validity of the transform during
the GIMPLE loop opts, annotate the loop meta-information with
the desired transform and apply it (or not) later during RTL unrolling.

Richard.

> Thanks & Regards
> Ajit

Reply via email to