On June 16, 2014 6:39:58 PM CEST, Ajit Kumar Agarwal <ajit.kumar.agar...@xilinx.com> wrote:
>
>-----Original Message-----
>From: Richard Biener [mailto:richard.guent...@gmail.com]
>Sent: Monday, June 16, 2014 7:55 PM
>To: Ajit Kumar Agarwal
>Cc: gcc@gcc.gnu.org; Vladimir Makarov; Michael Eager; Vinod Kathail;
>Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
>Subject: Re: Register Pressure guided Unroll and Jam in GCC !!
>
>On Mon, Jun 16, 2014 at 4:14 PM, Ajit Kumar Agarwal
><ajit.kumar.agar...@xilinx.com> wrote:
>> Hello All:
>>
>> I have worked on the Open64 compiler, where register-pressure-guided
>> unroll and jam gave a good performance improvement on the C and C++
>> SPEC benchmarks and also on the Fortran benchmarks.
>>
>> Unroll and jam increases the register pressure in the unrolled loop,
>> which leads to more spill and fetch instructions and degrades the
>> performance of the unrolled loop. The cache-locality benefit achieved
>> through unroll and jam is eroded by the spill instructions that the
>> increased register pressure introduces. It is better to decide the
>> unroll factor of the loop based on a performance model of the
>> register pressure.
>>
>> Most loop optimizations, such as unroll and jam, are implemented on
>> the high-level IR. Register-pressure-based unroll and jam requires
>> calculating register pressure on the high-level IR, similar to the
>> register pressure calculated during register allocation. This makes
>> the implementation complex.
>>
>> To overcome this, the Open64 compiler makes unrolling decisions both
>> on the high-level IR and at the code generation level. Some of the
>> decisions are deferred to the end of code generation. The advantage
>> of an approach like Open64's is that it can use the register pressure
>> information calculated by the register allocator, which makes the
>> implementation much simpler and less complex.
>>
>> Can we take this approach in GCC: make the unroll-and-jam decisions
>> on the high-level IR and defer some of them to the code generation
>> level, as Open64 does?
>>
>> Please let me know what you think.
>
>>>Sure, you can for example compute validity of the transform during
>>>the GIMPLE loop opts, annotate the loop meta-information with the
>>>desired transform and apply it (or not) later during RTL unrolling.
>
>Thanks !! Has RTL unrolling already been implemented?

Yes, but not for non-innermost loops, afaik.

Richard

>Richard.
>
>> Thanks & Regards
>> Ajit
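
For readers following the thread, here is a minimal C sketch of the transformation being discussed. It is only an illustration and is not taken from GCC or Open64: the function names, the problem size N, and the cost model in pick_unroll_factor are all hypothetical. The first two functions show a plain loop nest and the same nest with the outer loop unrolled by 2 and the inner bodies jammed; the jammed version reuses x[j] across two rows but keeps two accumulators live instead of one, which is the register pressure growth described above. The last function is a toy model of choosing the unroll factor from an estimated register budget.

```c
#include <stddef.h>

#define N 8 /* hypothetical size; even, so an unroll factor of 2 divides it */

/* Baseline loop nest: y[i] += a[i][j] * x[j]. */
void matvec_plain(double a[N][N], const double x[N], double y[N])
{
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            y[i] += a[i][j] * x[j];
}

/* Outer loop unrolled by 2, with the two inner-loop copies jammed into one.
   x[j] is loaded once per iteration and reused for both rows, but two
   accumulators plus xj are now live across the inner loop. */
void matvec_unroll_jam2(double a[N][N], const double x[N], double y[N])
{
    for (size_t i = 0; i < N; i += 2) {
        double acc0 = y[i];
        double acc1 = y[i + 1];
        for (size_t j = 0; j < N; j++) {
            double xj = x[j];
            acc0 += a[i][j] * xj;
            acc1 += a[i + 1][j] * xj;
        }
        y[i] = acc0;
        y[i + 1] = acc1;
    }
}

/* Toy cost model (an assumption, not GCC's or Open64's heuristic): return
   the largest factor up to max_factor whose estimated live values,
   live_shared + factor * live_per_copy, still fit in regs_avail.
   Returns 1 (no unrolling) when even a factor of 2 would not fit. */
unsigned pick_unroll_factor(unsigned regs_avail, unsigned live_shared,
                            unsigned live_per_copy, unsigned max_factor)
{
    unsigned factor = 1;
    for (unsigned u = 2; u <= max_factor; u++) {
        if (live_shared + u * live_per_copy > regs_avail)
            break;
        factor = u;
    }
    return factor;
}
```

In a real compiler the live-value estimates would come from the register allocator's pressure information rather than from hand-supplied numbers, which is the point of deferring part of the decision to the code generation level.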