On 2015-01-12 6:33 AM, Ajit Kumar Agarwal wrote:

-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Monday, January 12, 2015 2:33 PM
To: Ajit Kumar Agarwal
Cc: vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail; Shail 
Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Allocating some Loop allocno in memory

On Sun, Jan 11, 2015 at 4:45 PM, Ajit Kumar Agarwal 
<ajit.kumar.agar...@xilinx.com> wrote:

-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Sunday, January 11, 2015 8:05 PM
To: Ajit Kumar Agarwal; vmaka...@redhat.com; l...@redhat.com;
gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju
Mekala
Subject: Re: Allocating some Loop allocno in memory

On January 11, 2015 5:25:23 AM CET, Ajit Kumar Agarwal 
<ajit.kumar.agar...@xilinx.com> wrote:
I was thinking of some of the opportunities with respect to reducing
spills inside the Loop. If the Live range(allocno) spans through the
Loop and Live out at the exit of the Loop and there are no references
or not being touched upon inside the Loop, assign the allocno to the
memory. This increase the chances of getting the register for other
allocno which spans through the Loop and have reference inside the
Loop if the interference graph is not colorable.

Since allocno don't have references inside the Loop there won't be any
load instructions with respect to restore inside the loops. There will
be a store  instruction with respect to spill which will be outside
the Loop. This will reduce the conflicting edges of some of the
allocno increasing the chances of making colorable of the Interference
graph and reduces the spills and restore inside the Loop.

This heuristics looks reasonable. This heuristics goes side by side
with the Live range splitting across the Loop boundary.
On top of this heuristics, after the Live range splitting across the
Loop boundary there interference graph is not colorable then we can
assign some of the  splitted live ranges in the memory giving chances
of registers  for the Live ranges(allocno) that spans the Loop and
have reference inside the Loop.

We can change the cost of allocno in memory based on the above
heuristics and take the above factor into consideration for the cost
calculation.

Thoughts Please?
How can this result in better allocations if you only ever spill at life-range split 
points? The life-range covering the loop should already be assigned to >>memory 
if required.
If the live range covering the loop assigned to memory, and there are
references inside the Loops  then there are chances of spills inside
the loop which degrades the performance.  If live range covering the
loops and there are no references inside the Loop, then assigning to
memory make reasonable as there are no references. Since there are no
references or not touched inside the Loop, there won't be load instruction to 
restore which are required if there are any references. The store which 
requires for the def of the Live range will be outside the Loops. This helps in 
not degrading the performances as there are no load and store required for 
spill and restore inside the Loops.

The above heuristics can also be accompanied with heuristics of the
number of use points in the Live range. Considering both  the
heuristics will lead to better allocation if ever spill at the live
range split points and there are chances of getting colorable  the
Interference graph without degrading the performance. Also if the Live range is 
splitted at the Loop boundary then the spill  at the split points of loop 
boundary and registers is assigned to this live range will make it reasonable 
if there are no references inside the Loops.
But if we split the life-range of not used in-loop-body pseudos at the loop 
boundary the allocator should do the assignment to memory.
There is no need to add another heuristic here.  Or do you say we don't split 
the life-range of such pseudos at loop boundary?
In spill_cost calculations we are seeing the function ira_loop_edge_freq 
triggered with the Live ranges that are live at the exit of the Loop but
are not referenced inside the Loops.  We are still considering frequency in the 
updation of spill cost for such live ranges . We have made a change
not to consider the freq of such live ranges in the updation  of spill cost, 
thus reducing the cost of such live ranges making it a chance to be in
memory giving the conflicting allocno the chances of getting the register. 
There are chances , the conflicting allocno  with respect to loops
which have references and will have more edge on getting the registers.

The changes are made in the following.

  https://gcc.gnu.org/ml/gcc/2015-01/msg00033.html      


I have some doubts that your changes will improve performance. The edge_freq is used in top-bottom loop allocation. If we spilled allocno for given regno in the parent loop, the cost of memory for allocno in the current loop for given regno is decreased proportionally enter/exit edges freq (loop back edges are excluded) and the cost of hard registers are proportionally increased. My common sense says that it is a right approach. When there are no references of regno in the loop we even more definitely allocates memory for allocno in the loop. If we allocate hard reg to pseudo in the parent loop and it is possible to allocate hard register to the pseudo in the loop (without spilling other pseudos) we should allocate the register as it removes loads/stores on loop bounds. Again edge_freq helps to do this.

You propose to make edge freq to zero for some regnos and this means ignoring any allocation (to hard reg or memory) in the parent loop.

Of course, common sense might be wrong in heuristics (I saw it several times). So I'll try your patch on SPEC2000 on major platform x86-64. I let you know them probably tomorrow or day after tomorrow.

Still if your changes work, your changes add loops inside another loop. I believe the same can be achieved without introducing the new loops.

Reply via email to