On 2015-01-12 6:33 AM, Ajit Kumar Agarwal wrote:
-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Monday, January 12, 2015 2:33 PM
To: Ajit Kumar Agarwal
Cc: vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail; Shail
Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: Allocating some Loop allocno in memory
On Sun, Jan 11, 2015 at 4:45 PM, Ajit Kumar Agarwal
<ajit.kumar.agar...@xilinx.com> wrote:
-----Original Message-----
From: Richard Biener [mailto:richard.guent...@gmail.com]
Sent: Sunday, January 11, 2015 8:05 PM
To: Ajit Kumar Agarwal; vmaka...@redhat.com; l...@redhat.com;
gcc@gcc.gnu.org
Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju
Mekala
Subject: Re: Allocating some Loop allocno in memory
On January 11, 2015 5:25:23 AM CET, Ajit Kumar Agarwal
<ajit.kumar.agar...@xilinx.com> wrote:
I was thinking of some of the opportunities with respect to reducing
spills inside the Loop. If the Live range(allocno) spans through the
Loop and Live out at the exit of the Loop and there are no references
or not being touched upon inside the Loop, assign the allocno to the
memory. This increase the chances of getting the register for other
allocno which spans through the Loop and have reference inside the
Loop if the interference graph is not colorable.
Since allocno don't have references inside the Loop there won't be any
load instructions with respect to restore inside the loops. There will
be a store instruction with respect to spill which will be outside
the Loop. This will reduce the conflicting edges of some of the
allocno increasing the chances of making colorable of the Interference
graph and reduces the spills and restore inside the Loop.
This heuristics looks reasonable. This heuristics goes side by side
with the Live range splitting across the Loop boundary.
On top of this heuristics, after the Live range splitting across the
Loop boundary there interference graph is not colorable then we can
assign some of the splitted live ranges in the memory giving chances
of registers for the Live ranges(allocno) that spans the Loop and
have reference inside the Loop.
We can change the cost of allocno in memory based on the above
heuristics and take the above factor into consideration for the cost
calculation.
Thoughts Please?
How can this result in better allocations if you only ever spill at life-range split
points? The life-range covering the loop should already be assigned to >>memory
if required.
If the live range covering the loop assigned to memory, and there are
references inside the Loops then there are chances of spills inside
the loop which degrades the performance. If live range covering the
loops and there are no references inside the Loop, then assigning to
memory make reasonable as there are no references. Since there are no
references or not touched inside the Loop, there won't be load instruction to
restore which are required if there are any references. The store which
requires for the def of the Live range will be outside the Loops. This helps in
not degrading the performances as there are no load and store required for
spill and restore inside the Loops.
The above heuristics can also be accompanied with heuristics of the
number of use points in the Live range. Considering both the
heuristics will lead to better allocation if ever spill at the live
range split points and there are chances of getting colorable the
Interference graph without degrading the performance. Also if the Live range is
splitted at the Loop boundary then the spill at the split points of loop
boundary and registers is assigned to this live range will make it reasonable
if there are no references inside the Loops.
But if we split the life-range of not used in-loop-body pseudos at the loop
boundary the allocator should do the assignment to memory.
There is no need to add another heuristic here. Or do you say we don't split
the life-range of such pseudos at loop boundary?
In spill_cost calculations we are seeing the function ira_loop_edge_freq
triggered with the Live ranges that are live at the exit of the Loop but
are not referenced inside the Loops. We are still considering frequency in the
updation of spill cost for such live ranges . We have made a change
not to consider the freq of such live ranges in the updation of spill cost,
thus reducing the cost of such live ranges making it a chance to be in
memory giving the conflicting allocno the chances of getting the register.
There are chances , the conflicting allocno with respect to loops
which have references and will have more edge on getting the registers.
The changes are made in the following.
https://gcc.gnu.org/ml/gcc/2015-01/msg00033.html
I have some doubts that your changes will improve performance. The
edge_freq is used in top-bottom loop allocation. If we spilled allocno
for given regno in the parent loop, the cost of memory for allocno in
the current loop for given regno is decreased proportionally enter/exit
edges freq (loop back edges are excluded) and the cost of hard registers
are proportionally increased. My common sense says that it is a right
approach. When there are no references of regno in the loop we even
more definitely allocates memory for allocno in the loop. If we
allocate hard reg to pseudo in the parent loop and it is possible to
allocate hard register to the pseudo in the loop (without spilling other
pseudos) we should allocate the register as it removes loads/stores on
loop bounds. Again edge_freq helps to do this.
You propose to make edge freq to zero for some regnos and this means
ignoring any allocation (to hard reg or memory) in the parent loop.
Of course, common sense might be wrong in heuristics (I saw it several
times). So I'll try your patch on SPEC2000 on major platform x86-64. I
let you know them probably tomorrow or day after tomorrow.
Still if your changes work, your changes add loops inside another loop.
I believe the same can be achieved without introducing the new loops.