FX wrote:
The best way to test IRA is to build and use the branch. It is easy to
compare the old RA (which is the default on the branch) and IRA (the -fira
option switches IRA on). I'd recommend trying the following option sets:
-fira
-fira -fira-algorithm=CB
OK, I've done that and I see a 40% to 60% increase in compilation time
for the first (Fortran) testcase I tried. Is that expected?
Yes, that is a known problem at -O0. The old allocator does not use the
global allocator at -O0, whereas IRA is always used, even at -O0. The correct
comparison would be at -O2. There are several solutions to the problem:
o We could make only reload work at -O0. In this case, the
time will be the same.
o We could prevent regional allocation at -O0. In this case, the
slowdown would be 20% (I guess).
o Use a very fast and simple local allocation.
o Or just ignore this.
I'd prefer the second solution.
With the compiler from the ira branch on x86_64-linux, here are the
timings reported by "gfortran -c -time -save-temps" with and without
IRA (two timings provided for each set of options, to check
reproducibility):
With -O0
# f951 148.97 9.92
# as 3.95 0.18
# f951 137.51 7.05
# as 3.98 0.17
With -O0 -fira
# f951 223.89 10.91
# as 3.67 0.18
# f951 218.98 8.43
# as 3.61 0.19
With -O0 -fira -fira-algorithm=CB
# f951 191.32 9.03
# as 3.65 0.15
# f951 190.92 8.96
# as 3.63 0.18
(The testcase is 400k lines of preprocessed Fortran code, 16M in size,
available here:
http://www.pci.unizh.ch/vandevondele/tmp/all_cp2k_gfortran.f90.gz)
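For reference, the relative slowdown implied by the f951 numbers above can be worked out with a small script. This is only a sketch: it assumes the first column printed by -time is user CPU time in seconds, it averages the two runs of each option set, and the names f951_seconds and slowdown_percent are made up for illustration.

```python
from statistics import mean

# f951 timings in seconds, copied from the runs quoted above.
# Assumption: the first column reported by -time is user CPU time.
f951_seconds = {
    "-O0": [148.97, 137.51],
    "-O0 -fira": [223.89, 218.98],
    "-O0 -fira -fira-algorithm=CB": [191.32, 190.92],
}


def slowdown_percent(baseline, candidate):
    """Percentage increase of the mean candidate time over the mean baseline time."""
    return (mean(candidate) / mean(baseline) - 1.0) * 100.0


baseline = f951_seconds["-O0"]

# Slowdown of each IRA option set relative to plain -O0.
slowdowns = {
    opts: slowdown_percent(baseline, times)
    for opts, times in f951_seconds.items()
    if opts != "-O0"
}

for opts, pct in slowdowns.items():
    print(f"{opts}: +{pct:.1f}% f951 time vs. -O0")
```

With these inputs the averaged slowdown is about 55% for -fira and about 33% for -fira -fira-algorithm=CB. The assembler times are left out because they barely change between runs.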
Thanks, I'll check it.