On Wed, Mar 18, 2015 at 12:08:47PM +0100, Richard Biener wrote:
> Did you double-check if there are any differences in generated code?
> Esp. the SPEC INT benchmarks look odd - they don't contain any
> FP code.

SpecINT does contain some amount of floating point.  Off the top of my head:

   1) Bzip2 does a percentage calculation for fprintf;

   2) Libquantum uses sin/cos (which the compiler optimizes to sincos) in
      the initial setup to set up the data;

   3) I don't recall if the version of GCC used in Spec had switched over to
      using floating point for the register allocator or not;

   4) Perlbench has a lot of code that does floating point, but the main
      loop exercised in the Spec runs probably doesn't use FP.

Sorry I couldn't respond earlier; the corporate IMAP email server was down
for a period of time.

Anyway, I do see code changes.  Before the fix was made, if you used
-ffast-math, it kept the floating point constants around until reload.  When
reload could not find a reload to load the constant, it would push the
constant to memory, and do validize_mem on the address.  This created the
sequence:

	addis 9,2,.LC0@toc@ha
	addi 9,9,.LC0@toc@l
	lfd 0,0(9)

Because the address was a single register, it could also load the value into
the traditional Altivec registers:

	addis 9,2,.LC0@toc@ha
	addi 9,9,.LC0@toc@l
	lxsdx 32,0,9

And in fact the register allocator seemed to prefer loading constants into the
traditional Altivec registers instead of the traditional floating point
registers.

Once I made the change to force the constant to memory earlier, it would use
normal addressing, and generate something like:

	addis 9,2,.LC0@toc@ha
	lfs 0,.LC0@toc@l(9)

This meant that a lot of addi's are no longer generated, because the addi is
folded into the lfs/lfd instruction.

In addition, because the VSX memory instructions are register+register only,
whenever the code wanted to load floating point constants, it would prefer the
traditional floating point registers, which have register+offset addressing.
This meant in turn that any instruction that used an FP constant could
potentially be changed from a VSX form (e.g. xsmuldp) into a traditional FP
form (e.g. fmul) if all of the inputs and outputs were traditional floating
point registers.

Finally, I suspect that pushing the address out earlier means that it might be
keeping the address live a little bit longer, which could change things if we
are spilling GPRs.

One of the things I am working on for GCC 6.0 is going back to reworking the
address support to do a better job with fusion, and I believe it will reduce
the lifetime of some of these address temporaries.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
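
For anyone wanting to look at the constant-load sequences discussed in this message, a minimal sketch of a test case follows. It is hypothetical: the function name and the constant are made up for illustration and come neither from SPEC nor from the patch. The assumption is that compiling it for a VSX-capable target with something like "gcc -O2 -ffast-math -mcpu=power8 -S" produces a load of the constant from the TOC followed by a multiply.

```c
/* Hypothetical test case, not taken from SPEC or from the patch.
   The multiplier is an FP constant that is expected to be loaded from
   memory (the .LC0 entry in the listings) rather than synthesized.  */
double
scale_by_constant (double x)
{
  return x * 1.5;
}
```

Comparing the assembly from a compiler without the fix against one with it should show whether the addi disappears and whether the multiply is emitted as xsmuldp or fmul, though the exact output will depend on the compiler version and options.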