> The optimization might be useful for some other processors which
> have direct move insns for the two considered classes and when IRA for
> some reasons did not use the class union. At least I see
> that we could try this for ARM (spilling general regs into VF regs)
> and for extended powerpc architecture (spilling general regs into fp
> regs). What is only necessary is just to define two macros. I am
> going to do it for ARM and see is this optimization beneficial for
> OMAP4. Although I think it is not as fp units with VF regs in ARM
> implementations I know are too separate from integer units.

There is a cost associated with using the VFP register bank, and on older cores like the Cortex-A8 there is a penalty for moving values from the VFP register bank to the integer register bank, so this needs to be looked at carefully on a per-core basis. If you are benchmarking this on a Cortex-A9 (the core used in the OMAP4), I would suggest turning on Neon in your builds rather than defaulting to the standard vfpv3-d16 configuration, because Neon also brings the SIMD unit into play and lets you see the full effect.

Thanks,
Ramana