Richard Henderson wrote:

I do not see why you should discourage the register allocator from using MMX registers. Moving through memory is clearly inefficient and enlarges the resulting code (even more so if the function containing the moves is inlined in several places).

First, what you think is "clearly inefficient" is at least two cycles
faster, at least for AMD (Intel hasn't published anything as useful as
instruction latencies since early PentiumPro).  I'm not sure what sort
of pipeline bypasses are or are not responsible, but *all* cross function
unit moves are discouraged.

I see. I did a speed test with GCC 4.1.0 and 3.4.4 on my Athlon XP, and you are right. Direct moves between general registers and MMX registers are still useful when optimizing for size, though. GCC could make a somewhat more sophisticated guess about whether to use secondary memory for such moves; right now it just disables them all.

Second, proper use of MMX requires proper placement of emms instructions.
Allowing the register allocator to use MMX registers at will breaks that.
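
[Illustration, not part of the original message.] A minimal sketch of the emms requirement, assuming GCC's <mmintrin.h> intrinsics on an x86 target. The helper name sum_lanes_scaled is hypothetical. MMX registers alias the x87 stack, so _mm_empty() (which emits emms) has to run after the last MMX operation and before any x87 floating-point code. On x86-64, double arithmetic normally goes through SSE, so the call matters mainly where x87 code follows.

```c
#include <mmintrin.h>  /* MMX intrinsics; x86 and x86-64 only */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add four 16-bit lanes with MMX, then scale the
 * lane total in floating point. */
static double sum_lanes_scaled(const int16_t a[4], const int16_t b[4],
                               double scale)
{
    __m64 va, vb, vsum;
    int16_t out[4];

    /* Load both operands into 64-bit MMX values. */
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);

    /* Lane-wise 16-bit add (paddw). */
    vsum = _mm_add_pi16(va, vb);
    memcpy(out, &vsum, sizeof out);

    /* emms: mark the x87 register stack as empty again before any
     * floating-point code runs. */
    _mm_empty();

    return scale * (out[0] + out[1] + out[2] + out[3]);
}
```

Here the programmer knows where the MMX section ends and can place the emms by hand. If the allocator put ordinary integer values into MMX registers on its own, no such boundary would exist in the source.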


r~
That is true, but would the register allocator choose MMX regs for anything other than MMX ops?

Regards,
Vahur
