Bradley Lucier <luc...@math.purdue.edu> writes:

> Are 12 registers not enough, in principle, to do scheduling before
> register allocation? I was getting a 15% speedup on some numerical
> codes, as pre-scheduling spaced out the vector loads among the
> floating-point computations.

If you are getting that kind of speedup (which I personally did not
expect) then this is clearly worth pursuing. It should be possible to
make it work at least in 64-bit mode. I recommend that you file a bug
report or two for cases which fail when using -fschedule-insns.

Ian