He Xiao wrote:
> When I finished the scheduler, I got a strange phenomenon:
> The CPI is reduced, but the total execution cycles are dramatically increased.
If this is a machine with a small number of registers, then try
disabling the first instruction scheduling pass, the one that runs before
register allocation. That pass tends to increase register pressure so much
that you end up with worse code than if you didn't schedule at all. The
second scheduling pass, which runs after register allocation, will still run.
See for instance the flag_schedule_insns code in config/i386/i386.c in
the function optimization_options.
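As a rough illustration of that pattern, here is a minimal standalone sketch. It is not the actual i386.c code: the global below is only a stand-in for GCC's flag_schedule_insns, and my_optimization_options is a hypothetical name for a port's own hook.

```c
/* Stand-in for GCC's global flag behind -fschedule-insns.  In the real
   compiler this variable belongs to GCC itself; it is defined here only
   so that the sketch compiles on its own.  */
int flag_schedule_insns = 1;

/* Hypothetical target hook, modelled on the optimization_options
   function mentioned above.  LEVEL is the -O level.  */
void
my_optimization_options (int level)
{
  /* At -O2 and above, turn off the scheduling pass that runs before
     register allocation.  The pass that runs after register
     allocation is not touched here.  */
  if (level > 1)
    flag_schedule_insns = 0;
}
```

For a quick experiment without touching the port, passing -fno-schedule-insns on the command line should have the same effect on the first pass.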
Try debugging the scheduler to make sure it is doing what it should be
doing. If you use options like
-fsched-verbose=9 -fdump-rtl-sched2
then you will get an output file that contains info about the scheduling
choices that were made for the second scheduling pass. This file
contains estimated issue cycles for all of the instructions that were
scheduled. You can try writing small testcases to verify that you get
the result you expect for specific cases. The output will look like this:
    ;; Ready list (t = 6): 9
    ;; 6--> 9 dx=ax*0x4+ax :decodern,p0
where 6 is the issue cycle, 9 is the insn uid, and the next part shows
what the instruction is doing (this is an x86 lea).
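A small testcase for this kind of check might look like the sketch below. The file name test.c and the function name times5 are made up for illustration. A multiply by 5 is commonly compiled to a single lea on x86, which would match the dx=ax*0x4+ax line above, but whether you actually get a lea depends on the compiler version and options.

```c
/* test.c: hypothetical testcase for inspecting the sched2 dump.

   Compile with something like
       gcc -O2 -fsched-verbose=9 -fdump-rtl-sched2 -c test.c
   and look at the dump file written next to the output; its name
   typically ends in "sched2".  */
int
times5 (int x)
{
  /* x * 5 == x * 4 + x, which fits the base + index * scale form
     that an x86 lea can compute.  */
  return x * 5;
}
```

Keeping each testcase down to one or two operations makes it easier to match insn uids in the dump against the source.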
Jim