On 01/09/2012 07:45 AM, Bernd Schmidt wrote:
On 12/23/2011 12:46 PM, Richard Sandiford wrote:
In the end I tried an ad-hoc approach in an attempt to do something
about (2), (3) and (4b). The idea was to construct a preliminary
"model" schedule in which the only objective is to keep register
pressure to a minimum. This schedule ignores pipeline characteristics,
latencies, and the number of available registers. The maximum pressure
seen in this initial model schedule (MP) is then the benchmark for ECC(X).
Interesting. I had also thought about trying to address the problem in
the scheduler, so I'll just share some of my thoughts (not intended as a
negative comment on your approach).
Essentially the problem I usually see is that the wider your machine
gets, the happier sched1 will be to fill unused slots with instructions
that could just as well be scheduled 200 cycles later. That's really an
artifact of forward scheduling, so why not schedule both forwards and
then backwards (or the other way round)? Produce an initial schedule,
then perform a fixup phase: start at the other end and look for
instructions that can be scheduled in a wide range of cycles, and move
them if doing so can be shown to reduce register pressure, and we retain
a correct schedule. This can be done without trying to have a global
pressure estimate which has the problems you noted.
I saw such approaches in the literature. It would be interesting to see
how it works in GCC.
Although, I believe that register pressure minimization pass before RA
and selective insn scheduler used only after RA could solve all this
problems. But it needs a lot of investigation to confirm this. It is
also hard for me to say what approach would require more efforts to
implement.
Doing this would require constructing a backwards DFA, and I gave up on
this for the moment after a day or so spent with genautomata, but it
should be possible.
Long ago (10 years ago) I considered to do this exactly for removing the
2nd separate insn scheduler by inserting spill code in already existing
insn schedule. But I decided not to do this.
Also I considered to generate DFAs with repeated insn issues for better
serving modulo scheduler.
There are a lot of directions of developing automatically generated
pipeline hazard recognizers going beyond (N)DFAs to simulate hidden
register renaming, different ooo processors look ahead buffers/queues.
All of these directions could be interesting research projects. But
even in the current state, imho GCC (N)DFA pipeline hazard recognizer
after its 10 years of existence is the most powerful one used in
industrial compilers.