Hello, I want to propose a project for Google Summer of Code on title
"New static scheduling heuristic". I hope that Vlad Makarov from
Redhat or Andrey Belevantsev from ISP RAS will menthor this
application.
I will appreciate any feedback and will try to answer any questions
regarding my application.

========================================================================
           Abstract
  Most scheduling approaches consist of two steps: computation of sets
of available instructions at the current scheduling point, and
choosing one of them to become the next instruction.  This approach is
used in GCC compiler in current implementation of list scheduler
(known as Haifa scheduler) and new approach under developing –
selective scheduling.  While most activities aimed at improving
scheduling quality are trying to increase number of gathered available
instructions (such as above-mentioned selective scheduling, treegion
scheduling, improvements of region formation), choosing step plays an
important role too. Current implementation of scheduler uses static
priorities, based on critical path heuristic, which is known to behave
well while scheduling single basic blocks and become worse when
expanding scheduling boundaries to extended basic blocks or regions.
This project aims at developing a new priorities system, which would
consider not only the critical path, but other possible exits from
scheduling region and their probabilities.

           Current implementation
  First of all, depending on area, from which available instructions are
gathered, there are two schedulers present in GCC, namely sched-ebb
and sched-region. The scheduling infrastructure of these schedulers is
essentially the same, driven by haifa-sched.
  Current implementation of scheduling priorities is divided onto two
phases – static and dynamic. The most important is static part – it is
considered first, and in case of equal static priorities, dynamic
priority is used to determine which one of instructions is better.
  Given a dependence graph, static priority is calculated as default
instruction latency for all instructions without successors in
dependence graph, and maximum over all successors of sum of priority
of successor and cost of dependency – for other instructions. This
way, considering only one basic block, instructions that are on the
critical path to basic block exit are the most prioritized ones. It is
important to note that for sched-rgn dependencies leading from one
basic block to other are just discarded.

           Drawbacks of current implementation
  Critical path heuristic begins to fail to find good priorities when
used on areas, bigger than single basic block, i.e. with several
exits. For extended basic block this heuristic tends to give most
priority to instructions that contribute to the last (with the longest
path from entry in dependence graph) exit. At the same time
instructions, contributing to other exits, are delayed. Consequently,
if earliest exit from extended basic block is taken most of the time,
this can lead to performance regression.
  Behavior of critical path heuristic in case of region scheduling is
trickier. From the definition of priority, instructions from the end
of basic block are generally less prioritized than from the beginning.
Hence, when scheduling the end part of some basic block in region,
instructions from beginning of successive basic blocks may have bigger
priority than instructions from the end of current basic block. This,
in turn, can lead to performance regression too.

           Ways to overcome drawbacks
  The drawbacks of current priority scheme can be evaded in several
ways. First, critical path priorities can be adjusted dynamically with
regard to the current scheduling point. Second, compute new static
priorities with regard to probabilities of taking corresponding exits.
The background idea of this approach is that instruction priority is
increased for some amount for each exit it contributes to. This amount
depends on the probability of taking current exit and the depth of
instruction in dependence graph, consisting only from instructions,
contributing to current exit. In this way, instructions contributing
to several exits will get higher priority. Likewise, instructions that
are situated higher in dependence graph will get lower priority.
Third, taking into account processor resources while creating data
dependence graph can increase accuracy of priority calculation.
I aim mostly at the implementation of second approach as it seems to
solve known problems and then to improve it with resource-aware
priority computation.

           Roadmap
  I plan to implement and test on ia64 platform all three approaches
to solving static scheduling priority problem for sched-ebb and
sched-rgn and do it later for selective scheduling.


Reply via email to