On 09/08/2015 03:12 PM, Evandro Menezes wrote:
cache miss and transcendental functions). You might also attack secondary
issues like throughput at the retirement stage, for example.
Our motivation stems from the fact that even modern, aggressively OOO
processors don't have orthogonal resources. Some insns depend on expensive
circuitry (area or power wise) that is added only once, making such insns
simply scalar, though most other insns enjoy multiple resources capable
of executing them in superscalar fashion. That's why we believe that a hybrid
approach might yield good results. We don't have data yet, since getting it
may require implementing the approach first.
I'd also argue that looking at an OOO pipeline in a steady state is not the
only approach. It's also important to consider how quickly the pipeline can
be replenished or warmed up to reach a steady state.
Which is why I mentioned optimizing for throughput at the retirement
stage rather than traditional latency scheduling.
That's from a real-world case -- the PA8000, where retirement bandwidth
was at a premium (relative to functional unit bandwidth).
jeff