RE: [AArch64] A question about Cortex-A57 pipeline description

Evandro Menezes Tue, 15 Sep 2015 08:55:32 -0700

Indeed, we observed some scheduling problems, which we believe have more
to do with the scheduling algorithm than with the model DFA, as we said in
https://gcc.gnu.org/ml/gcc/2015-09/msg00118.html


Cheers,

-- 
Evandro Menezes                              Austin, TX

> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Nikolai Bozhenov
> Sent: Monday, September 14, 2015 2:28
> To: James Greenhalgh
> Cc: gcc@gcc.gnu.org
> Subject: Re: [AArch64] A question about Cortex-A57 pipeline description
> 
> Thanks for the reply! I see your point. Indeed, I've also seen cases
> where the load pipeline was overused at the beginning of a basic block,
> whereas at the end the code got stuck with a bunch of stores and no
> other instructions to run in parallel. And indeed, relaxing the
> restrictions makes things even worse in some cases. Anyway, I don't
> believe it's the best we can do. I'm going to have a closer look at the
> scheduler and see what can be done to improve the situation.
> 
> Nikolai
> 
> 
> On 09/11/2015 07:21 PM, James Greenhalgh wrote:
> > On Fri, Sep 11, 2015 at 04:31:37PM +0100, Nikolai Bozhenov wrote:
> >> Hi!
> >>
> >> Recently I got somewhat confused by Cortex-A57 pipeline description
> >> in GCC and I would be grateful if you could help me understand a few
> >> unclear points.
> > Sure,
> >
> >> Particularly I am interested in how memory operations (loads/stores)
> >> are scheduled. It seems that according to the cortex-a57.md file,
> >> firstly, two memory operations may never be scheduled at the same
> >> cycle and, secondly, two loads may never be scheduled at two
> >> consecutive cycles:
> >>
> >>       ;; 5.  Two pipelines for load and store operations: LS1, LS2.
> >>       ;;     The most valuable thing we can do is force a structural
> >>       ;;     hazard to split up loads/stores.
> >>
> >>       (define_cpu_unit "ca57_ls_issue" "cortex_a57")
> >>       (define_cpu_unit "ca57_ldr, ca57_str" "cortex_a57")
> >>       (define_reservation "ca57_load_model" "ca57_ls_issue,ca57_ldr*2")
> >>       (define_reservation "ca57_store_model" "ca57_ls_issue,ca57_str")
> >>
> >> However, the Cortex-A57 Software Optimization Guide states that the
> >> core is able to execute one load operation and one store operation
> >> every cycle. And that agrees with my experiments. Indeed, a loop
> >> consisting of 10 loads, 10 stores and several arithmetic operations
> >> takes on average about 10 cycles per iteration, provided that the
> >> instructions are intermixed properly.
> >>
> >> So, what is the purpose of additional restrictions imposed on the
> >> scheduler in cortex-a57.md file? It doesn't look like an error.
> >> Rather, it looks like a deliberate decision.
> > When designing the model for the Cortex-A57 processor, I was primarily
> > trying to build a model which would increase the blend of utilized
> > pipelines on each cycle across a range of benchmarks, rather than to
> > accurately reflect the constraints listed in the Cortex-A57 Software
> > Optimisation Guide [1].
> >
> > My reasoning here is that the Cortex-A57 is a high-performance
> > processor, and an accurate model would be infeasible to build. Because
> > of this, it is unlikely that the model in GCC will be representative
> > of the true state of the processor, and consequently GCC may make
> > decisions which result in an instruction stream which would bias
> > towards one execution pipeline. In particular, given a less
> > restrictive model, GCC will try to hoist more loads earlier in the
> > basic block, which can result in poorer utilization of the other
> > execution pipelines.
> >
> > In my experiments, I found this model to be more beneficial across a
> > range of benchmarks than a model with the additional restrictions I
> > imposed relaxed.
> > I'd be happy to consider counter-examples where this modeling produces
> > suboptimal results - and where the changes you suggest are sufficient
> > to resolve the issue.
> >
> > Thanks,
> > James
> >
> > ---
> > [1]: Cortex-A57 Software Optimisation Guide
> >      http://infocenter.arm.com/help/topic/com.arm.doc.uan0015a/cortex_a57_software_optimisation_guide_external.pdf
> >
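
To make the two restrictions discussed in the quoted thread concrete,
here is a minimal sketch in Python (not GCC code) that replays the two
quoted reservations cycle by cycle. The per-cycle tables are transcribed
by hand from the define_reservation strings, reading "," as "advance one
cycle" and "*2" as "repeat for two cycles". The names RESERVATIONS and
issue_cycles are hypothetical, and the strictly in-order greedy issue is
an assumption made for illustration; GCC's scheduler and DFA are more
involved, and other units and the issue width are ignored here.

```python
# Per-cycle unit reservations, transcribed by hand from the quoted
# cortex-a57.md fragment (an illustration, not generated by GCC):
#   ca57_load_model  = "ca57_ls_issue,ca57_ldr*2"
#   ca57_store_model = "ca57_ls_issue,ca57_str"
RESERVATIONS = {
    "load": [{"ca57_ls_issue"}, {"ca57_ldr"}, {"ca57_ldr"}],
    "store": [{"ca57_ls_issue"}, {"ca57_str"}],
}


def issue_cycles(ops):
    """Return the issue cycle of each op under the modelled hazards.

    Assumption: ops issue strictly in the given order, each at the
    earliest cycle where none of its units is already reserved.
    """
    busy = {}      # cycle -> set of units reserved in that cycle
    cycles = []
    earliest = 0   # in-order issue: never earlier than the previous op
    for op in ops:
        pattern = RESERVATIONS[op]
        cycle = earliest
        # Slide forward until every cycle of the pattern is conflict-free.
        while any(units & busy.get(cycle + i, set())
                  for i, units in enumerate(pattern)):
            cycle += 1
        for i, units in enumerate(pattern):
            busy.setdefault(cycle + i, set()).update(units)
        cycles.append(cycle)
        earliest = cycle
    return cycles


if __name__ == "__main__":
    print(issue_cycles(["load", "load"]))
    print(issue_cycles(["load", "store"]))
    print(issue_cycles(["load", "store"] * 10)[-1] + 1)
```

Under these assumptions, two loads end up two cycles apart, a load and a
store end up one cycle apart, and 10 perfectly interleaved load/store
pairs occupy 20 issue cycles in the model, compared with the roughly 10
cycles per iteration measured on hardware as reported in the thread.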
