On Mon, 18 Jun 2012, William J. Schmidt wrote:

> On Mon, 2012-06-11 at 13:40 +0200, Richard Guenther wrote:
> > On Fri, 8 Jun 2012, William J. Schmidt wrote:
> > 
> <snip>
> > 
> > Hmm.  I don't like this patch or its general idea too much.  Instead
> > I'd like us to move more of the cost model detail to the target, giving
> > it a chance to look at the whole loop before deciding on a cost.  ISTR
> > posting the overall idea at some point, but let me repeat it here instead
> > of trying to find that e-mail.
> > 
> > The basic interface of the cost model should be, in targetm.vectorize
> > 
> >   /* Tell the target to start cost analysis of a loop or a basic-block
> >      (if the loop argument is NULL).  Returns an opaque pointer to
> >      target-private data.  */
> >   void *init_cost (struct loop *loop);
> > 
> >   /* Add cost for N vectorized-stmt-kind statements in vector_mode.  */
> >   void add_stmt_cost (void *data, unsigned n,
> >                   vectorized-stmt-kind,
> >                       enum machine_mode vector_mode);
> > 
> >   /* Tell the target to compute and return the cost of the accumulated
> >      statements and free any target-private data.  */
> >   unsigned finish_cost (void *data);
> > 
> > with eventually slightly different signatures for add_stmt_cost
> > (like pass in the original scalar stmt?).
> > 
> > It allows the target, at finish_cost time, to evaluate things like
> > register pressure and resource utilization.
> > 
> > Thanks,
> > Richard.
> 
> I've been looking at this in between other projects.  I wanted to be
> sure I understood the SLP infrastructure and whether it would cause any
> problems.  It looks to me like it will be mostly ok.  One issue I
> noticed is a possible difference in the order in which SLP instructions
> are analyzed and the order in which the instructions are "issued" during
> transformation.
> 
> For both loop analysis and basic block analysis, SLP trees are
> constructed and analyzed prior to examining other vectorizable
> instructions.  Their costs are calculated and stored in the SLP trees at
> this time.  Later, when transforming statements to their vector
> equivalents, instructions in the block (or loop body) are processed in
> order until the first instruction that's part of an SLP tree is
> encountered.  At that point, every instruction that's part of any SLP
> tree is transformed; then the vectorizer continues with the remaining
> non-SLP vectorizable statements.
> 
> So if we do the natural and easy thing of placing calls to add_stmt_cost
> everywhere that costs are calculated today, the order that those costs
> are presented to the back end model will possibly be different than the
> order they are actually "emitted."

Interesting.  But I suppose this is similar to how pattern statements
are handled?  Thus, the whole pattern sequence is processed when
we encounter the "main" pattern statement?

> For a first cut at this, I suggest ignoring the problem other than to
> document it as an opportunity for improvement.  Later we could improve
> it by using an add_stmt_slp_cost () interface (or adding an is_slp
> flag), and another interface to be called at the time during analysis
> when the SLP statements will be issued during transformation.  This
> would allow the back end model to queue up the SLP costs in a separate
> vector and later place them in its internal structures at the
> appropriate place.
>
> It should eventually be possible to remove these fields/accessors:
> 
>  * STMT_VINFO_{IN,OUT}SIDE_OF_LOOP_COST
>  * SLP_TREE_{IN,OUT}SIDE_OF_LOOP_COST
>  * SLP_INSTANCE_{IN,OUT}SIDE_OF_LOOP_COST
> 
> However, I think this should be delayed until we have the basic
> infrastructure in place for the new model and well-tested.

Indeed.
 
> The other issue is that we should have the model track both the inside
> and outside costs if we're going to get everything into the target
> model.  For a first pass we can ignore this and keep the existing logic
> for the outside costs.  Later we should add some interfaces analogous to
> add_stmt_cost such as add_stmt_prolog_cost and add_stmt_epilog_cost so
> the model can track this stuff as carefully as it wants to.

Outside costs are merely added to the niter * inner-cost metric to
be compared with the scalar cost niter * scalar-cost, right?  Thus
they would be tracked completely separate - eventually similar to
how we compute the cost of the scalar loop.

> So, I'd propose going at this in several phases:
> 
> (1) Add calls to the new interface without disturbing existing logic;
> modify the profitability algorithms to query the new model for inside
> costs.  Default algorithm for the model is to just sum costs as is done
> today.

Right.

> (x) Add heuristics to target models as desired.
> (2) Handle the SLP ordering problem.
> (3) Handle outside costs in the target model.
> (4) Remove the now unnecessary cost fields and the calls that set them.
> 
> Item (x) can happen anytime after item (1).
> 
> I don't think this work is terribly difficult, just a bit tedious.  The
> only really time-consuming aspect of it will be in very careful testing
> to keep from changing existing behavior.
> 
> All comments welcome -- please let me know what you think.

I think this is a reasonable plan.

Thanks,
Richard.

Reply via email to