Hi Richard,

Thanks for the comments!

on 2020/6/2 1:59 AM, Richard Sandiford wrote:
> Could you go into more detail about this choice of cost calculation?
> It looks like we first calculate per-group flags, which are true only if
> the unrolled offsets are valid for all uses in the group.  Then we create
> per-candidate flags when associating candidates with groups.
> 

Sure.  It checks every address-type IV group to determine whether the
group can validly use the reg-offset addressing mode.  Here we only need
to check the first use and the last one, since the intermediate ones
should have been handled by split_address_groups.  With unrolling, the
displacement of the address can be offset by up to (UF-1)*step, so we
check whether the address is still valid with this maximum offset.  If
the check finds that reg-offset mode is valid for the whole group, we
flag the group.  Later, when we create an IV candidate for a flagged
address group, we flag the candidate as well.  This flag is mainly for
IV candidate costing: we don't need to scale up the step cost for this
kind of candidate.
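
To make the check concrete, here is a rough standalone sketch of the
idea, not the actual patch code.  offset_valid_p and its signed 16-bit
range are assumptions standing in for the real target query, and
group_keeps_reg_offset_p is a made-up name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the target's address legitimacy check.
   Assumption: displacements must fit in a signed 16-bit field.  */
static bool
offset_valid_p (long offset)
{
  return offset >= -32768 && offset <= 32767;
}

/* OFFSETS holds the displacements of the uses in one address group,
   sorted in increasing order.  Return true if the first and the last
   use still have a valid displacement once the largest extra offset
   introduced by unrolling, (UF - 1) * STEP, is added.  */
static bool
group_keeps_reg_offset_p (const long *offsets, size_t n, long step,
                          unsigned uf)
{
  assert (n > 0 && uf >= 1);
  long max_extra = (long) (uf - 1) * step;

  /* Only the two ends are checked; the uses in between are assumed
     to be covered by the earlier splitting of the group.  */
  return (offset_valid_p (offsets[0] + max_extra)
          && offset_valid_p (offsets[n - 1] + max_extra));
}
```

With this assumed range, a group with offsets {0, 32760} and step 8
passes for UF = 1 but fails for UF = 2, because 32760 + 8 no longer
fits.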

Imagine this loop being unrolled: all the statements will be
duplicated UF times.  For the cost modeling of an IV group, the cost is
scaled up by UF (here I simply excluded the compare type, since in most
cases it is the loop-ending check).  For the cost modeling of an IV
candidate, the focus is on the step cost: a candidate we flagged before
is charged the step cost once, while for the others the step cost is
scaled up, since unrolling makes the step calculation happen UF times.
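
The scaling described above could be sketched roughly as follows.  This
is a simplified illustration only; the enum and both function names are
hypothetical and do not match the real ivopts data structures:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified kinds of IV use.  */
enum use_type { USE_ADDRESS, USE_NONLINEAR, USE_COMPARE };

/* Group cost after unrolling by UF: every duplicated statement pays
   again, except compares, which are assumed to be the loop-ending
   check and so are not scaled.  */
static unsigned
scaled_group_cost (enum use_type type, unsigned cost, unsigned uf)
{
  assert (uf >= 1);
  return type == USE_COMPARE ? cost : cost * uf;
}

/* Candidate step cost after unrolling by UF: a candidate flagged for
   reg-offset addressing is stepped once per unrolled iteration, any
   other candidate is stepped UF times.  */
static unsigned
scaled_step_cost (bool reg_offset_cand, unsigned step_cost, unsigned uf)
{
  assert (uf >= 1);
  return reg_offset_cand ? step_cost : step_cost * uf;
}
```

For example, with UF = 4 a step cost of 2 stays 2 for a flagged
candidate and becomes 8 for an unflagged one.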

This cost modeling tries to simulate the cost change after unrolling
by scaling up the costs accordingly.  There are some things to be
improved, such as distinguishing the loop-ending compare from other
compares, and whether the other costs need tweaking since the scaling
up probably unbalances the existing cost framework.  But during
benchmarking I didn't find that these matter, so I kept it as simple as
possible for now.


> Instead, couldn't we take this into account in get_address_cost,
> which calculates the cost of an address use for a given candidate?
> E.g. after the main if-else at the start of the function,
> perhaps it would make sense to add the worst-case offset to
> the address in “parts”, check whether that too is a valid address,
> and if not, increase var_cost by the cost of one add instruction.
> 

IIUC, what you suggest is to tweak the IV group cost: if we find that
an address group is valid for reg-offset mode, we charge more for the
pairs between this group and the other, non-address-based IV
candidates.  The question is how we decide this add-on cost.  For the
test case I was working on initially, adding one cost (of an add)
doesn't work; the normal IV still won.  We could charge more, like two,
but what's the justification for this value?  Heuristics?
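
For reference, my reading of the quoted suggestion is roughly the
sketch below.  It is only an illustration: offset_valid_p, its 16-bit
range and the penalty_adds knob are all assumptions, the last one added
just to show where the "one or two" question comes in:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the target's address legitimacy check.
   Assumption: displacements must fit in a signed 16-bit field.  */
static bool
offset_valid_p (long offset)
{
  return offset >= -32768 && offset <= 32767;
}

/* Return VAR_COST, increased by PENALTY_ADDS add instructions when the
   worst-case unrolled offset, OFFSET + (UF - 1) * STEP, is no longer a
   valid displacement.  */
static unsigned
address_cost_with_unroll (unsigned var_cost, long offset, long step,
                          unsigned uf, unsigned add_cost,
                          unsigned penalty_adds)
{
  assert (uf >= 1);
  long worst = offset + (long) (uf - 1) * step;

  if (!offset_valid_p (worst))
    var_cost += penalty_adds * add_cost;
  return var_cost;
}
```

Here the result depends directly on penalty_adds, which is exactly the
value I don't know how to justify.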

> I guess there are two main sources of inexactness if we do that:
> 
> (1) It might underestimate the cost because it assumes that vuse[0]
>     stands for all vuses in the group.
> 

Do you mean we don't need a check function like mark_reg_offset_groups?
Without it, vuse[0] might not be enough, since we can't ensure the
others are fine with the additional displacement from unrolling.  If we
still have it, I think it's fine to just use vuse[0].

> (2) It might overestimates the cost because it treats all unrolled
>     iterations as having the cost of the final unrolled iteration.
>
> (1) could perhaps be avoided by adding a flag to the iv_use to say
> whether it wants this treatment.  I think the flag approach suffers
> from (2) too, and I'd be surprised if it makes a difference in practice.
> 

Sorry, I don't have the whole picture of how your proposal deals with
UF.  But the flag approach considers UF in the IV group cost calculation
as well as in the IV candidate step cost calculation.

BR,
Kewen

> Thanks,
> Richard
> 
