"Kewen.Lin" <li...@linux.ibm.com> writes:
> Hi Richard,
>
> Thanks for the comments!
>
> on 2020/6/2 1:59 AM, Richard Sandiford wrote:
>> Could you go into more detail about this choice of cost calculation?
>> It looks like we first calculate per-group flags, which are true only if
>> the unrolled offsets are valid for all uses in the group.  Then we create
>> per-candidate flags when associating candidates with groups.
>> 
>
> Sure.  It checks every address-type IV group to determine whether the
> group can validly use reg offset addressing mode.  Here we only need to
> check the first use and the last use, since the intermediate ones should
> have been handled by split_address_groups.  With unrolling, the
> displacement of the address can grow by up to (UF-1)*step, so we check
> whether the address is still valid with this maximum offset.  If the
> check finds that reg offset mode is valid for the whole group, we flag
> the group.  Later, when we create an IV candidate for a flagged address
> group, we flag the candidate too.  This flag is mainly for iv cand
> costing: we don't need to scale up the iv cand's step cost for this kind
> of candidate.

But AIUI, this is calculating whether the uses in their original form
support all unrolled offsets.  For ivopts, I think the question is really
whether the uses support all unrolled offsets when based on a given IV
candidate (which might not be the original IV).

E.g. there might be another IV candidate at a constant offset
from the original one, and the offsets might all be in range
for that offset too.
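
A toy illustration of that distinction (a minimal sketch, not ivopts
code).  It assumes a hypothetical target whose reg offset addressing
accepts signed 16-bit displacements, and models a use and a candidate
only by their constant offsets from the original IV.  The function name,
its parameters and the displacement range are all made up for the
example.

```python
# Toy model, not GCC code.  The displacement range is an assumed
# signed 16-bit field; real targets differ.
MIN_DISP, MAX_DISP = -32768, 32767


def unrolled_offsets_valid(use_offset, cand_offset, step, uf):
    """Return True if every unrolled copy of the use has an in-range
    displacement when its address is expressed relative to the candidate."""
    for k in range(uf):
        disp = use_offset - cand_offset + k * step
        if not MIN_DISP <= disp <= MAX_DISP:
            return False
    return True


# Use at offset 32760 from the original IV, step 16, unroll factor 4.
# Relative to the original IV (offset 0) the last copy needs 32808.
print(unrolled_offsets_valid(32760, 0, 16, 4))    # False
# Relative to a candidate 64 bytes ahead, the copies need 32696..32744.
print(unrolled_offsets_valid(32760, 64, 16, 4))   # True
```

In this model the answer depends on the candidate, which is why a
per-group flag computed from the uses alone can disagree with it.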

> Imagine this loop being unrolled: all the statements will be
> duplicated UF times.  For the cost modeling of an iv group, the cost
> is scaled up by UF (here I simply excluded the compare_type, since in
> most cases it is for the loop-ending check).  For the cost modeling
> of an iv candidate, the focus is on step costs: for an iv candidate
> we flagged before, the step cost is counted once; for the others,
> the step cost is scaled up, since unrolling makes the step
> calculation happen UF times.
>
> This cost modeling tries to simulate the cost change after
> unrolling, scaling up the costs accordingly.  There are some things
> to be improved, like distinguishing the loop-ending compare from other
> compares, or whether the other costs need tweaking since the scaling
> probably unbalances the existing cost framework, but during
> benchmarking I didn't find that these matter, so I kept it as simple
> as possible for now.
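
For reference, the scaling described in the quoted text can be modelled
roughly as below.  This is a toy sketch on one reading of the
description, not the patch itself: the function names and the boolean
parameters are hypothetical, and "excluded the compare_type" is taken to
mean that compare groups are left unscaled.

```python
# Toy model of the scaling described in the quoted text; not the patch.
def scaled_group_cost(group_cost, uf, is_compare):
    """Scale a group's cost by UF, leaving compare groups unscaled."""
    return group_cost if is_compare else group_cost * uf


def scaled_step_cost(step_cost, uf, cand_flagged):
    """A flagged candidate pays for one step; others pay for UF steps."""
    return step_cost if cand_flagged else step_cost * uf


# With UF = 4: an address group of cost 3 becomes 12, while a flagged
# candidate keeps its step cost of 2 and an unflagged one pays 8.
print(scaled_group_cost(3, 4, is_compare=False))   # 12
print(scaled_step_cost(2, 4, cand_flagged=True))   # 2
print(scaled_step_cost(2, 4, cand_flagged=False))  # 8
```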
>
>
>> Instead, couldn't we take this into account in get_address_cost,
>> which calculates the cost of an address use for a given candidate?
>> E.g. after the main if-else at the start of the function,
>> perhaps it would make sense to add the worst-case offset to
>> the address in “parts”, check whether that too is a valid address,
>> and if not, increase var_cost by the cost of one add instruction.
>> 
>
> IIUC, what you suggest is to tweak the iv group cost: if we find
> that an address group is valid for reg offset mode, we charge more for
> the pairs between this group and other non-address-based iv cands.
> The question is how we decide this add-on cost.  For the test
> case I was working on initially, adding one cost (of add) doesn't
> work; the normal iv still won.  We can price it higher, like two,
> but what's the justification for this value?  Heuristics?

Yeah, I was thinking of adding one instance of add_cost.  If that
doesn't work, it'd be interesting to know why in more detail.
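
The suggestion could be modelled roughly as follows.  This is a toy
sketch, not get_address_cost itself: the helper names, the parameters
and the signed 16-bit displacement range are assumptions made for the
example, and the starting displacement is assumed to be valid already.

```python
# Toy model of the suggestion, not GCC's get_address_cost.
MIN_DISP, MAX_DISP = -32768, 32767   # assumed displacement range


def is_valid_disp(disp):
    """Stand-in for a target's address validity check."""
    return MIN_DISP <= disp <= MAX_DISP


def address_cost_with_unroll(var_cost, disp, step, uf, add_cost):
    """Charge one extra add if the worst-case unrolled displacement
    no longer fits the addressing mode."""
    worst = disp + (uf - 1) * step
    if not is_valid_disp(worst):
        var_cost += add_cost
    return var_cost


# The last of 4 copies needs 32808, so one add is charged on top of 2.
print(address_cost_with_unroll(2, 32760, 16, 4, add_cost=1))  # 3
# All copies fit, so the cost is unchanged.
print(address_cost_with_unroll(2, 100, 16, 4, add_cost=1))    # 2
```

The single add is charged whenever the last copy is out of range, even
if the earlier copies would fit.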

>> I guess there are two main sources of inexactness if we do that:
>> 
>> (1) It might underestimate the cost because it assumes that vuse[0]
>>     stands for all vuses in the group.
>> 
>
> Do you mean we don't need a check function like mark_reg_offset_groups?
> Without it, vuse[0] might not be enough, since we can't ensure the
> others are fine with the additional displacement from unrolling.  If we
> still have it, I think it's fine to just use vuse[0].
>
>> (2) It might overestimate the cost because it treats all unrolled
>>     iterations as having the cost of the final unrolled iteration.
>>
>> (1) could perhaps be avoided by adding a flag to the iv_use to say
>> whether it wants this treatment.  I think the flag approach suffers
>> from (2) too, and I'd be surprised if it makes a difference in practice.
>> 
>
> Sorry, I don't have the whole picture of how to deal with UF in your
> proposal.  But the flag approach considers UF in the iv group cost
> calculation as well as in the iv cand step cost calculation.
>
> BR,
> Kewen
>
>> Thanks,
>> Richard
>> 
