"Kewen.Lin" <li...@linux.ibm.com> writes: > Hi Richard, > > Thanks for the comments! > > on 2020/6/2 上午1:59, Richard Sandiford wrote: >> Could you go into more detail about this choice of cost calculation? >> It looks like we first calculate per-group flags, which are true only if >> the unrolled offsets are valid for all uses in the group. Then we create >> per-candidate flags when associating candidates with groups. >> > > Sure. It checks every address type IV group to determine whether this > group is valid to use reg offset addressing mode. Here we only need to > check the first one and the last one, since the intermediates should > have been handled by split_address_groups. With unrolling the > displacement of the address can be offset-ed by (UF-1)*step, check the > address with this max offset whether still valid. If the check finds > it's valid to use reg offset mode for the whole group, we flag this > group. Later, when we create IV candidate for address group flagged, > we flag the candidate further. This flag is mainly for iv cand > costing, we don't need to scale up iv cand's step cost for this kind > of candidate.

But AIUI, this is calculating whether the uses in their original form
support all unrolled offsets. For ivopts, I think the question is really
whether the uses support all unrolled offsets when based on a given IV
candidate (which might not be the original IV).

E.g. there might be another IV candidate at a constant offset from the
original one, and the offsets might all be in range for that offset too.

> Imagining this loop is being unrolled, all the statements will be
> duplicated by UF. For the cost modeling against iv group, it's
> scaling up the cost by UF (here I simply excluded the compare_type
> since in most cases it is for the loop ending check). For the cost
> modeling against iv candidate, it's to focus on step costs: for an iv
> candidate we flagged before, it's taken as one time step cost, for the
> others, it's scaling up the step cost since the unrolling makes the step
> calculation happen UF times.
>
> This cost modeling is trying to simulate cost change after the
> unrolling, scaling up the costs accordingly. There are some things
> to be improved, like distinguishing the loop ending compare from others,
> or whether we need to tweak the other costs somehow since the scaling up
> probably causes imbalance in the existing cost framework, but during
> benchmarking I didn't find these matter, so I take it as simple as
> possible for now.
>
>
>> Instead, couldn't we take this into account in get_address_cost,
>> which calculates the cost of an address use for a given candidate?
>> E.g. after the main if-else at the start of the function,
>> perhaps it would make sense to add the worst-case offset to
>> the address in “parts”, check whether that too is a valid address,
>> and if not, increase var_cost by the cost of one add instruction.
>>
>
> IIUC, what you suggest is to tweak the iv group cost, if we find
> one address group is valid for reg offset mode, we price more on
> the pairs between this group and other non address-based iv cands.
> The question is how do we decide this add-on cost. For the test
> case I was working on initially, adding one cost (of add) doesn't
> work, the normal iv still won. We can price it more like two
> but what's the justification on this value, by heuristics?

Yeah, I was thinking of adding one instance of add_cost. If that
doesn't work, it'd be interesting to know why in more detail.

>> I guess there are two main sources of inexactness if we do that:
>>
>> (1) It might underestimate the cost because it assumes that vuse[0]
>> stands for all vuses in the group.
>>
>
> Do you mean we don't need one check function like mark_reg_offset_groups?
> Without it, vuse[0] might not be enough since we can't ensure the
> others are fine with additional displacement from unrolling. If we still
> have it, I think it's fine to just use vuse[0].
>
>> (2) It might overestimate the cost because it treats all unrolled
>> iterations as having the cost of the final unrolled iteration.
>>
>> (1) could perhaps be avoided by adding a flag to the iv_use to say
>> whether it wants this treatment. I think the flag approach suffers
>> from (2) too, and I'd be surprised if it makes a difference in practice.
>>
>
> Sorry, I didn't have the whole picture how to deal with uf for your proposal.
> But the flag approach considers uf in iv group cost calculation as well as
> iv cand step cost calculation.
>
> BR,
> Kewen
>
>> Thanks,
>> Richard
>>
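
For reference, the get_address_cost idea discussed in this thread (add the worst-case unrolled offset to the address, check whether it is still valid, and otherwise charge one add) could be sketched roughly as below. This is only an illustration under stated assumptions, not GCC code: `offset_is_valid` and `unrolled_var_cost` are hypothetical names, the offset is taken to be relative to whichever IV candidate is being costed, and address validity is reduced to a simple range test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the target's address validity check.  */
static bool offset_is_valid(int64_t off, int64_t min_off, int64_t max_off)
{
    return off >= min_off && off <= max_off;
}

/* Return var_cost, increased by one add_cost when the address of the
   final unrolled copy, offset + (uf - 1) * step, is no longer valid.
   OFFSET is the use's offset from the candidate being costed.  */
static unsigned unrolled_var_cost(unsigned var_cost, unsigned add_cost,
                                  int64_t offset, int64_t step, unsigned uf,
                                  int64_t min_off, int64_t max_off)
{
    assert(uf >= 1);  /* an unroll factor of 1 means "not unrolled" */
    int64_t worst = offset + (int64_t)(uf - 1) * step;
    if (!offset_is_valid(worst, min_off, max_off))
        var_cost += add_cost;
    return var_cost;
}
```

Because the check is made per (use, candidate) pair, a candidate at a constant offset from the original IV is judged on its own offsets. As noted in point (2) above, the sketch charges the cost of the final unrolled iteration for the whole use, so it may overestimate.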