On June 21, 2016 5:50:26 PM GMT+02:00, James Greenhalgh <james.greenha...@arm.com> wrote: > >On Fri, Jun 03, 2016 at 12:39:42PM +0200, Richard Biener wrote: >> On Thu, Jun 2, 2016 at 6:53 PM, James Greenhalgh >> <james.greenha...@arm.com> wrote: >> > >> > Hi, >> > >> > This patch introduces a new target hook, to be used like >BRANCH_COST but >> > with a guaranteed unit of measurement. We want this to break away >from >> > the current ambiguous uses of BRANCH_COST. >> > >> > BRANCH_COST is used in ifcvt.c in two types of comparisons. One >against >> > instruction counts - where it is used as the limit on the number of >new >> > instructions we are permitted to generate. The other (after >multiplying >> > by COSTS_N_INSNS (1)) directly against RTX costs. >> > >> > Of these, a comparison against RTX costs is the more easily >understood >> > metric across the compiler, and the one I've pulled out to the new >hook. >> > To keep things consistent for targets which don't migrate, this new >hook >> > has a default value of BRANCH_COST * COSTS_N_INSNS (1). >> > >> > OK? >> >> How does the caller compute "predictable"? There are some archs >where >> an information on whether this is a forward or backward jump is more >> useful I guess. Also at least for !speed_p the distance of the >branch is >> important given not all targets support arbitrary branch offsets. > >Just through a call to predictable_edge_p. It isn't perfect. My worry >with adding more details of the branch is that you end up with a >nonsense >target implementation that tries way too hard to be clever. But, I >don't >mind passing the edge through to the target hook, that way a target has >it if they want it. In this patch revision, I pass the edge through. > >> I remember that at the last Cauldron we discussed to change things to >> compare costs of sequences of instructions rather than giving targets >no >> context with just asking for single (sub-)insn rtx costs. > >I've made better use of seq_cost in this respin. Bernd was right, >constructing dummy RTX just for costs, then discarding it, then >constructing the actual RTX for matching doesn't make sense as a >pipeline. >Better just to construct the real sequence and use the cost of that. > >In this patch revision, I started by removing the idea that this costs >a branch at all. It doesn't, the use of this hook is really a target >trying to limit if-convert to not end up pulling too much on to the >unconditional path. It seems better to expose that limit directly by >explicitly asking for the maximum cost of an unconditional sequence we >would create, and comparing against seq_cost of the new RTL. This saves >a target trying to figure out what is meant by a cost of a branch. > >Having done that, I think I can see a clearer path to getting the >default hook implementation in shape. I've introduced two new params, >which give maximum costs for the generated sequence (one for a >"predictable" >branch, one for "unpredictable") in the speed_p cases. I'm not >expecting it >to be useful to give the user control in the case we are compiling for >size - whether this is a size win or not is independent of whether the >branch is predictable. > >For the default implementation, if the parameters are not set, I just >multiply BRANCH_COST through by COSTS_N_INSNS (1) for size and >COSTS_N_INSNS (3) for speed. I know this is not ideal, but I'm still >short >of ideas on how best to form the default implementation.
How bad is it in e.g. CSiBE? >we're >still potentially going to introduce performance regressions for >targets >that don't provide an implementation of the new hook, or a default >value >for the new parameters. It does mean we can keep the testsuite clean by >setting parameter values suitably high for all targets that have >conditional move instructions. > >The new default causes some changes in generated conditional move >sequences >for x86_64. Whether these changes are for the better or not I can't >say. > >This first patch introduces the two new parameters, and uses them in >the >default implementation of the target hook. s/precitable/predictable/ ? Present tense in documentation (s/will try to/tries to/). s/should return/returns/ TARGET_MAX_NOCE_IFCVT_SEQ_COST (bool @var{speed_p}, edge @var{e}) talks about predictable_p but doesn't document e. +DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST, + "max-rtl-if-conversion-unpredictable-cost", + "Maximum permissible cost for the sequence that would be " + "generated by the RTL if-conversion pass for a branch which " + "is considered predictable.", + 40, 0, 200) unpredictable. Present tense also in target.def. +@code{predictable_p} is true no predictable_p anymore but e missing in docs. /Then multiply through by/s/through by/with/ thanks, > >Bootstrapped on x86_64 and aarch64 with no issues. > >OK? > >Thanks, >James > >--- >2016-06-21 James Greenhalgh <james.greenha...@arm.com> > > * target.def (max_noce_ifcvt_seq_cost): New. > * doc/tm.texi.in (TARGET_MAX_NOCE_IFCVT_SEQ_COST): Document it. > * doc/tm.texi: Regenerate. > * targhooks.h (default_max_noce_ifcvt_seq_cost): New. > * targhooks.c (default_max_noce_ifcvt_seq_cost): New. > * params.def (PARAM_MAX_RTL_IF_CONVERSION_PREDICTABLE_COST): New. > (PARAM_MAX_RTL_IF_CONVERSION_UNPREDICTABLE_COST): Likewise. > * doc/invoke.texi: Document new params.