On Wed, Apr 27, 2016 at 04:13:33PM -0500, Evandro Menezes wrote:
> gcc/
>     * config/aarch64/aarch64-protos.h
>     (AARCH64_APPROX_MODE): New macro.
>     (AARCH64_APPROX_{NONE,SP,DP,DFORM,QFORM,SCALAR,VECTOR,ALL}):
>     Likewise.
>     (tune_params): New member "approx_rsqrt_modes".
>     * config/aarch64/aarch64-tuning-flags.def
>     (AARCH64_EXTRA_TUNE_APPROX_RSQRT): Remove macro.
>     * config/aarch64/aarch64.c
>     (generic_tunings): New member "approx_rsqrt_modes".
>     (cortexa35_tunings): Likewise.
>     (cortexa53_tunings): Likewise.
>     (cortexa57_tunings): Likewise.
>     (cortexa72_tunings): Likewise.
>     (exynosm1_tunings): Likewise.
>     (thunderx_tunings): Likewise.
>     (xgene1_tunings): Likewise.
>     (use_rsqrt_p): New argument for the mode and use new member from
>     "tune_params".
>     (aarch64_builtin_reciprocal): Devise mode from builtin.
>     (aarch64_optab_supported_p): New argument for the mode.
>     * doc/invoke.texi (-mlow-precision-recip-sqrt): Reword description.
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h
> b/gcc/config/aarch64/aarch64-protos.h
> index f22a31c..50f1d24 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -178,6 +178,32 @@ struct cpu_branch_cost
>    const int unpredictable; /* Unpredictable branch or optimizing for speed.
> */
>  };
>
> +/* Control approximate alternatives to certain FP operators. */
> +#define AARCH64_APPROX_MODE(MODE) \
> +  ((MIN_MODE_FLOAT <= (MODE) && (MODE) <= MAX_MODE_FLOAT) \
> +   ? (1 << ((MODE) - MIN_MODE_FLOAT)) \
> +   : (MIN_MODE_VECTOR_FLOAT <= (MODE) && (MODE) <= MAX_MODE_VECTOR_FLOAT) \
> +     ? (1 << ((MODE) - MIN_MODE_VECTOR_FLOAT \
> +              + MAX_MODE_FLOAT - MIN_MODE_FLOAT + 1)) \
> +     : (0))
> +#define AARCH64_APPROX_NONE (0)
> +#define AARCH64_APPROX_SP (AARCH64_APPROX_MODE (SFmode) \
> +                           | AARCH64_APPROX_MODE (V2SFmode) \
> +                           | AARCH64_APPROX_MODE (V4SFmode))
> +#define AARCH64_APPROX_DP (AARCH64_APPROX_MODE (DFmode) \
> +                           | AARCH64_APPROX_MODE (V2DFmode))
> +#define AARCH64_APPROX_DFORM (AARCH64_APPROX_MODE (SFmode) \
> +                              | AARCH64_APPROX_MODE (DFmode) \
> +                              | AARCH64_APPROX_MODE (V2SFmode))
> +#define AARCH64_APPROX_QFORM (AARCH64_APPROX_MODE (V4SFmode) \
> +                              | AARCH64_APPROX_MODE (V2DFmode))
> +#define AARCH64_APPROX_SCALAR (AARCH64_APPROX_MODE (SFmode) \
> +                               | AARCH64_APPROX_MODE (DFmode))
> +#define AARCH64_APPROX_VECTOR (AARCH64_APPROX_MODE (V2SFmode) \
> +                               | AARCH64_APPROX_MODE (V4SFmode) \
> +                               | AARCH64_APPROX_MODE (V2DFmode))
> +#define AARCH64_APPROX_ALL (-1)
> +

Thanks for providing these various subsets, but I think they are unnecessary
for the final submission. From what I can see, only AARCH64_APPROX_ALL and
AARCH64_APPROX_NONE are used. Please remove the rest; they are easy enough to
add back if a subtarget wants them.

>  struct tune_params
>  {
>    const struct cpu_cost_table *insn_extra_cost;
> @@ -218,6 +244,7 @@ struct tune_params
>    } autoprefetcher_model;
>
>    unsigned int extra_tuning_flags;
> +  unsigned int approx_rsqrt_modes;

As we're going to add a few of these, let's follow the approach used for some
of the other costs (e.g. branch costs, vector costs) and bury them in a
structure of their own.

>  };
>
>  #define AARCH64_FUSION_PAIR(x, name) \
> diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def
> b/gcc/config/aarch64/aarch64-tuning-flags.def
> index 7e45a0c..048c2a3 100644
> --- a/gcc/config/aarch64/aarch64-tuning-flags.def
> +++ b/gcc/config/aarch64/aarch64-tuning-flags.def
> @@ -29,5 +29,3 @@
>     AARCH64_TUNE_ to give an enum name. */
>
>  AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
> -AARCH64_EXTRA_TUNING_OPTION ("approx_rsqrt", APPROX_RSQRT)
> -

Did you want to add another way to tune these by command line (not necessary
now, but as a follow-up)? See how instruction fusion is handled by the
-moverride code for an example.

Thanks,
James
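
P.S. To make the "structure of their own" suggestion concrete, here is a rough, standalone sketch of the shape I have in mind. It is not GCC code: every name prefixed with example_ or EX_ is hypothetical, the mode enum is a stand-in for the real machine modes, and the bit assignment is simplified compared with the AARCH64_APPROX_MODE macro in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for GCC's machine modes; only the modes
   mentioned in the patch are listed.  */
enum example_mode
{
  EX_SFmode, EX_DFmode, EX_V2SFmode, EX_V4SFmode, EX_V2DFmode,
  EX_NUM_MODES
};

/* Simplified bit assignment: one bit per mode (assumption, not the
   patch's AARCH64_APPROX_MODE).  */
#define EX_APPROX_MODE(MODE) (1u << (MODE))
#define EX_APPROX_NONE (0u)
#define EX_APPROX_ALL (~0u)

/* Hypothetical structure grouping the approximation tunings, so that
   later additions do not each need a new tune_params member.  */
struct example_approx_modes
{
  unsigned int recip_sqrt; /* Modes allowed to use approximate rsqrt.  */
};

/* Cut-down stand-in for tune_params, pointing at the new structure in
   the same way the branch and vector costs are referenced.  */
struct example_tune_params
{
  unsigned int extra_tuning_flags;
  const struct example_approx_modes *approx_modes;
};

const struct example_approx_modes example_generic_approx =
{
  EX_APPROX_NONE /* recip_sqrt  */
};

const struct example_approx_modes example_all_approx =
{
  EX_APPROX_ALL /* recip_sqrt  */
};

/* Return true if TUNE enables the approximate rsqrt for MODE.  */
bool
example_use_rsqrt_p (const struct example_tune_params *tune,
                     enum example_mode mode)
{
  assert (mode < EX_NUM_MODES);
  return (tune->approx_modes->recip_sqrt & EX_APPROX_MODE (mode)) != 0;
}
```

With this layout, each core's tuning table would only need to point at one of a small number of shared approximation structures, and a new approximation kind becomes a new field rather than a new tune_params member.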