On Wed, Jul 26, 2017 at 05:00:05PM +0100, Tamar Christina wrote: > Hi James, > > I have updated the patch and have responded to your question blow. > > Ok for trunk?
Ok, with a few small changes. > > > static bool > > > @@ -5857,12 +5955,6 @@ aarch64_preferred_reload_class (rtx x, > > reg_class_t regclass) > > > return NO_REGS; > > > } > > > > > > - /* If it's an integer immediate that MOVI can't handle, then > > > - FP_REGS is not an option, so we return NO_REGS instead. */ > > > - if (CONST_INT_P (x) && reg_class_subset_p (regclass, FP_REGS) > > > - && !aarch64_simd_imm_scalar_p (x, GET_MODE (x))) > > > - return NO_REGS; > > > - > > > > I don't understand this relaxation could you explain what this achieves and > > why it is safe in this patch? > > Because this should be left up to the pattern to decide what should happen > and not reload. Leaving the check here also means you'll do a reasonably > expensive check twice for each constant > you can emit a move for. > > Removing extra restriction on the constant classes leaves it up to > aarch64_legitimate_constant_p to decide if if the constant can be emitted as > a move or should be forced to memory. > aarch64_legitimate_constant_p also calls aarch64_cannot_force_const_mem. > > The documentation for TARGET_PREFERRED_RELOAD_CLASS also states: > > "One case where TARGET_PREFERRED_RELOAD_CLASS must not return rclass is if x > is a legitimate constant which cannot be loaded into some register class. By > returning NO_REGS you can force x into a memory location. For example, rs6000 > can load immediate values into general-purpose registers, but does not have > an instruction for loading an immediate value into a floating-point register, > so TARGET_PREFERRED_RELOAD_CLASS returns NO_REGS when x is a floating-point > constant. If the constant can't be loaded into any kind of register, code > generation will be better if TARGET_LEGITIMATE_CONSTANT_P makes the constant > illegitimate instead of using TARGET_PREFERRED_RELOAD_CLASS. " > > So it seems that not only did the original constraint not add anything, we > may also generate better code now. Thanks for the explanation. > diff --git a/gcc/config/aarch64/aarch64-protos.h > b/gcc/config/aarch64/aarch64-protos.h > index > e397ff4afa73cfbc7e192fd5686b1beff9bbbadf..fd20576d23cfdc48761f65e41762b2a5a71f3e61 > 100644 > --- a/gcc/config/aarch64/aarch64-protos.h > +++ b/gcc/config/aarch64/aarch64-protos.h > @@ -326,6 +326,8 @@ bool aarch64_emit_approx_sqrt (rtx, rtx, bool); > void aarch64_expand_call (rtx, rtx, bool); > bool aarch64_expand_movmem (rtx *); > bool aarch64_float_const_zero_rtx_p (rtx); > +bool aarch64_float_const_rtx_p (rtx); > +bool aarch64_can_const_movi_rtx_p (rtx x, machine_mode mode); This list should be alphabetical, first by type, then by name. So I'd expect this to fit in just above aarch64_const_vec_all_same_int_p . > bool aarch64_function_arg_regno_p (unsigned); > bool aarch64_fusion_enabled_p (enum aarch64_fusion_pairs); > bool aarch64_gen_movmemqi (rtx *); > @@ -353,7 +355,6 @@ bool aarch64_regno_ok_for_base_p (int, bool); > bool aarch64_regno_ok_for_index_p (int, bool); > bool aarch64_simd_check_vect_par_cnst_half (rtx op, machine_mode mode, > bool high); > -bool aarch64_simd_imm_scalar_p (rtx x, machine_mode mode); > bool aarch64_simd_imm_zero_p (rtx, machine_mode); > bool aarch64_simd_scalar_immediate_valid_for_move (rtx, machine_mode); > bool aarch64_simd_shift_imm_p (rtx, machine_mode, bool); > @@ -488,4 +489,6 @@ std::string aarch64_get_extension_string_for_isa_flags > (unsigned long, > > rtl_opt_pass *make_pass_fma_steering (gcc::context *ctxt); > > +bool aarch64_reinterpret_float_as_int (rtx value, unsigned HOST_WIDE_INT > *fail); This isn't defined in common/config/aarch64-common.c so shouldn't be in the section for functions which will be defined there. It should be in the list above between aarch64_regno_ok_for_index_p and aarch64_simd_check_vect_par_cnst_half. > + > #endif /* GCC_AARCH64_PROTOS_H */ > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index > d753666ef67fc524260c41f36743df3649a0a98a..b1ddd77823e50e63439e497f695f3fad9bd9efc9 > 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -147,6 +147,8 @@ static bool aarch64_builtin_support_vector_misalignment > (machine_mode mode, > const_tree type, > int misalignment, > bool is_packed); > +static machine_mode > +aarch64_simd_container_mode (machine_mode mode, unsigned width); > > /* Major revision number of the ARM Architecture implemented by the target. > */ > unsigned aarch64_architecture_version; > @@ -4723,6 +4725,62 @@ aarch64_legitimize_address_displacement (rtx *disp, > rtx *off, machine_mode mode) > return true; > } > > +/* Return the binary representation of floating point constant VALUE in > INTVAL. > + If the value cannot be converted, return false without setting INTVAL. > + The conversion is done in the given MODE. */ > +bool > +aarch64_reinterpret_float_as_int (rtx value, unsigned HOST_WIDE_INT *intval) > +{ > + machine_mode mode = GET_MODE (value); > + if (GET_CODE (value) != CONST_DOUBLE > + || !SCALAR_FLOAT_MODE_P (mode) > + || GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT) > + return false; > + > + unsigned HOST_WIDE_INT ival = 0; > + > + /* Only support up to DF mode. */ > + gcc_assert (GET_MODE_BITSIZE (mode) <= GET_MODE_BITSIZE (DFmode)); I'm somewhere a bit hypothetical here, but on a machine with a 128-bit HOST_WIDE_INT, you would hit this case and take the assert. Just make this another case in your if statement above: machine_mode mode = GET_MODE (value); if (GET_CODE (value) != CONST_DOUBLE || !SCALAR_FLOAT_MODE_P (mode) || GET_MODE_BITSIZE (mode) > HOST_BITS_PER_WIDE_INT || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (DFmode)) return false; > + long res[2]; > + real_to_target (res, > + CONST_DOUBLE_REAL_VALUE (value), > + REAL_MODE_FORMAT (mode)); > + > + ival = zext_hwi (res[0], 32); > + if (GET_MODE_BITSIZE (mode) == GET_MODE_BITSIZE (DFmode)) > + ival |= (zext_hwi (res[1], 32) << 32); > + > + *intval = ival; > + return true; > +} > + Thanks, James