Christophe Lyon via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > @@ -8464,10 +8464,18 @@ aarch64_gen_storewb_pair (machine_mode mode, rtx > base, rtx reg, rtx reg2, > return gen_storewb_pairdf_di (base, base, reg, reg2, > GEN_INT (-adjustment), > GEN_INT (UNITS_PER_WORD - adjustment)); > + case E_DDmode: > + return gen_storewb_pairdd_di (base, base, reg, reg2, > + GEN_INT (-adjustment), > + GEN_INT (UNITS_PER_WORD - adjustment)); > case E_TFmode: > return gen_storewb_pairtf_di (base, base, reg, reg2, > GEN_INT (-adjustment), > GEN_INT (UNITS_PER_VREG - adjustment)); > + case E_TDmode: > + return gen_storewb_pairtd_di (base, base, reg, reg2, > + GEN_INT (-adjustment), > + GEN_INT (UNITS_PER_VREG - adjustment)); > default: > gcc_unreachable (); > } > @@ -8510,9 +8518,15 @@ aarch64_gen_loadwb_pair (machine_mode mode, rtx base, > rtx reg, rtx reg2, > case E_DFmode: > return gen_loadwb_pairdf_di (base, base, reg, reg2, GEN_INT > (adjustment), > GEN_INT (UNITS_PER_WORD)); > + case E_DDmode: > + return gen_loadwb_pairdd_di (base, base, reg, reg2, GEN_INT > (adjustment), > + GEN_INT (UNITS_PER_WORD)); > case E_TFmode: > return gen_loadwb_pairtf_di (base, base, reg, reg2, GEN_INT > (adjustment), > GEN_INT (UNITS_PER_VREG)); > + case E_TDmode: > + return gen_loadwb_pairtd_di (base, base, reg, reg2, GEN_INT > (adjustment), > + GEN_INT (UNITS_PER_VREG)); > default: > gcc_unreachable (); > }
Are these changes needed? I would only have expected them to be used for stack pushes and pops. > @@ -8561,9 +8575,15 @@ aarch64_gen_store_pair (machine_mode mode, rtx mem1, > rtx reg1, rtx mem2, > case E_DFmode: > return gen_store_pair_dw_dfdf (mem1, reg1, mem2, reg2); > > + case E_DDmode: > + return gen_store_pair_dw_dddd (mem1, reg1, mem2, reg2); > + > case E_TFmode: > return gen_store_pair_dw_tftf (mem1, reg1, mem2, reg2); > > + case E_TDmode: > + return gen_store_pair_dw_tdtd (mem1, reg1, mem2, reg2); > + > case E_V4SImode: > return gen_vec_store_pairv4siv4si (mem1, reg1, mem2, reg2); > > @@ -8590,9 +8610,15 @@ aarch64_gen_load_pair (machine_mode mode, rtx reg1, > rtx mem1, rtx reg2, > case E_DFmode: > return gen_load_pair_dw_dfdf (reg1, mem1, reg2, mem2); > > + case E_DDmode: > + return gen_load_pair_dw_dddd (reg1, mem1, reg2, mem2); > + > case E_TFmode: > return gen_load_pair_dw_tftf (reg1, mem1, reg2, mem2); > > + case E_TDmode: > + return gen_load_pair_dw_tdtd (reg1, mem1, reg2, mem2); > + > case E_V4SImode: > return gen_load_pairv4siv4si (reg1, mem1, reg2, mem2); Same idea here, in that I think these would only be used for prologue/epilogue and block operations. > @@ -20362,7 +20407,8 @@ aarch64_vfp_is_call_or_return_candidate (machine_mode > mode, > machine_mode new_mode = VOIDmode; > bool composite_p = aarch64_composite_type_p (type, mode); > > - if ((!composite_p && GET_MODE_CLASS (mode) == MODE_FLOAT) > + if ((!composite_p && ((GET_MODE_CLASS (mode) == MODE_FLOAT) > + || GET_MODE_CLASS (mode) == MODE_DECIMAL_FLOAT)) Formatting nit: doubled brackets. Now that the condition is spread over multiple lines, the && should start a new line: if ((!composite_p && (GET_MODE_CLASS (mode) == MODE_FLOAT) || GET_MODE_CLASS (mode) == MODE_DECIMAL_FLOAT) > -;; Iterator for all scalar floating point modes (SF, DF and TF) > -(define_mode_iterator GPF_TF [SF DF TF]) > +;; Iterator for all scalar floating point modes (SF, DF, TF SD, DD, and TD) > +(define_mode_iterator GPF_TF [SF DF TF SD DD TD]) Missing comma after “TF”. The patch has some changes to the rtx costs for constants. What happens with aarch64_reinterpret_float_as_int? (Genuine question.) Do dfp constants get through and do they get converted to an integer? If so, I guess we need to handle DDmode like DFmode there. Related, aarch64_legitimate_constant_p has: /* Support CSE and rematerialization of common constants. */ if (CONST_INT_P (x) || (CONST_DOUBLE_P (x) && GET_MODE_CLASS (mode) == MODE_FLOAT)) return true; which looks it would exclude DFP constants. Perhaps we should simply remove the MODE_FLOAT condition, since on AArch64 all CONST_DOUBLEs are/must be FP constants. The patch changes the scalar handling in aapcs_vfp_sub_candidate, but not the complex handling. Is that deliberate? I wes wondering whether it would be worth adding some abstraction, since the same checks appear multiple times. But the only things I could think of would have made things worse, not better, so I agree open-coding the mode checks is the way to go. The other parts of the patch look good. Thanks, Richard