> > Given review comment already pointed out big-endian issue and patch was
> updated to address it, I would expect reg-test on a big-endian target before
> applying patch, right?

The patch spent 6 months in external review. Given that, I simply forgot to rerun the big-endian tests before the commit, as I did for the rest. The failing tests were all added after this patch was submitted. I'll have a look.

> Thanks,
> bin
> >
> > OK for trunk?
> >
> > Thanks,
> > Tamar
> >
> >
> > gcc/
> > 2017-06-26 Tamar Christina <tamar.christ...@arm.com>
> > Richard Sandiford <richard.sandif...@linaro.org>
> >
> > * config/aarch64/aarch64.md (mov<mode>): Generalize.
> > (*movhf_aarch64, *movsf_aarch64, *movdf_aarch64):
> > Add integer and movi cases.
> > (movi-split-hf-df-sf split, fp16): New.
> > (enabled): Added TARGET_FP_F16INST.
> > * config/aarch64/iterators.md (GPF_HF): New.
> > ________________________________________
> > From: Tamar Christina
> > Sent: Wednesday, June 21, 2017 11:48:33 AM
> > To: James Greenhalgh
> > Cc: GCC Patches; nd; Marcus Shawcroft; Richard Earnshaw
> > Subject: RE: [PATCH][GCC][AArch64] optimize float immediate moves (2 /4)
> - HF/DF/SF mode.
> >
> >> > movi\\t%0.4h, #0
> >> > - mov\\t%0.h[0], %w1
> >> > + fmov\\t%s0, %w1
> >>
> >> Should this not be %h0?
> >
> > The problem is that H registers are only available in ARMv8.2+, I'm
> > not sure what to do about ARMv8.1 given your other feedback Pointing
> > out that the bit patterns between how it's stored in s vs h registers
> > differ.
> >
> >>
> >> > umov\\t%w0, %1.h[0]
> >> > mov\\t%0.h[0], %1.h[0]
> >> > + fmov\\t%s0, %1
> >>
> >> Likewise, and much more important for correctness as it changes the
> >> way the bit pattern ends up in the register (see table C2-1 in
> >> release B.a of the ARM Architecture Reference Manual for ARMv8-A),
> here.
> >>
> >> > + * return aarch64_output_scalar_simd_mov_immediate
> (operands[1],
> >> > + SImode);
> >> > ldr\\t%h0, %1
> >> > str\\t%h1, %0
> >> > ldrh\\t%w0, %1
> >> > strh\\t%w1, %0
> >> > mov\\t%w0, %w1"
> >> > - [(set_attr "type"
> >> "neon_move,neon_from_gp,neon_to_gp,neon_move,\
> >> > - f_loads,f_stores,load1,store1,mov_reg")
> >> > - (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")]
> >> > + "&& can_create_pseudo_p ()
> >> > + && !aarch64_can_const_movi_rtx_p (operands[1], HFmode)
> >> > + && !aarch64_float_const_representable_p (operands[1])
> >> > + && aarch64_float_const_rtx_p (operands[1])"
> >> > + [(const_int 0)]
> >> > + "{
> >> > + unsigned HOST_WIDE_INT ival;
> >> > + if (!aarch64_reinterpret_float_as_int (operands[1], &ival))
> >> > + FAIL;
> >> > +
> >> > + rtx tmp = gen_reg_rtx (SImode);
> >> > + aarch64_expand_mov_immediate (tmp, GEN_INT (ival));
> >> > + tmp = simplify_gen_subreg (HImode, tmp, SImode, 0);
> >> > + emit_move_insn (operands[0], gen_lowpart (HFmode, tmp));
> >> > + DONE;
> >> > + }"
> >> > + [(set_attr "type"
> "neon_move,f_mcr,neon_to_gp,neon_move,fconsts,
> >> \
> >> > + neon_move,f_loads,f_stores,load1,store1,mov_reg")
> >> > + (set_attr "simd" "yes,*,yes,yes,*,yes,*,*,*,*,*")]
> >> > )
> >>
> >> Thanks,
> >> James
> >
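
For anyone following the split quoted above, here is a minimal sketch in plain C (not GCC internals) of the two ideas it relies on: reinterpreting a float constant as its integer bit pattern, and then taking a 16-bit piece of a 32-bit value at byte offset 0. The helper names `float_bits` and `half_at_byte0` are hypothetical and only model the behaviour; the sketch assumes `float` is IEEE-754 binary32. It is meant to show why a byte-offset-0 selection is endian-sensitive, not to claim that this is the cause of the failures discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bit pattern of a float.
   Assumes float is IEEE-754 binary32; memcpy avoids aliasing problems. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical model: lay a 32-bit value out in memory order for the
   given endianness and read the 16-bit value found at byte offset 0,
   using that same endianness. */
static uint16_t half_at_byte0(uint32_t v, int big_endian)
{
    uint8_t b0, b1;

    if (big_endian) {
        /* Memory order is most significant byte first. */
        b0 = (uint8_t)(v >> 24);
        b1 = (uint8_t)(v >> 16);
        return (uint16_t)((b0 << 8) | b1);
    }

    /* Memory order is least significant byte first. */
    b0 = (uint8_t)v;
    b1 = (uint8_t)(v >> 8);
    return (uint16_t)((b1 << 8) | b0);
}
```

With the half-precision pattern for 1.0 (0x3c00) held in the low bits of a 32-bit value, `half_at_byte0 (0x00003c00, 0)` yields 0x3c00 while `half_at_byte0 (0x00003c00, 1)` yields 0, which is the kind of difference a big-endian regression run would be expected to expose.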