On 10/01/17 17:18, Wilco Dijkstra wrote:
> My previous change to the Cortex-A53 scheduler resulted in a 13% regression
> on a proprietary benchmark. This turned out to be due to non-optimal
> scheduling of int to float conversions. This patch separates int to FP
> transfers from int to float conversions based on experiments to determine
> the best schedule. As a result of these tweaks the performance of the
> benchmark improves by 20%.
>
> ChangeLog:
> 2017-01-10  Wilco Dijkstra  <wdijk...@arm.com>
>
>     * config/arm/cortex-a53.md: Add bypasses for
>     cortex_a53_r2f_cvt.
>     (cortex_a53_r2f): Only use for transfers.
>     (cortex_a53_f2r): Likewise.
>     (cortex_a53_r2f_cvt): Add reservation for conversions.
>     (cortex_a53_f2r_cvt): Likewise.
>
OK.

R.

> --
>
> diff --git a/gcc/config/arm/cortex-a53.md b/gcc/config/arm/cortex-a53.md
> index 14822ba0ac0532aaf0dd29cff7a87e32e745cbe8..b367ad403a4a641da34521c17669027b87092737 100644
> --- a/gcc/config/arm/cortex-a53.md
> +++ b/gcc/config/arm/cortex-a53.md
> @@ -252,9 +252,18 @@
>                   "cortex_a53_r2f")
>
>  (define_bypass 1 "cortex_a53_mul,
> -                  cortex_a53_load*"
> +                  cortex_a53_load1,
> +                  cortex_a53_load2"
>                   "cortex_a53_r2f")
>
> +(define_bypass 2 "cortex_a53_alu*"
> +                 "cortex_a53_r2f_cvt")
> +
> +(define_bypass 3 "cortex_a53_mul,
> +                  cortex_a53_load1,
> +                  cortex_a53_load2"
> +                 "cortex_a53_r2f_cvt")
> +
>  ;; Model flag forwarding to branches.
>
>  (define_bypass 0 "cortex_a53_alu*,cortex_a53_shift*"
> @@ -514,16 +523,24 @@
>  ;; Floating-point to/from core transfers.
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>
> -(define_insn_reservation "cortex_a53_r2f" 6
> +(define_insn_reservation "cortex_a53_r2f" 2
>    (and (eq_attr "tune" "cortexa53")
> -       (eq_attr "type" "f_mcr,f_mcrr,f_cvti2f,
> -                        neon_from_gp, neon_from_gp_q"))
> -  "cortex_a53_slot_any,nothing*2,cortex_a53_fp_alu")
> +       (eq_attr "type" "f_mcr,f_mcrr"))
> +  "cortex_a53_slot_any,cortex_a53_fp_alu")
> +
> +(define_insn_reservation "cortex_a53_f2r" 4
> +  (and (eq_attr "tune" "cortexa53")
> +       (eq_attr "type" "f_mrc,f_mrrc"))
> +  "cortex_a53_slot_any,cortex_a53_fp_alu")
> +
> +(define_insn_reservation "cortex_a53_r2f_cvt" 4
> +  (and (eq_attr "tune" "cortexa53")
> +       (eq_attr "type" "f_cvti2f, neon_from_gp, neon_from_gp_q"))
> +  "cortex_a53_slot_any,cortex_a53_fp_alu")
>
> -(define_insn_reservation "cortex_a53_f2r" 6
> +(define_insn_reservation "cortex_a53_f2r_cvt" 5
>    (and (eq_attr "tune" "cortexa53")
> -       (eq_attr "type" "f_mrc,f_mrrc,f_cvtf2i,
> -                        neon_to_gp, neon_to_gp_q"))
> +       (eq_attr "type" "f_cvtf2i, neon_to_gp, neon_to_gp_q"))
>    "cortex_a53_slot_any,cortex_a53_fp_alu")
>
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>
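
For anyone reading the numbers in the diff above, here is a rough Python sketch of how the new reservation latencies and bypasses combine. It is only an illustration, not how genautomata works internally. The table values are copied from the quoted hunks, and only the bypasses that are fully visible there are included. The function name and the producer default latency passed in by the caller are assumptions made for this example, since the producer reservations are not part of the patch.

```python
from fnmatch import fnmatchcase

# Default result latencies of the four reservations after the patch
# (before the patch, cortex_a53_r2f and cortex_a53_f2r were both 6).
RESERVATION_LATENCY = {
    "cortex_a53_r2f": 2,       # int -> FP register transfers
    "cortex_a53_f2r": 4,       # FP -> int register transfers
    "cortex_a53_r2f_cvt": 4,   # int -> float conversions
    "cortex_a53_f2r_cvt": 5,   # float -> int conversions
}

# (latency, producer name patterns, consumer) for the bypasses that are
# fully visible in the first hunk of the diff.
BYPASSES = [
    (1, ["cortex_a53_mul", "cortex_a53_load1", "cortex_a53_load2"],
     "cortex_a53_r2f"),
    (2, ["cortex_a53_alu*"], "cortex_a53_r2f_cvt"),
    (3, ["cortex_a53_mul", "cortex_a53_load1", "cortex_a53_load2"],
     "cortex_a53_r2f_cvt"),
]


def producer_to_consumer_latency(producer, consumer, default_latency):
    """Hypothetical helper: latency from `producer` to `consumer`.

    Returns the bypass latency if one of the tabulated bypasses applies,
    otherwise the producer's default latency supplied by the caller.
    """
    for latency, patterns, target in BYPASSES:
        if target == consumer and any(fnmatchcase(producer, p) for p in patterns):
            return latency
    return default_latency


if __name__ == "__main__":
    # 9 is a made-up producer default, used only to show the fallback path.
    print(producer_to_consumer_latency("cortex_a53_alu", "cortex_a53_r2f_cvt", 9))
    print(producer_to_consumer_latency("cortex_a53_alu", "cortex_a53_f2r", 9))
```

The sketch shows the point of the split: a conversion fed by an ALU, multiply or load result now gets its own bypass latency (2 or 3) instead of sharing the transfer bypasses.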