Re: [PATCH, ARM] Improve robustness of -mslow-flash-data

Thomas Preudhomme Tue, 11 Dec 2018 08:09:53 -0800

Hi Kyrill,

I've tested on armeb-none-eabi with -mslow-flash-data for both
-mfloat-abi=hard and -mfloat-abi=soft. Neither shows any regression and
the former shows some new PASSes.


Regarding the part you are hesitant about, the code was taken from
aarch64_reinterpret_float_as_int in config/aarch64/aarch64.c. I'm not
too keen on splitting the patch unless it's just for review (i.e. still
committed as one) since the changes really go together. The tighter
predicates and constraints prevent the normal patterns from matching
when -mslow-flash-data is in effect, while the new splitters and
expanders deal with the loads under those circumstances.
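
For reference, the word-order handling in question can be sketched in
plain C. This is only an illustration, not the code from the patch:
combine_words and float_bits are hypothetical names, the two input words
are assumed to arrive in target memory order (which is how I read the
buf[] handling in the quoted splitters), and the host float is assumed
to be IEEE binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not a GCC function: build the 64-bit integer image
   of a double from two 32-bit words given in target memory order.
   On a big-endian target the first word in memory is the high half.  */
static uint64_t
combine_words (const uint32_t words[2], int big_endian)
{
  int order = big_endian ? 1 : 0;
  uint64_t ival = words[order];               /* Low 32 bits.  */
  ival |= (uint64_t) words[1 - order] << 32;  /* High 32 bits.  */
  return ival;
}

/* Hypothetical helper: bit pattern of a host float (assumes IEEE
   binary32 and a 4-byte float).  */
static uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Self-check: 1.0 as a double is 0x3FF0000000000000 whichever way the
   two words are laid out in memory.  */
static void
self_check (void)
{
  const uint32_t le_words[2] = { 0x00000000u, 0x3FF00000u };
  const uint32_t be_words[2] = { 0x3FF00000u, 0x00000000u };
  assert (combine_words (le_words, 0) == UINT64_C (0x3FF0000000000000));
  assert (combine_words (be_words, 1) == UINT64_C (0x3FF0000000000000));
  /* The two 16-bit halves are what a MOVW / MOVT pair would load.  */
  assert ((float_bits (1.0f) & 0xFFFFu) == 0x0000u);
  assert ((float_bits (1.0f) >> 16) == 0x3F80u);
}
```

Both memory layouts give the same 64-bit constant in this sketch; how
that constant then maps onto the register pair is a separate question,
which the armeb run is meant to exercise.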

Best regards,

Thomas
On Fri, 30 Nov 2018 at 14:11, Kyrill Tkachov
<kyrylo.tkac...@foss.arm.com> wrote:
>
> Hi Thomas,
>
> On 19/11/18 17:56, Thomas Preudhomme wrote:
> > Hi,
> >
> > Current code to handle -mslow-flash-data in machine description files
> > suffers from a number of issues which this patch fixes:
> >
> > 1) The insn_and_split patterns in vfp.md that load a generic
> > floating-point constant via a GPR first and then move it to a VFP
> > register are guarded by !reload_completed, which is explicitly
> > forbidden by the GCC internals documentation, section 17.2 point 3;
> >
> > 2) A number of testcases in the testsuite ICE under -mslow-flash-data
> > when targeting the hardfloat ABI [1];
> >
> > 3) Instructions performing loads from the literal pool are not disabled.
> >
> > These problems are addressed by 2 separate actions:
> >
> > 1) Making the splitters take a clobber and changing the expanders
> > accordingly to generate a mov with clobber in cases where a literal
> > pool would be used. The splitter can thus be enabled after reload since
> > it does not call gen_reg_rtx anymore;
> >
> > 2) Adding new predicates and constraints to disable literal pool loads
> > in existing instructions when -mslow-flash-data is in effect.
> >
>
> Please split these into two separate patches so we can more clearly see
> which changes address which problem.
>
> > The patch also reworks the splitter for DFmode slightly to generate an
> > intermediate DI load instead of 2 intermediate SI loads, thus relying on
> > the existing DI splitters instead of redoing their job. Finally, the
> > patch adds the missing arm_fp_ok effective-target requirement to some
> > of the slow-flash-data testcases.
> >
> > [1]
> > c-c++-common/Wunused-var-3.c
> > gcc.c-torture/compile/pr72771.c
> > gcc.c-torture/compile/vector-5.c
> > gcc.c-torture/compile/vector-6.c
> > gcc.c-torture/execute/20030914-1.c
> > gcc.c-torture/execute/20050316-1.c
> > gcc.c-torture/execute/pr59643.c
> > gcc.dg/builtin-tgmath-1.c
> > gcc.dg/debug/pr55730.c
> > gcc.dg/graphite/interchange-7.c
> > gcc.dg/pr56890-2.c
> > gcc.dg/pr68474.c
> > gcc.dg/pr80286.c
> > gcc.dg/torture/pr35227.c
> > gcc.dg/torture/pr65077.c
> > gcc.dg/torture/pr86363.c
> > g++.dg/torture/pr81112.C
> > g++.dg/torture/pr82985.C
> > g++.dg/warn/Wunused-var-7.C
> > and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
> > special_functions/*_ellint_* directories.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14  Thomas Preud'homme <thomas.preudho...@arm.com>
> >
> >         * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
> >         source is a constant that would be loaded by literal pool.
> >         (movsf expander): Generate a no_literal_pool_sf_immediate insn if
> >         -mslow-flash-data is present, targeting hardfloat ABI and source is
> >         a float constant that cannot be loaded via vmov.
> >         (movdf expander): Likewise but generate a
> >         no_literal_pool_df_immediate insn.
> >         (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
> >         float constant that would be loaded by literal pool.
> >         (softfloat constant movsf splitter): Splitter for the above case.
> >         (movdf_soft_insn): Split if -mslow-flash-data and source is a float
> >         constant that would be loaded by literal pool.
> >         (softfloat constant movdf splitter): Splitter for the above case.
> >         * config/arm/constraints.md (Pz): Document existing constraint.
> >         (Ha): Define constraint.
> >         (Tu): Likewise.
> >         * config/arm/predicates.md (hard_sf_operand): New predicate.
> >         (hard_df_operand): Likewise.
> >         * config/arm/thumb2.md (thumb2_movsi_insn): Split if
> >         -mslow-flash-data and constant would be loaded by literal pool.
> >         * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable
> >         constant load in VFP register.
> >         (movdi_vfp): Likewise.
> >         (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
> >         prevent match for a constant load if -mslow-flash-data and constant
> >         cannot be loaded via vmov.  Adapt constraint accordingly by
> >         using Ha instead of E for generic floating-point constant load.
> >         (thumb2_movdf_vfp): Likewise using hard_df_operand predicate
> >         instead.
> >         (no_literal_pool_df_immediate): Add a clobber to use as the
> >         intermediate general purpose register and also enable it after
> >         reload but disable it if constant is a valid FP constant.  Add
> >         constraints and generate a DI intermediate load rather than 2 SI
> >         loads.
> >         (no_literal_pool_sf_immediate): Add a clobber to use as the
> >         intermediate general purpose register and also enable it after
> >         reload.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-14  Thomas Preud'homme <thomas.preudho...@arm.com>
> >
> >         * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
> >         effective target.
> >         * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> >         * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> >         * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> >
> > Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
> > softfloat and hardfloat ABI which showed no regression and some
> > FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
> > regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
> > code generation didn't change.
> >
> > Is this ok for stage3?
> >
> > Best regards,
> >
> > Thomas
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index a773518cefaf8451e77fead9e072ee8ef39f1eb8..a08298bbb9f93fc132aa64a206fad64dcda9ed65 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -5831,6 +5831,11 @@
>       case 1:
>       case 2:
>         return \"#\";
> +    case 3:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      /* Fall through.  */
>       default:
>         return output_move_double (operands, true, NULL);
>       }
> @@ -6939,6 +6944,20 @@
>              operands[1] = force_reg (SFmode, operands[1]);
>           }
>       }
> +
> +  /* Cannot load it directly, generate a load with clobber so that it can be
> +     loaded via GPR with MOV / MOVT.  */
> +  if (arm_disable_literal_pool
> +      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
> +      && CONST_DOUBLE_P (operands[1])
> +      && TARGET_HARD_FLOAT
> +      && !vfp3_const_double_rtx (operands[1]))
> +    {
> +      rtx clobreg = gen_reg_rtx (SFmode);
> +      emit_insn (gen_no_literal_pool_sf_immediate (operands[0], operands[1],
> +                                                  clobreg));
> +      DONE;
> +    }
>     "
>   )
>
> @@ -6966,10 +6985,19 @@
>      && TARGET_SOFT_FLOAT
>      && (!MEM_P (operands[0])
>          || register_operand (operands[1], SFmode))"
> -  "@
> -   mov%?\\t%0, %1
> -   ldr%?\\t%0, %1\\t%@ float
> -   str%?\\t%1, %0\\t%@ float"
> +{
> +  switch (which_alternative)
> +    {
> +    case 0: return \"mov%?\\t%0, %1\";
> +    case 1:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      return \"ldr%?\\t%0, %1\\t%@ float\";
> +    case 2: return \"str%?\\t%1, %0\\t%@ float\";
> +    default: gcc_unreachable ();
> +    }
> +}
>     [(set_attr "predicable" "yes")
>      (set_attr "type" "mov_reg,load_4,store_4")
>      (set_attr "arm_pool_range" "*,4096,*")
> @@ -6978,6 +7006,21 @@
>      (set_attr "thumb2_neg_pool_range" "*,0,*")]
>   )
>
> +;; Splitter for the above.
> +(define_split
> +  [(set (match_operand:SF 0 "s_register_operand")
> +       (match_operand:SF 1 "const_double_operand"))]
> +  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
> +  [(const_int 0)]
> +{
> +  long buf;
> +  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
> +  rtx cst = gen_int_mode (buf, SImode);
> +  emit_move_insn (simplify_gen_subreg (SImode, operands[0], SFmode, 0), cst);
> +  DONE;
> +}
> +)
> +
>   (define_expand "movdf"
>     [(set (match_operand:DF 0 "general_operand" "")
>         (match_operand:DF 1 "general_operand" ""))]
> @@ -6996,6 +7039,21 @@
>             operands[1] = force_reg (DFmode, operands[1]);
>           }
>       }
> +
> +  /* Cannot load it directly, generate a load with clobber so that it can be
> +     loaded via GPR with MOV / MOVT.  */
> +  if (arm_disable_literal_pool
> +      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
> +      && CONSTANT_P (operands[1])
> +      && TARGET_HARD_FLOAT
> +      && !arm_const_double_rtx (operands[1])
> +      && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1])))
> +    {
> +      rtx clobreg = gen_reg_rtx (DFmode);
> +      emit_insn (gen_no_literal_pool_df_immediate (operands[0], operands[1],
> +                                                  clobreg));
> +      DONE;
> +    }
>     "
>   )
>
> @@ -7055,6 +7113,11 @@
>       case 1:
>       case 2:
>         return \"#\";
> +    case 3:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      /* Fall through.  */
>       default:
>         return output_move_double (operands, true, NULL);
>       }
> @@ -7066,6 +7129,24 @@
>      (set_attr "arm_neg_pool_range" "*,*,*,1004,*")
>      (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
>   )
> +
> +;; Splitter for the above.
> +(define_split
> +  [(set (match_operand:DF 0 "s_register_operand")
> +       (match_operand:DF 1 "const_double_operand"))]
> +  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
> +  [(const_int 0)]
> +{
> +  long buf[2];
> +  int order = BYTES_BIG_ENDIAN ? 1 : 0;
> +  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
> +  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
> +  ival |= (zext_hwi (buf[1 - order], 32) << 32);
> +  rtx cst = gen_int_mode (ival, DImode);
> +  emit_move_insn (simplify_gen_subreg (DImode, operands[0], DFmode, 0), cst);
>
> This is the part I'm most hesitant about, especially for big-endian.
> Did you run any armeb tests that exercise this?
> Would you not want to use gen_highpart_mode/gen_lowpart, which handle all
> the endianness-subreg subtleties for you?
>
>
> Thanks,
> Kyrill
>
>
> +  DONE;
> +}
> +)
>
>
>   ;; load- and store-multiple insns
> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
> index 7576c6fc401fc5ce25245fa2b740db99169ce7ce..657e540816bdd82cddd23059dea2be19df7eb1bb 100644
> --- a/gcc/config/arm/constraints.md
> +++ b/gcc/config/arm/constraints.md
> @@ -31,9 +31,10 @@
>   ;; 'H' was previously used for FPA.
>
>   ;; The following multi-letter normal constraints have been used:
> -;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
> +;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp,
> +;;                      Dz, Tu
>   ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
> -;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
> +;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz
>   ;; in all states: Pf
>
>   ;; The following memory constraints have been used:
> @@ -234,6 +235,12 @@
>    (and (match_code "const_double")
>         (match_test "TARGET_32BIT && arm_const_double_rtx (op)")))
>
> +(define_constraint "Ha"
> +  "@internal In ARM / Thumb-2 a float constant iff literal pools are allowed."
> +  (and (match_code "const_double")
> +       (match_test "satisfies_constraint_E (op)")
> +       (match_test "!arm_disable_literal_pool")))
> +
>   (define_constraint "Dz"
>    "@internal
>     In ARM/Thumb-2 state a vector of constant zeros."
> @@ -351,6 +358,12 @@
>          (match_test "TARGET_32BIT
>                     && vfp3_const_double_for_bits (op) > 0")))
>
> +(define_constraint "Tu"
> +  "@internal In ARM / Thumb-2 an integer constant iff literal pools are
> +   allowed."
> +  (and (match_test "CONSTANT_P (op)")
> +       (match_test "!arm_disable_literal_pool")))
> +
>   (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
>    "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
>
> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
> index 7e198f9bce441c55913615e4c601a760d7e62c20..f73264cc2a07cacec5e7c4e31ce12299a1fadd0b 100644
> --- a/gcc/config/arm/predicates.md
> +++ b/gcc/config/arm/predicates.md
> @@ -456,6 +456,24 @@
>          (and (match_code "reg,subreg,mem")
>             (match_operand 0 "nonimmediate_soft_df_operand"))))
>
> +;; Predicate for thumb2_movsf_vfp.  Compared to general_operand, this
> +;; forbids constant loaded via literal pool iff literal pools are disabled.
> +(define_predicate "hard_sf_operand"
> +  (and (match_operand 0 "general_operand")
> +       (ior (not (match_code "const_double"))
> +           (not (match_test "arm_disable_literal_pool"))
> +           (match_test "satisfies_constraint_Dv (op)"))))
> +
> +;; Predicate for thumb2_movdf_vfp.  Compared to soft_df_operand used in
> +;; movdf_soft_insn, this forbids constant loaded via literal pool iff
> +;; literal pools are disabled.
> +(define_predicate "hard_df_operand"
> +  (and (match_operand 0 "soft_df_operand")
> +       (ior (not (match_code "const_double"))
> +           (not (match_test "arm_disable_literal_pool"))
> +           (match_test "satisfies_constraint_Dy (op)")
> +           (match_test "satisfies_constraint_G (op)"))))
> +
>   (define_special_predicate "load_multiple_operation"
>     (match_code "parallel")
>   {
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index c42670f8643c3286bc5abf537d4fd0483cba68ac..727ceb9b37957efbc7ab8809f57e8825deb6b1df 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -252,16 +252,26 @@
>     "TARGET_THUMB2 && !TARGET_IWMMXT && !TARGET_HARD_FLOAT
>      && (   register_operand (operands[0], SImode)
>          || register_operand (operands[1], SImode))"
> -  "@
> -   mov%?\\t%0, %1
> -   mov%?\\t%0, %1
> -   mov%?\\t%0, %1
> -   mvn%?\\t%0, #%B1
> -   movw%?\\t%0, %1
> -   ldr%?\\t%0, %1
> -   ldr%?\\t%0, %1
> -   str%?\\t%1, %0
> -   str%?\\t%1, %0"
> +{
> +  switch (which_alternative)
> +    {
> +    case 0:
> +    case 1:
> +    case 2:
> +      return \"mov%?\\t%0, %1\";
> +    case 3: return \"mvn%?\\t%0, #%B1\";
> +    case 4: return \"movw%?\\t%0, %1\";
> +    case 5:
> +    case 6:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      return \"ldr%?\\t%0, %1\";
> +    case 7:
> +    case 8: return \"str%?\\t%1, %0\";
> +    default: gcc_unreachable ();
> +    }
> +}
>     [(set_attr "type" "mov_reg,mov_imm,mov_imm,mvn_imm,mov_imm,load_4,load_4,store_4,store_4")
>      (set_attr "length" "2,4,2,4,4,4,4,4,4")
>      (set_attr "predicable" "yes")
> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
> index 611ebe2d83698e3129df6c55a03e4f5f33c891e7..f3d4f30cb53d82e2ffd2c4fcaad2cc873d97c24b 100644
> --- a/gcc/config/arm/vfp.md
> +++ b/gcc/config/arm/vfp.md
> @@ -259,7 +259,7 @@
>   ;; arm_restrict_it.
>   (define_insn "*thumb2_movsi_vfp"
>     [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r, l,*hk,m, *m,*t, r,*t,*t,  *Uv")
> -       (match_operand:SI 1 "general_operand"      "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*Uvi,*t"))]
> +       (match_operand:SI 1 "general_operand"      "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*UvTu,*t"))]
>     "TARGET_THUMB2 && TARGET_HARD_FLOAT
>      && (   s_register_operand (operands[0], SImode)
>          || s_register_operand (operands[1], SImode))"
> @@ -276,6 +276,9 @@
>         return \"movw%?\\t%0, %1\";
>       case 5:
>       case 6:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
>         return \"ldr%?\\t%0, %1\";
>       case 7:
>       case 8:
> @@ -305,7 +308,7 @@
>
>   (define_insn "*movdi_vfp"
>     [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
> -       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
> +       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,UvTu,w"))]
>     "TARGET_32BIT && TARGET_HARD_FLOAT
>      && (   register_operand (operands[0], DImode)
>          || register_operand (operands[1], DImode))
> @@ -321,6 +324,10 @@
>         return \"#\";
>       case 4:
>       case 5:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      /* Fall through.  */
>       case 6:
>         return output_move_double (operands, true, NULL);
>       case 7:
> @@ -587,7 +594,7 @@
>
>   (define_insn "*thumb2_movsf_vfp"
>     [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
> -       (match_operand:SF 1 "general_operand"      " ?r,t,Dv,UvE,t, mE,r,t,r"))]
> +       (match_operand:SF 1 "hard_sf_operand"      " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
>     "TARGET_THUMB2 && TARGET_HARD_FLOAT
>      && (   s_register_operand (operands[0], SFmode)
>          || s_register_operand (operands[1], SFmode))"
> @@ -676,7 +683,7 @@
>
>   (define_insn "*thumb2_movdf_vfp"
>     [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
> -       (match_operand:DF 1 "soft_df_operand"              " ?r,w,Dy,G,UvF,w, mF,r, w,r"))]
> +       (match_operand:DF 1 "hard_df_operand"              " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
>     "TARGET_THUMB2 && TARGET_HARD_FLOAT
>      && (   register_operand (operands[0], DFmode)
>          || register_operand (operands[1], DFmode))"
> @@ -1983,39 +1990,50 @@
>   ;; Support for xD (single precision only) variants.
>   ;; fmrrs, fmsrr
>
> -;; Split an immediate DF move to two immediate SI moves.
> +;; Load a DF immediate via GPR (where combinations of MOV and MOVT can be used)
> +;; and then move it into a VFP register.
>   (define_insn_and_split "no_literal_pool_df_immediate"
> -  [(set (match_operand:DF 0 "s_register_operand" "")
> -       (match_operand:DF 1 "const_double_operand" ""))]
> -  "TARGET_THUMB2 && arm_disable_literal_pool
> -  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
> -       && vfp3_const_double_rtx (operands[1]))"
> +  [(set (match_operand:DF 0 "s_register_operand" "=w")
> +       (match_operand:DF 1 "const_double_operand" "F"))
> +   (clobber (match_operand:DF 2 "s_register_operand" "=r"))]
> +  "arm_disable_literal_pool
> +   && TARGET_HARD_FLOAT
> +   && !arm_const_double_rtx (operands[1])
> +   && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1]))"
>     "#"
> -  "&& !reload_completed"
> -  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
> -   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
> -   (set (match_dup 0) (match_dup 1))]
> -  "
> +  ""
> +  [(const_int 0)]
> +{
>     long buf[2];
> +  int order = BYTES_BIG_ENDIAN ? 1 : 0;
>     real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
> -  operands[2] = GEN_INT ((int) buf[0]);
> -  operands[3] = GEN_INT ((int) buf[1]);
> -  operands[1] = gen_reg_rtx (DFmode);
> -  ")
> +  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
> +  ival |= (zext_hwi (buf[1 - order], 32) << 32);
> +  rtx cst = gen_int_mode (ival, DImode);
> +  emit_move_insn (simplify_gen_subreg (DImode, operands[2], DFmode, 0), cst);
> +  emit_move_insn (operands[0], operands[2]);
> +  DONE;
> +}
> +)
>
> -;; Split an immediate SF move to one immediate SI move.
> +;; Load a SF immediate via GPR (where combinations of MOV and MOVT can be used)
> +;; and then move it into a VFP register.
>   (define_insn_and_split "no_literal_pool_sf_immediate"
> -  [(set (match_operand:SF 0 "s_register_operand" "")
> -       (match_operand:SF 1 "const_double_operand" ""))]
> -  "TARGET_THUMB2 && arm_disable_literal_pool
> -  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
> +  [(set (match_operand:SF 0 "s_register_operand" "=t")
> +       (match_operand:SF 1 "const_double_operand" "E"))
> +   (clobber (match_operand:SF 2 "s_register_operand" "=r"))]
> +  "arm_disable_literal_pool
> +   && TARGET_HARD_FLOAT
> +   && !vfp3_const_double_rtx (operands[1])"
>     "#"
> -  "&& !reload_completed"
> -  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
> -   (set (match_dup 0) (match_dup 1))]
> -  "
> +  ""
> +  [(const_int 0)]
> +{
>     long buf;
>     real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
> -  operands[2] = GEN_INT ((int) buf);
> -  operands[1] = gen_reg_rtx (SFmode);
> -  ")
> +  rtx cst = gen_int_mode (buf, SImode);
> +  emit_move_insn (simplify_gen_subreg (SImode, operands[2], SFmode, 0), cst);
> +  emit_move_insn (operands[0], operands[2]);
> +  DONE;
> +}
> +)
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
> index 90bd44e27e5c53d34f2816f4d6320acbc1dc709b..231243759cfe486c390ca27f10bd06177f60bd43 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
> index 5d9cd9c4df28837b81b2de48c25d38cdf2c15999..27e72ec20863866acdc5e7fea632bc6880678dfd 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
> index 0eeddd5e6ec1f42a96fc6220277f9ecb7cad44f5..8dbe87a1e68d5eb2edfd8259948988fbe0658ced 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
> index 7d52f3801b6d4b62b27833871ac830d6d077894d..b98eb7624e42b5a7f4a11c604c7d2826339bcfd5 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
>
>
>
