On Mon, 10 Feb 2020 at 17:45, Richard Earnshaw (lists)
<richard.earns...@arm.com> wrote:
>
> On 10/02/2020 09:27, Christophe Lyon wrote:
> > On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists)
> > <richard.earns...@arm.com> wrote:
> >>
> >> On 07/02/2020 16:43, Christophe Lyon wrote:
> >>> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists)
> >>> <richard.earns...@arm.com> wrote:
> >>>>
> >>>> On 07/02/2020 13:19, Christophe Lyon wrote:
> >>>>> When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
> >>>>> for cortex-m0, I noticed that some testcases were failing because we
> >>>>> still generate "ldr rX, .LCY", which is what we want to avoid with
> >>>>> -mpure-code. This is latent since a recent improvement in fwprop
> >>>>> (PR88833).
> >>>>>
> >>>>> In this patch I change the thumb1_movsi_insn pattern so that it emits
> >>>>> the desired instruction sequence when arm_disable_literal_pool is set.
> >>>>>
> >>>>> I tried to add a define_split instead, but couldn't make it work: the
> >>>>> compiler then complains it cannot split the instruction, while my new
> >>>>> define_split accepts the same operand types as thumb1_movsi_insn:
> >>>>>
> >>>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: could not 
> >>>>> split insn
> >>>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342])
> >>>>>            (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 
> >>>>> {*thumb1_movsi_insn}
> >>>>>         (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") [flags 0x2])
> >>>>>            (nil)))
> >>>>> during RTL pass: final
> >>>>>
> >>>>> (define_split
> >>>>>      [(set (match_operand:SI 0 "register_operand" "")
> >>>>>            (match_operand:SI 1 "general_operand" ""))]
> >>>>>      "TARGET_THUMB1
> >>>>>       && arm_disable_literal_pool
> >>>>>       && GET_CODE (operands[1]) == SYMBOL_REF"
> >>>>>      [(clobber (const_int 0))]
> >>>>>      "
> >>>>>        gen_thumb1_movsi_symbol_ref(operands[0], operands[1]);
> >>>>>        DONE;
> >>>>>      "
> >>>>> )
> >>>>> and I put this in thumb1_movsi_insn:
> >>>>> if (GET_CODE (operands[1]) == SYMBOL_REF && arm_disable_literal_pool)
> >>>>>      {
> >>>>>        return \"#\";
> >>>>>      }
> >>>>>      return \"ldr\\t%0, %1\";
> >>>>>
> >>>>> 2020-02-07  Christophe Lyon  <christophe.l...@linaro.org>
> >>>>>
> >>>>>            * config/arm/thumb1.md (thumb1_movsi_insn): Fix ldr 
> >>>>> alternative to
> >>>>>            work with -mpure-code.
> >>>>>
> >>>>
> >>>> +    case 0:
> >>>> +    case 1:
> >>>> +      return \"movs    %0, %1\";
> >>>> +    case 2:
> >>>> +      return \"movw    %0, %1\";
> >>>>
> >>>> This is OK, but please replace the hard tab in the strings for MOVS/MOVW
> >>>> with \\t.
> >>>>
> >>>
> >>> OK that was merely a cut & paste from the existing code.
> >>>
> >>> I'm concerned that the length attribute is becoming wrong with my
> >>> patch, isn't this a problem?
> >>>
> >>
> >> Potentially yes.  The branch range code needs this to handle overly long
> >> jumps correctly.
> >>
> >
> > Do you mean that the probability of problems due to that shortcoming
> > is low enough that I can commit my patch?
>
> It's hard to say that.  The number of instructions you generate for this
> is significant, so it likely will throw off the calculations and
> somebody will get unlucky.  On the other hand, I don't think we should
> pessimize code for the non-pure-code variants by inflating the size for
> this unconditionally.
>
> It seems there are two ways to fix this.
>
> 1) create a new alternative for the pure-code variant with its own
> length attribute, then only enable that for the case you need.  That
> would also allow you to go back to the templated asm format.
> 2) use a level of indirection to calculate the length - I haven't tried
> this, but it would be something like:
>
>   - create a new attribute - lets say default_length
>   - rename length for this pattern to default_length
>   - create another new attribute - lets say purecode_length, add lengths
> for each alternative but adjusted for the purecode variant.
>   - make the length attribute for this pattern select based on the
> disable_literal_pool variable between the default_length and
> purecode_length attributes.
>

Hi Richard,

Thanks for the suggestions.  I've implemented option (1) above, does
it match what you had in mind?

Tested on arm-eabi, with -mpure-code forced, no regression. Manually
checked that the expected sequence is generated with
-fdisable-rtl-fwprop2.

Thanks,

Christophe

> R.
>
> >
> > Thanks,
> >
> > Christophe
> >
> >> R.
> >>
> >>> Thanks,
> >>>
> >>> Christophe
> >>>
> >>>> R.
> >>
>
[ARM] Fix -mpure-code for v6m

When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
for cortex-m0, I noticed that some testcases were failing because we
still generate "ldr rX, .LCY", which is what we want to avoid with
-mpure-code. This is latent since a recent improvement in fwprop
(PR88833).

In this patch I change the thumb1_movsi_insn pattern so that it emits
the desired instruction sequence when arm_disable_literal_pool is set.

To achieve that, I introduce a new required_for_purecode attribute to
enable the corresponding alternative in thumb1_movsi_insn and take the
actual instruction sequence length into account.

gcc/ChangeLog:

2020-02-13  Christophe Lyon  <christophe.l...@linaro.org>

        * config/arm/arm.md (required_for_purecode): New attribute.
        (enabled): Handle required_for_purecode.
        * config/arm/thumb1.md (thumb1_movsi_insn): Add alternative to
        work with -mpure-code.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index f89a2d4..aa8c34d 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -97,6 +97,11 @@
 ; an IT block in their expansion which is not a short IT.
 (define_attr "enabled_for_short_it" "no,yes" (const_string "yes"))
 
+; Mark an instruction sequence as the required way of loading a
+; constant when -mpure-code is enabled (which implies
+; arm_disable_literal_pool)
+(define_attr "required_for_purecode" "no,yes" (const_string "no"))
+
 ;; Operand number of an input operand that is shifted.  Zero if the
 ;; given instruction does not shift one of its input operands.
 (define_attr "shift" "" (const_int 0))
@@ -230,6 +235,10 @@
               (match_test "arm_restrict_it"))
          (const_string "no")
 
+         (and (eq_attr "required_for_purecode" "yes")
+              (not (match_test "arm_disable_literal_pool")))
+         (const_string "no")
+
          (eq_attr "arch_enabled" "no")
          (const_string "no")]
         (const_string "yes")))
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 613cf9c..2486163 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -691,8 +691,8 @@
 )
 
 (define_insn "*thumb1_movsi_insn"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=l,l,r,l,l,l,>,l, 
m,*l*h*k")
-       (match_operand:SI 1 "general_operand"      "l, 
I,j,J,K,>,l,mi,l,*l*h*k"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=l,l,r,l,l,l,>,l, l, 
m,*l*h*k")
+       (match_operand:SI 1 "general_operand"      "l, I,j,J,K,>,l,i, 
mi,l,*l*h*k"))]
   "TARGET_THUMB1
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
@@ -704,14 +704,16 @@
    #
    ldmia\\t%1, {%0}
    stmia\\t%0, {%1}
+   movs\\t%0, #:upper8_15:%1; lsls\\t%0, #8; adds\\t%0, #:upper0_7:%1; 
lsls\\t%0, #8; adds\\t%0, #:lower8_15:%1; lsls\\t%0, #8; adds\\t%0, 
#:lower0_7:%1
    ldr\\t%0, %1
    str\\t%1, %0
    mov\\t%0, %1"
-  [(set_attr "length" "2,2,4,4,4,2,2,2,2,2")
-   (set_attr "type" 
"mov_reg,mov_imm,mov_imm,multiple,multiple,load_4,store_4,load_4,store_4,mov_reg")
-   (set_attr "pool_range" "*,*,*,*,*,*,*,1018,*,*")
-   (set_attr "arch" "t1,t1,v8mb,t1,t1,t1,t1,t1,t1,t1")
-   (set_attr "conds" 
"set,clob,nocond,*,*,nocond,nocond,nocond,nocond,nocond")])
+  [(set_attr "length" "2,2,4,4,4,2,2,14,2,2,2")
+   (set_attr "type" 
"mov_reg,mov_imm,mov_imm,multiple,multiple,load_4,store_4,alu_sreg,load_4,store_4,mov_reg")
+   (set_attr "pool_range" "*,*,*,*,*,*,*, *,1018,*,*")
+   (set_attr "arch" "t1,t1,v8mb,t1,t1,t1,t1,t1,t1,t1,t1")
+   (set_attr "required_for_purecode" "no,no,no,no,no,no,no,yes,no,no,no")
+   (set_attr "conds" 
"set,clob,nocond,*,*,nocond,nocond,nocond,nocond,nocond,nocond")])
 
 ; Split the load of 64-bit constant into two loads for high and low 32-bit 
parts respectively
 ; to see if we can load them in fewer instructions or fewer cycles.

Reply via email to