On 8/10/23 02:50, Wilco Dijkstra wrote:
Hi Richard,
> Why would HWCAP_USCAT not be set by the kernel?
> Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.
> Answering my own question, N1 does not officially have FEAT_LSE2.
It doesn't indeed. However most cores support atomic 128-bit
On 8/9/23 19:11, Richard Henderson wrote:
On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned
On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!
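For context, a self-contained sketch of the selector under discussion
(assumptions: the MIDR_EL1 read is gated on HWCAP_CPUID so the kernel
can emulate it, Neoverse N1 is implementer 0x41 part 0xd0c, and the
hwcap bit values are the Linux uapi ones; the real patch may differ):

#include <stdbool.h>
#include <sys/auxv.h>

#ifndef HWCAP_USCAT
# define HWCAP_USCAT (1 << 25)  /* FEAT_LSE2: single-copy atomic 16-byte ldp/stp.  */
#endif
#ifndef HWCAP_CPUID
# define HWCAP_CPUID (1 << 11)  /* Kernel emulates EL0 reads of MIDR_EL1.  */
#endif

#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)

static bool
have_lse2 (unsigned long hwcap)
{
  /* Trust the kernel's FEAT_LSE2 bit when it is set.  */
  if (hwcap & HWCAP_USCAT)
    return true;
  /* Otherwise allow known-good cores, but only probe MIDR_EL1 when
     the kernel is willing to emulate the read.  */
  if (!(hwcap & HWCAP_CPUID))
    return false;
  unsigned long midr;
  __asm__ ("mrs %0, midr_el1" : "=r" (midr));
  /* Arm Ltd. ('A') Neoverse N1 (0xd0c) has atomic 128-bit ldp/stp.  */
  return MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM (midr) == 0xd0c;
}

A caller would pass getauxval (AT_HWCAP), which is also what an ifunc
resolver receives as its first argument on aarch64.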
2022-04-19 Richard Henderson
* MAINTAINERS: Update my email address.
---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 30f81b3dd52..15973503722 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -53,7 +53,7 @@ aarch64 port
On 4/9/20 2:52 PM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Apr 02, 2020 at 11:53:47AM -0700, Richard Henderson wrote:
>> The rtl description of signed/unsigned overflow from subtract
>> was fine, as far as it goes -- we have CC_Cmode and CC_Vmode
>> that indicate
While cmp (extended register) and cmp (immediate) interpret register 31
as SP, cmp (shifted register) interprets it as XZR. So we can perform
cmp xzr, x0. For ccmp, the first operand is a plain register, where
register 31 likewise means XZR.
* config/aarch64/aarch64.md (cmp<mode>): For operand 0, use
aarch64_reg_or_zero. Shuffle reg/reg to last alternative
and a
* config/aarch64/aarch64-modes.def (CC_NV): New.
* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Expand
all of the comparisons for TImode, not just NE.
(aarch64_select_cc_mode): Recognize cmp_carryin.
(aarch64_get_condition_code_1): Handle CC_NVmode.
We are about to use !C in more contexts than add-with-carry.
Choose a more generic name.
* config/aarch64/aarch64-modes.def (CC_NOTC): Rename from CC_ADC.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Update.
(aarch64_get_condition_code_1): Likewise.
* config/aarc
Return the entire comparison expression, not just the cc_reg.
This will allow the routine to adjust the comparison code as
needed for TImode comparisons.
Note that some users were passing e.g. EQ to aarch64_gen_compare_reg
and then using gen_rtx_NE. Pass the proper code in the first place.
In one place we open-code a special case of this pattern into the
more specific sub<mode>3_compare1_imm, and miss this special case
in other places. Centralize that special case into an expander.
* config/aarch64/aarch64.md (*sub<mode>3_compare1): Rename
from sub<mode>3_compare1.
(sub<mode>3_comp
We have been using CCmode, which is not correct for this case.
Mirror the same code from the arm target.
* config/aarch64/aarch64.c (aarch64_select_cc_mode):
Recognize usub*_carryinC patterns.
* config/aarch64/aarch64.md (usubvti4): Use CC_NOTC.
(usub<mode>3_carryinC): Likewise.
Modify aarch64_expand_subvti into a form that handles all
addition and subtraction, modulo, signed or unsigned overflow.
Use expand_insn to put the operands into the proper form,
and do not force values into register if not required.
* config/aarch64/aarch64.c (aarch64_ti_split): New.
These CC_MODEs are identical, merge them into a more generic name.
* config/arm/arm-modes.def (CC_NOTC): New.
(CC_ADC, CC_B): Remove.
* config/arm/arm.c (arm_select_cc_mode): Update to match.
(arm_gen_dicompare_reg): Likewise.
(maybe_get_arm_condition_code):
The arm target has some improvements over aarch64 for
double-word arithmetic and comparisons.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Check
for swapped operands to CC_Cmode; check for zero_extend to
CC_ADCmode; check for swapped operands to CC_Vmode.
---
gcc/c
Some implementations have a higher cost for the csel insn
(and its specializations) than they do for adc/sbc.
* config/aarch64/aarch64.md (*cstore<mode>_carry): New.
(*cstoresi_carry_uxtw): New.
(*cstore<mode>_borrow): New.
(*cstoresi_borrow_uxtw): New.
(*csinc2<mode>_carry):
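For illustration, the carry-out idiom these patterns target (a
hypothetical function, not taken from the patch):

/* Carry-out of a 64-bit addition.  With a cset pattern for the carry,
   this can become adds + cset cs rather than a csel of 0/1.  */
unsigned long
add_carry (unsigned long a, unsigned long b, unsigned long *carry)
{
  unsigned long sum = a + b;
  *carry = sum < a;  /* Unsigned overflow is exactly the C flag.  */
  return sum;
}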
Rather than duplicating the rather verbose integral test,
pull it out to a predicate.
* config/aarch64/predicates.md (const_dword_umaxp1): New.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Use it.
* config/aarch64/aarch64.md (add*add3_carryinC): Likewise.
(*
haven't put enough thought
into the problem.
r~
Richard Henderson (12):
aarch64: Provide expander for sub<mode>3_compare1
aarch64: Match add<mode>3_carryin expander and insn
aarch64: Add cset, csetm, cinc patterns for carry/borrow
aarch64: Add const_dword_umaxp1
aarch64: Impro
The expander and insn predicates do not match,
which can lead to insn recognition errors.
* config/aarch64/aarch64.md (add<mode>3_carryin):
Use register_operand instead of aarch64_reg_or_zero.
---
gcc/config/aarch64/aarch64.md | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
d
On 4/7/20 4:58 PM, Segher Boessenkool wrote:
>> I wonder if it would be helpful to have
>>
>> (uoverflow_plus x y carry)
>> (soverflow_plus x y carry)
>>
>> etc.
>
> Those have three operands, which is nasty to express.
How so? It's a perfectly natural operation.
> On rs6000 we have the car
On 4/7/20 1:27 PM, Segher Boessenkool wrote:
> On Mon, Apr 06, 2020 at 12:19:42PM +0100, Richard Sandiford wrote:
>> The reason I'm not keen on using special modes for this case is that
>> they'd describe one way in which the result can be used rather than
>> describing what the instruction actuall
On 4/7/20 9:32 AM, Richard Sandiford wrote:
> It's not really reversibility that I'm after (at least not for its
> own sake).
>
> If we had a three-input compare_cc rtx_code that described a comparison
> involving a carry input, we'd certainly be using it here, because that's
> what the instructio
On 4/2/20 11:53 AM, Richard Henderson via Gcc-patches wrote:
> This is attacking case 3 of PR 94174.
>
> In v2, I unify the various subtract-with-borrow and add-with-carry
> patterns that also output flags with unspecs. As suggested by
> Richard Sandiford during review of v1
Return the entire comparison expression, not just the cc_reg.
This will allow the routine to adjust the comparison code as
needed for TImode comparisons.
Note that some users were passing e.g. EQ to aarch64_gen_compare_reg
and then using gen_rtx_NE. Pass the proper code in the first place.
Modify aarch64_expand_subvti into a form that handles all
addition and subtraction, modulo, signed or unsigned overflow.
Use expand_insn to put the operands into the proper form,
and do not force values into register if not required.
* config/aarch64/aarch64.c (aarch64_ti_split): New.
The rtl description of signed/unsigned overflow from subtract
was fine, as far as it goes -- we have CC_Cmode and CC_Vmode
that indicate that only those particular bits are valid.
However, it's not clear how to extend that description to
handle signed comparison, where N == V (GE) and N != V (LT) are
Now that we're using UNSPEC_ADCS instead of rtl, there's
no reason to distinguish CC_ADCmode from CC_Cmode. Both
examine only the C bit. Within uaddvti4, using CC_Cmode
is clearer, since it's the carry-out that's relevant.
* config/aarch64/aarch64-modes.def (CC_ADC): Remove.
* con
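In source terms, uaddvti4 is what __builtin_add_overflow expands to
for unsigned __int128 (sketch; the codegen comment is the expected
shape, not verified output):

_Bool
uaddv128 (unsigned __int128 a, unsigned __int128 b, unsigned __int128 *res)
{
  return __builtin_add_overflow (a, b, res);
}
/* Expected shape: adds/adcs for the two halves, then the overflow
   result is read from the carry, e.g. cset w0, cs.  */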
* config/aarch64/aarch64.md (absti2): New.
---
gcc/config/aarch64/aarch64.md | 29 +
1 file changed, 29 insertions(+)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index cf716f815a1..4a30d4cca93 100644
--- a/gcc/config/aarch64/aarch
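The new absti2 pattern corresponds to 128-bit absolute value at the
source level (sketch; the expansion presumably negates with negs/ngcs
and selects on the sign, but see the patch for the exact sequence):

__int128
abs128 (__int128 x)
{
  return x < 0 ? -x : x;
}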
* config/aarch64/predicates.md (aarch64_reg_or_minus1): New.
* config/aarch64/aarch64.md (add<mode>3_carryin): Use it.
(*add<mode>3_carryin): Likewise.
(*addsi3_carryin_uxtw): Likewise.
---
gcc/config/aarch64/aarch64.md | 26 +++---
gcc/config/aarch64/pre
Similar to UNSPEC_SBCS, we can unify the signed/unsigned overflow
paths by using an unspec.
Accept -1 for the second input by using SBCS.
* config/aarch64/aarch64.md (UNSPEC_ADCS): New.
(addvti4, uaddvti4): Use adddi_carryin_cmp.
(add<mode>3_carryinC): Remove.
(*add<mode>3_car
Use ccmp to perform all TImode comparisons branchless.
* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Expand all of
the comparisons for TImode, not just NE.
* config/aarch64/aarch64.md (cbranchti4, cstoreti4): New.
---
gcc/config/aarch64/aarch64.c | 122 +++
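As a sketch of what the branchless expansion buys, 128-bit equality
needs no branch once ccmp chains the two halves (the asm comment shows
the expected shape, not verified output):

int
eq128 (__int128 a, __int128 b)
{
  return a == b;
}
/* Expected shape:  cmp x0, x2 ; ccmp x1, x3, #0, eq ; cset w0, eq  */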
While cmp (extended register) and cmp (immediate) interpret register 31
as SP, cmp (shifted register) interprets it as XZR. So we can perform
cmp xzr, x0. For ccmp, the first operand is a plain register, where
register 31 likewise means XZR.
* config/aarch64/aarch64.md (cmp<mode>): For operand 0, use
aarch64_reg_or_zero. Shuffle reg/reg to last alternative
and a
This is attacking case 3 of PR 94174.
In v2, I unify the various subtract-with-borrow and add-with-carry
patterns that also output flags with unspecs. As suggested by
Richard Sandiford during review of v1. It does seem cleaner.
r~
Richard Henderson (11):
aarch64: Accept 0 as first
The expander and the insn pattern did not match, leading to
recognition failures in expand.
* config/aarch64/aarch64.md (*add<mode>3_carryin): Accept zeros.
---
gcc/config/aarch64/aarch64.md | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/gcc/config/aarch64/aarch64.
In one place we open-code a special case of this pattern into the
more specific sub<mode>3_compare1_imm, and miss this special case
in other places. Centralize that special case into an expander.
* config/aarch64/aarch64.md (*sub<mode>3_compare1): Rename
from sub<mode>3_compare1.
(sub<mode>3_comp
On 4/1/20 9:28 AM, Richard Sandiford wrote:
> How important is it to describe the flags operation as a compare though?
> Could we instead use an unspec with three inputs, and keep it as :CC?
> That would still allow special-case matching for zero operands.
I'm not sure.
My guess is that the only
On 3/31/20 11:34 AM, Richard Sandiford wrote:
>> +(define_insn "*cmp<mode>3_carryinC"
>> +  [(set (reg:CC CC_REGNUM)
>> +     (compare:CC
>> +       (ANY_EXTEND:<DWI>
>> +         (match_operand:GPI 0 "register_operand" "r"))
>> +       (plus:<DWI>
>> +         (ANY_EXTEND:<DWI>
>> +           (match_operand:GPI 1 "register_
On 3/31/20 9:55 AM, Richard Sandiford wrote:
>> (define_insn "cmp<mode>"
>>   [(set (reg:CC CC_REGNUM)
>> -     (compare:CC (match_operand:GPI 0 "register_operand" "rk,rk,rk")
>> -                 (match_operand:GPI 1 "aarch64_plus_operand" "r,I,J")))]
>> +     (compare:CC (match_operand:GPI 0 "aarch64_re
On 3/22/20 2:55 PM, Segher Boessenkool wrote:
> Maybe this stuff would be simpler (and more obviously correct) if it
> was more explicit CC_REGNUM is a fixed register, and the code would use
> it directly everywhere?
Indeed the biggest issue I have in this patch is what CC_MODE to expose from
the
On 3/22/20 12:30 PM, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Mar 20, 2020 at 07:42:25PM -0700, Richard Henderson via Gcc-patches
> wrote:
>> Duplicate all usub_*_carryinC, but use xzr for the output when we
>> only require the flags output. The signed versions use s
Return the entire comparison expression, not just the cc_reg.
This will allow the routine to adjust the comparison code as
needed for TImode comparisons.
Note that some users were passing e.g. EQ to aarch64_gen_compare_reg
and then using gen_rtx_NE. Pass the proper code in the first place.
Use ccmp to perform all TImode comparisons branchless.
* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Expand all of
the comparisons for TImode, not just NE.
* config/aarch64/aarch64.md (cbranchti4, cstoreti4): New.
---
gcc/config/aarch64/aarch64.c | 130 +++
Modify aarch64_expand_subvti into a form that handles all
addition and subtraction, modulo, signed or unsigned overflow.
Use expand_insn to put the operands into the proper form,
and do not force values into register if not required.
* config/aarch64/aarch64.c (aarch64_ti_split): New.
While cmp (extended register) and cmp (immediate) interpret register 31
as SP, cmp (shifted register) interprets it as XZR. So we can perform
cmp xzr, x0. For ccmp, the first operand is a plain register, where
register 31 likewise means XZR.
* config/aarch64/aarch64.md (cmp<mode>): For operand 0, use
aarch64_reg_or_zero. Shuffle reg/reg to last alternative
and a
* config/aarch64/aarch64.md (absti2): New.
---
gcc/config/aarch64/aarch64.md | 30 ++
1 file changed, 30 insertions(+)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 284a8038e28..7a112f89487 100644
--- a/gcc/config/aarch64/aarc
In a couple of places we open-code a special case of this
pattern into the more specific sub<mode>3_compare1_imm.
Centralize that special case into an expander.
* config/aarch64/aarch64.md (*sub<mode>3_compare1): Rename
from sub<mode>3_compare1.
(sub<mode>3_compare1): New expander.
---
gcc/config
Duplicate all usub_*_carryinC, but use xzr for the output when we
only require the flags output. The signed versions use sign_extend
instead of zero_extend for combine's benefit.
These will be used shortly for TImode comparisons.
* config/aarch64/aarch64.md (cmp<mode>3_carryinC): New.
Combine will fold immediate -1 differently than the other
*cmp*_carryinC* patterns. In this case we can use adcs
with an xzr input, and it occurs frequently when comparing
128-bit values to small negative constants.
* config/aarch64/aarch64.md (cmp<mode>_carryinC_m2): New.
---
gcc/config/aarch
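A hypothetical example of the shape being optimized, a 128-bit value
against a small negative constant:

int
lt_m2 (__int128 x)
{
  /* The constant -2 has an all-ones high half, which combine folds
     differently; per the message above, adcs with an xzr input
     covers it.  */
  return x < -2;
}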
The expander and the insn pattern did not match, leading to
recognition failures in expand.
* config/aarch64/aarch64.md (*add<mode>3_carryin): Accept zeros.
---
gcc/config/aarch64/aarch64.md | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/gcc/config/aarch64/aarch64.
! .L1:
  ret
  .p2align 2,,3
- .L6:
- bne .L4
- cmp x0, 1
- bhi .L1
  .L4:
  b doit
--- 11,19 ----
  subs x0, x0, x2
  sbc x1, x1, xzr
! cmp x0, 2
! sbcs xzr, x1, xzr
! blt .L4
  ret
.
On 3/19/20 8:47 AM, Wilco Dijkstra wrote:
> Hi Richard,
>
> Thanks for these patches - yes TI mode expansions can certainly be improved!
> So looking at your expansions for signed compares, why not copy the optimal
> sequence from 32-bit Arm?
>
> Any compare can be done in at most 2 instructions:
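The sequence referred to is presumably along these lines (sketch of
the expected codegen using the subtract-with-borrow trick; cc is the
unsigned "carry clear" condition):

/* 128-bit less-than with two flag-setting instructions:
     cmp  x0, x2         // low halves, sets borrow
     sbcs xzr, x1, x3    // high halves minus borrow, result discarded
   followed by cset w0, lt (signed) or cset w0, cc (unsigned).  */
int lt128s (__int128 a, __int128 b) { return a < b; }
int lt128u (unsigned __int128 a, unsigned __int128 b) { return a < b; }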
The first two arguments were "reversed", in that operand 0 was not
the output, but the input cc_reg. Remove operand 0 entirely, since
we can get the input cc_reg from within the operand 3 comparison
expression. This moves the output operand to index 0.
* config/aarch64/aarch64.md (@ccmpc
Return the entire comparison expression, not just the cc_reg.
This will allow the routine to adjust the comparison code as
needed for TImode comparisons.
Note that some users were passing e.g. EQ to aarch64_gen_compare_reg
and then using gen_rtx_NE. Pass the proper code in the first place.
Use ccmp to perform all TImode comparisons branchless.
* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Expand all of
the comparisons for TImode, not just NE.
* config/aarch64/aarch64.md (cbranchti4, cstoreti4): New.
---
gcc/config/aarch64/aarch64.c | 182 +++
Currently we use %k to interpret an aarch64_cond_code value.
This interpretation is done via an array, aarch64_nzcv_codes.
The rtl is neither hindered nor harmed by using the proper
nzcv value itself, so index the array earlier rather than later.
This makes it easier to compare the rtl to the assembly.
I
Use xzr for the output when we only require the flags output.
This will be used shortly for TImode comparisons.
* config/aarch64/aarch64.md (ucmp<mode>3_carryinC): New.
(*ucmp<mode>3_carryinC_z1): New.
(*ucmp<mode>3_carryinC_z2): New.
(*ucmp<mode>3_carryinC): New.
---
gcc/config/aarch64/a
While cmp (extended register) and cmp (immediate) interpret register 31
as SP, cmp (shifted register) interprets it as XZR. So we can perform
cmp xzr, x0. For ccmp, the first operand is a plain register, where
register 31 likewise means XZR.
* config/aarch64/aarch64.md (cmp<mode>): For operand 0, use
aarch64_reg_or_zero. Shuffle reg/reg to last alternative
and a
b.ne 20  // b.any
+ 1c: d65f03c0  ret
20: 1400  b 0
r~
Richard Henderson (6):
aarch64: Add ucmp_*_carryinC patterns for all usub_*_carryinC
aarch64: Adjust result of aarch64_gen_compare_reg
aarch64: Accept 0 as first argument to compares
aarch
I'm not sure what happened to v2. I can see it in my sent email, but it never
made it to the mailing list, and possibly not to Richard E. either.
So resending, with an extra testsuite fix for ilp32, spotted by Christophe.
Re thumb1, rather than an ifdef in config/arm/aarch-common.c, as I did in
On 11/19/19 9:29 AM, Christophe Lyon wrote:
> On Mon, 18 Nov 2019 at 20:54, Richard Henderson
> wrote:
>>
>> On 11/18/19 1:30 PM, Christophe Lyon wrote:
>>> I'm sorry to notice that the last test (asm-flag-6.c) fails to execute
>>> when compiling with -mab
On 11/18/19 1:30 PM, Christophe Lyon wrote:
> I'm sorry to notice that the last test (asm-flag-6.c) fails to execute
> when compiling with -mabi=ilp32. I have less details than for Arm,
> because here I'm using the Foundation Model as simulator instead of
> Qemu. In addition, I'm using an old versi
On 11/18/19 1:25 PM, Christophe Lyon wrote:
> Hi Richard
>
> On Thu, 14 Nov 2019 at 11:08, Richard Henderson
> wrote:
>>
>> Inspired by the tests in gcc.target/i386. Testing code generation,
>> diagnostics, and execution.
>>
>>
What I committed today does in fact ICE for thumb1, as you suspected.
I'm currently testing the following vs
arm-sim/
arm-sim/-mthumb
arm-sim/-mcpu=cortex-a15/-mthumb.
which, with the default cpu for arm-elf-eabi, should test all of arm, thumb1,
thumb2.
I'm not thrilled about the ifdef in
On 11/14/19 3:48 PM, Richard Earnshaw (lists) wrote:
> On 14/11/2019 10:07, Richard Henderson wrote:
>> Since all but a couple of lines is shared between the two targets,
>> enable them both at once.
>>
>> * config/arm/aarch-common-protos.h (arm_md_asm_adjust): D
On 11/14/19 3:39 PM, Richard Earnshaw (lists) wrote:
> Not had a chance to look at this in detail, but I don't see any support for
>
> 1) Thumb1 where we do not expose the condition codes at all
> 2) Thumb2 where we need IT instructions along-side the conditional
> instructions
> themselves.
>
>
On 11/13/19 8:35 PM, Jeff Law wrote:
> On 11/13/19 6:04 AM, Bernd Schmidt wrote:
>> The cc0 machinery allows for eliminating unnecessary comparisons by
>> examining the effect instructions have on the flags registers. I have
>> replicated that mechanism with a relatively modest amount of code based
On 11/14/19 2:08 PM, Kyrill Tkachov wrote:
> Hi Richard,
>
> On 11/14/19 10:07 AM, Richard Henderson wrote:
>> I've put the implementation into config/arm/aarch-common.c, so
>> that it can be shared between the two targets. This required
>> a little bit of cleanup
On 11/14/19 2:07 PM, Kyrill Tkachov wrote:
>
> On 11/14/19 10:07 AM, Richard Henderson wrote:
>> The existing definition using register class CC_REG does not
>> work because CC_REGNUM does not support normal modes, and so
>> fails to match register_operand. Use a non-re
Inspired by the tests in gcc.target/i386. Testing code generation,
diagnostics, and execution.
* gcc.target/arm/asm-flag-1.c: New test.
* gcc.target/arm/asm-flag-3.c: New test.
* gcc.target/arm/asm-flag-5.c: New test.
* gcc.target/arm/asm-flag-6.c: New test.
---
g
Inspired by the tests in gcc.target/i386. Testing code generation,
diagnostics, and execution.
* gcc.target/aarch64/asm-flag-1.c: New test.
* gcc.target/aarch64/asm-flag-3.c: New test.
* gcc.target/aarch64/asm-flag-5.c: New test.
* gcc.target/aarch64/asm-flag-6.c:
CC_NZmode is a more accurate description of what we require
from the mode, and matches up with the definition in aarch64.
Rename noov_comparison_operator to nz_comparison_operator
in order to match.
* config/arm/arm-modes.def (CC_NZ): Rename from CC_NOOV.
* config/arm/predicates.m
Since all but a couple of lines is shared between the two targets,
enable them both at once.
* config/arm/aarch-common-protos.h (arm_md_asm_adjust): Declare.
* config/arm/aarch-common.c (arm_md_asm_adjust): New.
* config/arm/arm-c.c (arm_cpu_builtins): Define
__GCC_
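A minimal, self-contained example of what this enables once
__GCC_ASM_FLAG_OUTPUTS__ is defined (the "=@cceq" constraint binds the
asm's Z flag to an output; eq follows the condition-name table quoted
later on this page):

static int
is_equal (long x, long y)
{
#ifdef __GCC_ASM_FLAG_OUTPUTS__
  int r;
  __asm__ ("cmp %1, %2" : "=@cceq" (r) : "r" (x), "r" (y) : "cc");
  return r;
#else
  return x == y;  /* Fallback when flag outputs are unavailable.  */
#endif
}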
Mirror arm in letting "c" match the condition code register.
* config/aarch64/constraints.md (c): New constraint.
---
gcc/config/aarch64/constraints.md | 4
1 file changed, 4 insertions(+)
diff --git a/gcc/config/aarch64/constraints.md
b/gcc/config/aarch64/constraints.md
index d0c3
The existing definition using register class CC_REG does not
work because CC_REGNUM does not support normal modes, and so
fails to match register_operand. Use a non-register constraint
and the cc_register predicate instead.
* config/arm/constraints.md (c): Use cc_register predicate.
---
* Add "hs" and "lo" as aliases of "cs" and "cc".
* Add unsigned cmp tests to asm-flag-6.c.
Richard Sandiford has given his ack for the aarch64 side.
I'm still looking for an ack for the arm side.
r~
Richard Henderson (6):
aarch64: Add "c" constraint
arm: Fix the
On 11/12/19 9:21 PM, Richard Sandiford wrote:
> Apart from the vc/vs thing you mentioned in the follow-up for 4/6,
> it looks like 4/6, 5/6 and 6/6 are missing "hs" and "lo". OK for
> aarch64 with those added.
Are those aliases for two of the other conditions? They're not in the list
within the
On 11/11/19 4:03 PM, Andreas Krebbel wrote:
> On 11.11.19 15:39, Richard Henderson wrote:
>> On 11/7/19 12:52 PM, Andreas Krebbel wrote:
>>> +; Such patterns get directly emitted by noce_emit_store_flag.
>>> +(define_insn_and_split "*cstorecc_z13"
On 11/7/19 12:52 PM, Andreas Krebbel wrote:
> +; Such patterns get directly emitted by noce_emit_store_flag.
> +(define_insn_and_split "*cstorecc_z13"
> +  [(set (match_operand:GPR 0 "register_operand" "=&d")
> + (match_operator:GPR 1 "s390_comparison"
> +
> +;; define_subst and associated attributes
> +
> +(define_subst "add_setq"
> +  [(set (match_operand:SI 0 "" "")
> +       (match_operand:SI 1 "" ""))]
> +  ""
> +  [(set (match_dup 0)
> +       (match_dup 1))
> +   (set (reg:CC APSRQ_REGNUM)
> +       (unspec:CC [(reg:CC APSRQ_REGNUM)] UNSPEC_Q_
On 11/8/19 11:54 AM, Richard Henderson wrote:
> +@table @code
> +@item eq
> +``equal'' or Z flag set
> +@item ne
> +``not equal'' or Z flag clear
> +@item cs
> +``carry'' or C flag set
> +@item cc
> +C flag clear
> +@item mi
> +
Inspired by the tests in gcc.target/i386. Testing code generation,
diagnostics, and execution.
* gcc.target/arm/asm-flag-1.c: New test.
* gcc.target/arm/asm-flag-3.c: New test.
* gcc.target/arm/asm-flag-5.c: New test.
* gcc.target/arm/asm-flag-6.c: New test.
---
g
Inspired by the tests in gcc.target/i386. Testing code generation,
diagnostics, and execution.
* gcc.target/aarch64/asm-flag-1.c: New test.
* gcc.target/aarch64/asm-flag-3.c: New test.
* gcc.target/aarch64/asm-flag-5.c: New test.
* gcc.target/aarch64/asm-flag-6.c:
Since all but a couple of lines is shared between the two targets,
enable them both at once.
* config/arm/aarch-common-protos.h (arm_md_asm_adjust): Declare.
* config/arm/aarch-common.c (arm_md_asm_adjust): New.
* config/arm/arm-c.c (arm_cpu_builtins): Define
__GCC_
The existing definition using register class CC_REG does not
work because CC_REGNUM does not support normal modes, and so
fails to match register_operand. Use a non-register constraint
and the cc_register predicate instead.
* config/arm/constraints.md (c): Use cc_register predicate.
---
CC_NZmode is a more accurate description of what we require
from the mode, and matches up with the definition in aarch64.
Rename noov_comparison_operator to nz_comparison_operator
in order to match.
* config/arm/arm-modes.def (CC_NZ): Rename from CC_NOOV.
* config/arm/predicates.m
done now and I could just use it in the kernel... ;-)
r~
Richard Henderson (6):
aarch64: Add "c" constraint
arm: Fix the "c" constraint
arm: Rename CC_NOOVmode to CC_NZmode
arm, aarch64: Add support for __GCC_ASM_FLAG_OUTPUTS__
arm: Add testsuite checks fo
Mirror arm in letting "c" match the condition code register.
* config/aarch64/constraints.md (c): New constraint.
---
gcc/config/aarch64/constraints.md | 4
1 file changed, 4 insertions(+)
diff --git a/gcc/config/aarch64/constraints.md
b/gcc/config/aarch64/constraints.md
index d0c3
On 9/25/19 3:54 PM, Joseph Myers wrote:
> On Fri, 20 Sep 2019, Richard Henderson wrote:
>
>> Tested on aarch64-linux (glibc) and aarch64-elf (installed newlib).
>>
>> The existing configure claims to be generated by 2.69, but there
>> are changes wrt the autoc
As diagnosed in the PR.
* config/aarch64/lse.S (LDNM): Ensure STXR output does not
overlap the inputs.
---
libgcc/config/aarch64/lse.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index a5f6673596
Tested on aarch64-linux (glibc) and aarch64-elf (installed newlib).
The existing configure claims to be generated by 2.69, but there
are changes wrt the autoconf distributed with Ubuntu 18. Nothing
that seems untoward though.
r~
* config/aarch64/lse-init.c: Include auto-target.h. Dis
On 9/18/19 5:58 AM, Kyrill Tkachov wrote:
> Thanks for this.
>
> I've bootstrapped and tested this patch series on systems with and without LSE
> support, both with and without patch [6/6], so 4 setups in total.
>
> It all looks clean for me.
>
> I'm favour of this series going in (modulo patch
* config/aarch64/aarch64.opt (-moutline-atomics): New.
* config/aarch64/aarch64.c (aarch64_atomic_ool_func): New.
(aarch64_ool_cas_names, aarch64_ool_swp_names): New.
(aarch64_ool_ldadd_names, aarch64_ool_ldset_names): New.
(aarch64_ool_ldclr_names, aarch64_o
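Usage sketch: with -moutline-atomics, an ordinary atomic builtin
compiles to a call into libgcc when LSE support is not known at
compile time (helper naming per the lse.S rationale; treat the exact
name as illustrative):

/* Becomes a call to an out-of-line CAS helper that tests the
   have-LSE flag and runs either the CAS instruction or an
   LDAXR/STLXR loop.  */
int
cas_int (int *p, int expected, int desired)
{
  return __atomic_compare_exchange_n (p, &expected, desired, 0,
                                      __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}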
This is the libgcc part of the interface -- providing the functions.
Rationale is provided at the top of libgcc/config/aarch64/lse.S.
* config/aarch64/lse-init.c: New file.
* config/aarch64/lse.S: New file.
* config/aarch64/t-lse: New file.
* config.host: Add t-lse
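A condensed sketch of what lse-init.c does: record at startup whether
the CPU implements LSE, for the out-of-line helpers to test (variable
name as later settled on in libgcc; HWCAP_ATOMICS is the kernel's
FEAT_LSE bit):

#include <stdbool.h>
#include <sys/auxv.h>

#ifndef HWCAP_ATOMICS
# define HWCAP_ATOMICS (1 << 8)
#endif

bool __aarch64_have_lse_atomics;

static void __attribute__ ((constructor))
init_have_lse_atomics (void)
{
  __aarch64_have_lse_atomics
    = (getauxval (AT_HWCAP) & HWCAP_ATOMICS) != 0;
}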
---
gcc/common/config/aarch64/aarch64-common.c | 6 --
gcc/config/aarch64/aarch64.c | 6 --
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/gcc/common/config/aarch64/aarch64-common.c
b/gcc/common/config/aarch64/aarch64-common.c
index 07c03253951..2bbf454eea9 1
This pattern will only be used with the __sync functions, because
we do not yet have a bare TImode atomic load.
* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Add support
for NE comparison of TImode values.
(aarch64_emit_load_exclusive): Add support for TImode.
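Since there is no bare TImode atomic load, the new compare-and-swap is
reachable through the __sync builtins, e.g. (sketch):

__int128
cas128 (__int128 *p, __int128 oldv, __int128 newv)
{
  /* A 16-byte CAS: an LDXP/STXP loop, or CASP with LSE.  */
  return __sync_val_compare_and_swap (p, oldv, newv);
}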
With aarch64_track_speculation, we had extra code to do exactly what the
!strong_zero_p path already did. The rest is reducing code duplication.
* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Disable
strong_zero_p for aarch64_track_speculation; unify some code paths;
* config/aarch64/aarch64.c (aarch64_print_operand): Allow integer
registers with %R.
---
gcc/config/aarch64/aarch64.c | 15 ---
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 232317d4a5a..9
-linux on a thunder x1.
I have not run tests on any platform supporting LSE, even qemu.
r~
Richard Henderson (6):
aarch64: Extend %R for integer registers
aarch64: Implement TImode compare-and-swap
aarch64: Tidy aarch64_split_compare_and_swap
aarch64: Add out-of-line functions for LSE
On 9/17/19 6:55 AM, Wilco Dijkstra wrote:
> Hi Kyrill,
>
>>> When you select a CPU the goal is that we optimize and schedule for that
>>> specific microarchitecture. That implies using atomics that work best for
>>> that core rather than outlining them.
>>
>> I think we want to go ahead with this
On 9/5/19 10:35 AM, Wilco Dijkstra wrote:
> Agreed. I've got a couple of general comments:
>
> * The option name -matomic-ool sounds too abbreviated. I think eg.
> -moutline-atomics is more descriptive and user friendlier.
Changed.
> * Similarly the exported __aa64_have_atomics variable could be