[Bug rtl-optimization/90378] [9/10 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm

2020-03-11 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

Christophe Lyon  changed:

   What|Removed |Added

   Last reconfirmed||2020-03-11
 CC||clyon at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #7 from Christophe Lyon  ---
I am able to reproduce the failure with the same commit mentioned by Maxim in
comment #3. Using a more recent trunk (Feb 21, 2020) made the problem
disappear.

I'm using -Os -flto -mthumb, with a GCC bootstrapped on an armv7 machine
(cortex-a15, NVidia jetson-tk1).

Like Maxim said in comment #1, if I copy the binary and runtime libs
(libgfortran, etc) to an ARMv8 machine with AArch32 mode support, the
execution is successful.

The failure looks like this:

[...]
increment 1 attempt 1 
increment size= 5.00e-02
sum of previous increments=0.00e+00

ilin=0
iteration 1

Segmentation fault


Under gdb:
Program received signal SIGSEGV, Segmentation fault.
0xb6deab58 in ?? () from /home/christophe.lyon/calculix.broken/lib/libc.so.6
(gdb) bt
#0  0xb6deab58 in ?? () from
/home/christophe.lyon/calculix.broken/lib/libc.so.6
#1  0xb6deb01e in ?? () from
/home/christophe.lyon/calculix.broken/lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)


When using valgrind (3.11.0), several errors are reported before reaching the
point where the code normally crashes, but execution continues:

[...]
increment 1 attempt 1 
increment size= 5.00e-02
sum of previous increments=0.00e+00

ilin=0
iteration 1

largest residual force= 203.899659
no convergence

iteration 2

Most of the errors are "Invalid write of size 4" or "Use of uninitialised value
of size 4" in bpabi.S lines 256-259, which correspond to macro expansion of
push_for_divide and pop_for_divide in aeabi_uldivmod. The errors are about
reading/writing in the stack.
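
For readers unfamiliar with the helper: the sketch below shows the kind of C
code that typically ends up calling __aeabi_uldivmod on a 32-bit Arm EABI
target, where GCC usually lowers an unsigned 64-bit divide/modulo to that
runtime routine. The function name div_and_mod is made up for illustration;
it is not taken from calculix.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 64-bit division and remainder.  On 32-bit Arm EABI targets
   GCC typically lowers this to a call to the runtime helper
   __aeabi_uldivmod, which returns both quotient and remainder.
   (div_and_mod is a hypothetical name used only for this sketch.) */
uint64_t div_and_mod(uint64_t n, uint64_t d, uint64_t *rem)
{
    assert(d != 0);   /* division by zero is undefined behaviour in C */
    *rem = n % d;
    return n / d;
}
```

Compiling such a function with -S for an Arm target and looking for
"bl __aeabi_uldivmod" is one way to confirm which call sites reach bpabi.S.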


When using valgrind (3.13.0) on ARMv8 hardware, it does not report any error,
so I'm puzzled: was it a bug in valgrind-3.11.0, or are some glibc ifuncs
changing the behaviour?


Anyway, I still don't know where the program crashes on ARMv7 hardware.

[Bug rtl-optimization/90378] [9/10 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm

2020-03-12 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

--- Comment #8 from Christophe Lyon  ---
I also tried to run the program under QEMU, and it works (doesn't crash).

[Bug rtl-optimization/90378] [9/10 regression] -Os -flto miscompiles 454.calculix after r266385 on Arm

2020-03-13 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90378

--- Comment #10 from Christophe Lyon  ---
I checked the stack settings on the ARMv7 and ARMv8 machines:
ARMv7: beced000-bed0e000 rw-p  00:00 0  [stack]
ARMv8: fff12000-fff33000 rw-p  00:00 0  [stack]

In both cases ulimit -a says:
stack size  (kbytes, -s) 8192
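
As a quick sanity check on the two [stack] lines, the mapped size is just
end minus start. The sketch below assumes the lines are in the usual
/proc/<pid>/maps format (exclusive end address); mapping_size is a
hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a mapping, given the start and (exclusive) end
   addresses printed on a maps line.  Hypothetical helper. */
uint32_t mapping_size(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Both mappings quoted above are 0x21000 bytes, i.e. 132 KiB. */
static_assert(0xbed0e000u - 0xbeced000u == 0x21000u, "ARMv7 stack mapping");
static_assert(0xfff33000u - 0xfff12000u == 0x21000u, "ARMv8 stack mapping");
```

So the currently mapped stack has the same size on both machines and is far
below the 8192 KB limit.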


I recompiled with "-Os -flto  -g -mthumb" (i.e. I added -g), and execution
sometimes reaches iteration 2, sometimes not.

Using gdb I got a bit more info:
Program received signal SIGSEGV, Segmentation fault.
_int_free (av=0xc59a20, p=, have_lock=12975032) at malloc.c:4088
4088    malloc.c: No such file or directory.

But this is random too, another run gave:
Program received signal SIGSEGV, Segmentation fault.
__brk (addr=0x1) at ../sysdeps/unix/sysv/linux/arm/brk.c:31
31  ../sysdeps/unix/sysv/linux/arm/brk.c: No such file or directory.


Other runs from the same gdb gave no backtrace at all, and at least one of them
reached iteration 4.

[Bug target/94317] gcc/config/arm/arm_mve.h:13907: strange assignment ?

2020-03-25 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94317

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #6 from Christophe Lyon  ---
There are lots of tests under gcc.target/arm/mve, but I think all of them just
check that the generated code contains the instruction we expect to generate
from the intrinsic. (All of them have dg-do compile.)

In the past I contributed Neon intrinsics executable tests (see
gcc.target/aarch64/advsimd-intrinsics) which caught several corner-case bugs.

It would probably be good to have similar things for MVE, but that's a huge
task.

[Bug middle-end/94339] New: [10 regression] ICE in tree_class_check_failed

2020-03-26 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94339

Bug ID: 94339
   Summary: [10 regression] ICE in tree_class_check_failed
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

I've noticed this ICE while building GDB with recent GCC trunk. It appeared
between r10-7336 and r10-7346.

x86_64-unknown-linux-gnu-g++ -g -O2 -c ada-lang.ii 
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/binutils-gdb.git~gdb-8.3.1-release/gdb/ada-lang.c:
In function ‘char* ada_main_name()’:
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/binutils-gdb.git~gdb-8.3.1-release/gdb/ada-lang.c:934:386:
internal compiler error: tree check: expected class ‘constant’, have ‘unary’
(nop_expr) in warnings_for_convert_and_check, at c-family/c-warn.c:1378
  934 |   main_program_name_addr = BMSYMBOL_VALUE_ADDRESS (msym);
  |
   
   
   
 ^
0x604b15 tree_class_check_failed(tree_node const*, tree_code_class, char
const*, int, char const*)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/tree.c:9763
0x9f8c69 tree_class_check(tree_node*, tree_code_class, char const*, int, char
const*)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/tree.h:3401
0x9f8c69 warnings_for_convert_and_check(unsigned int, tree_node*, tree_node*,
tree_node*)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/c-family/c-warn.c:1378
0x74330a cp_convert_and_check(tree_node*, tree_node*, int)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/cvt.c:676
0x6b9c2e convert_like_real
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/call.c:7844
0x6bb3bb perform_implicit_conversion_flags(tree_node*, tree_node*, int, int)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/call.c:11867
0x6c8794 perform_implicit_conversion(tree_node*, tree_node*, int)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/call.c:11879
0x6c8794 build_conditional_expr_1
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/call.c:5642
0x6c90be build_conditional_expr(op_location_t const&, tree_node*, tree_node*,
tree_node*, int)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/call.c:5705
0x9406f2 build_x_conditional_expr(unsigned int, tree_node*, tree_node*,
tree_node*, int)
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/typeck.c:6972
0x82a52a cp_parser_assignment_expression
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/parser.c:9828
0x82a7fc cp_parser_expression
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/parser.c:9992
0x8356ef cp_parser_primary_expression
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/parser.c:5359
0x84ec7d cp_parser_postfix_expression
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/parser.c:7219
0x831900 cp_parser_unary_expression
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/parser.c:8525
0x829312 cp_parser_cast_expression
   
/home/christophe.lyon/src/Linaro/abe/abe-master/mybuild/snapshots/gcc.git~master_rev_75fb811dfaa29d60a897924b0d1629577b90eee7/gcc/cp/parser.c:9416
0x829ddf cp_parser_simple_cast_expression
   
/home/christophe.lyon/src/Linaro/abe/abe-maste

[Bug middle-end/94339] [10 regression] ICE in tree_class_check_failed

2020-03-26 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94339

--- Comment #2 from Christophe Lyon  ---
Created attachment 48123
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48123&action=edit
ada-lang.ii.xz

[Bug tree-optimization/90332] New test case gcc.dg/vect/slp-reduc-sad-2.c in r270847 fails

2020-03-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90332

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #10 from Christophe Lyon  ---
Hi,
This caused a regression on aarch64:
FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/pr92420.c execution test

[Bug tree-optimization/94401] New: pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

Bug ID: 94401
   Summary: pr92420.c fails on aarch64 since r10-7415
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

I've noticed that the fix for PR90332 caused a regression on aarch64:

FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/pr92420.c execution test

[Bug tree-optimization/90332] New test case gcc.dg/vect/slp-reduc-sad-2.c in r270847 fails

2020-03-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90332

--- Comment #12 from Christophe Lyon  ---

> Can you open a new bugreport?

Sure, I filed PR94401

[Bug tree-optimization/94401] pr92420.c fails on aarch64 since r10-7415

2020-03-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94401

--- Comment #2 from Christophe Lyon  ---
The defaults are OK (either native or cross aarch64)

[Bug target/94445] New: gcc.target/arm/cmse/cmse-15.c fails for cortex-m33

2020-04-01 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94445

Bug ID: 94445
   Summary: gcc.target/arm/cmse/cmse-15.c fails for cortex-m33
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

I've noticed that when GCC is configured --target arm-none-eabi
--with-mode=thumb --with-cpu=cortex-m33, the cmse-15.c test fails:

FAIL: gcc.target/arm/cmse/cmse-15.c   -O2   scan-assembler-times
bl\\s+__gnu_cmse_nonsecure_call 6
FAIL: gcc.target/arm/cmse/cmse-15.c   -O3 -g   scan-assembler-times
bl\\s+__gnu_cmse_nonsecure_call 6
FAIL: gcc.target/arm/cmse/cmse-15.c   -Os   scan-assembler-times
bl\\s+__gnu_cmse_nonsecure_call 6


I've looked at the -O2 case, where there are 8 calls to
__gnu_cmse_nonsecure_call instead of the expected 6.

The testcase can then be reduced to:
=
typedef int __attribute__ ((cmse_nonsecure_call)) ns_foo_t (void);
typedef int s_bar_t (void);

typedef int (*s_bar_ptr) (void);

int nonsecure0 (ns_foo_t * ns_foo_p)
{
  return ns_foo_p ();
}
int secure0 (s_bar_t * s_bar_p)
{
  return s_bar_p ();
}
int secure2 (s_bar_ptr s_bar_p)
{
  return s_bar_p ();
}
=

secure0 and secure2 make use of __gnu_cmse_nonsecure_call instead of doing a
normal call.

If I comment out nonsecure0(), then the test passes (only "bx r0" is generated
for secure0/secure2).


With nonsecure0() un-commented as above, I've looked at
arm_function_ok_for_sibcall() which is called 3 times.

The first time (for nonsecure0):
 
public unsigned SI size  unit-size

align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7facbe670930>
visited var 
def_stmt GIMPLE_NOP
version:2
ptr-info 0x7facbe39c300>


The next two times (for secure0/secure2):
 
public unsigned SI size  unit-size

align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7facbe670930>

arg:0 
visited var 
def_stmt GIMPLE_NOP
version:2
ptr-info 0x7facbe3b5be8>>


 
public unsigned SI size  unit-size

align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7facbe670930>

arg:0 
visited var 
def_stmt GIMPLE_NOP
version:2
ptr-info 0x7facbe3b8b28>>


So in the last two cases, the function being called has an incorrect 'ns_foo_t'
type.

The fact that commenting out nonsecure0 makes the test pass makes me think that
some global state is modified when handling the  cmse_nonsecure_call attribute,
but I couldn't find such a problem in arm_handle_cmse_nonsecure_call().


I haven't yet found where that function type is attached to the 'fn' node,
maybe that's where the problem is.

I've also noticed that in the first case 'fn' is an ssa_name, while in the two
other ones, it is a nop_expr.



I've been investigating this for quite some time, so any clue is appreciated.

[Bug target/94445] gcc.target/arm/cmse/cmse-15.c fails for cortex-m33

2020-04-02 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94445

--- Comment #3 from Christophe Lyon  ---
I also checked that arm_handle_cmse_nonsecure_call correctly duplicates the
type.

[Bug tree-optimization/94456] New: ICE in aarch64/sve/pr87815.c since r10-7491

2020-04-02 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94456

Bug ID: 94456
   Summary: ICE in aarch64/sve/pr87815.c since r10-7491
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Since r10-7491, I've noticed a regression with an ICE:

FAIL: gcc.target/aarch64/sve/pr87815.c -march=armv8.2-a+sve (internal compiler
error)


/gcc/testsuite/gcc.target/aarch64/sve/pr87815.c: In function 'f':
/gcc/testsuite/gcc.target/aarch64/sve/pr87815.c:6:6: error: missing definition
for SSA_NAME: _50 in statement:
_31 = _50;
during GIMPLE pass: vect
/gcc/testsuite/gcc.target/aarch64/sve/pr87815.c:6:6: internal compiler error:
verify_ssa failed
0xfe9253 verify_ssa(bool, bool)
/gcc/tree-ssa.c:1208
0xc6c293 execute_function_todo
/gcc/passes.c:1992
0xc6cb75 execute_todo
/gcc/passes.c:2039
Please submit a full bug report,
with preprocessed source if appropriate.

[Bug tree-optimization/94043] [9 Regression] ICE in superloop_at_depth, at cfgloop.c:78

2020-04-02 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94043

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #22 from Christophe Lyon  ---
This caused an ICE on aarch64, see PR94456

[Bug tree-optimization/91322] [10 regression] alias-4 test failure

2020-04-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91322

--- Comment #5 from Christophe Lyon  ---
Created attachment 48183
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48183&action=edit
executable asm from objdump

[Bug tree-optimization/91322] [10 regression] alias-4 test failure

2020-04-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91322

--- Comment #6 from Christophe Lyon  ---
Created attachment 48184
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48184&action=edit
GCC passes dumps

[Bug tree-optimization/91322] [10 regression] alias-4 test failure

2020-04-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91322

--- Comment #7 from Christophe Lyon  ---
Created attachment 48185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48185&action=edit
qemu execution trace

[Bug target/94531] New: gcc.target/arm/its.c fails for cortex-m3

2020-04-08 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94531

Bug ID: 94531
   Summary: gcc.target/arm/its.c fails for cortex-m3
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

I've noticed that gcc.target/arm/its.c fails when targeting
cortex-m3 or m33, but that's probably true with all cortex-m versions.

The code generated at r206697 (just before the test was
introduced) was (max cond insns = 5):

test:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
cmp r0, #10
itete   gt
subgt   r0, r0, r1
suble   r0, r1, r0
addgt   r0, r0, #10
suble   r0, r0, #7
cmp r0, #0
it  gt
subgt   r0, r0, #3
bx  lr

At r206698 (arm.c (arm_v7m_tune): Set max_insns_skipped to 2),
which introduced its.c, the generated code is the same as above,
so it seems that r206698 was wrong and the problem was not caught
by the testcase.

The testcase started to fail at r249215 (Make
gcc.target/arm/its.c more robust), but the generated code was
still the same.

As of recent trunk, the generated code is:
test:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
cmp r0, #10
itete   gt
subgt   r1, r0, r1
suble   r1, r1, r0
addgt   r0, r1, #10
suble   r0, r1, #7
cmp r0, #0
it  gt
subgt   r0, r0, #3
bx  lr

That is, almost the same, except that r1 is used as a temporary instead
of r0 in some parts.

Reducing max cond insns to 1 produces:
test:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
cmp r0, #10
ble .L2
subs    r1, r0, r1
add r0, r1, #10
cmp r0, #0
it  gt
subgt   r0, r0, #3
bx  lr
.L2:
subs    r1, r1, r0
subs    r0, r1, #7
cmp r0, #0
it  gt
subgt   r0, r0, #3
bx  lr
Increasing max cond insns back to 5 produces the same code as with '2'.

Since there was little justification of the patches around this
test, I'm wondering whether the current code generation is bad, or whether
we should just change the test?
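
To make the listings easier to compare, here is a C rendering of what the
IT-block sequences above compute, reconstructed from the assembly only. It is
not the actual source of gcc.target/arm/its.c, and the names test_equiv and
its_selfcheck are made up.

```c
#include <assert.h>

/* C rendering of the quoted assembly (r0 = a, r1 = b).
   Not the real its.c source; a reconstruction for illustration. */
int test_equiv(int a, int b)
{
    int r;
    if (a > 10)
        r = (a - b) + 10;   /* subgt ; addgt */
    else
        r = (b - a) - 7;    /* suble ; suble */
    if (r > 0)
        r -= 3;             /* it gt ; subgt */
    return r;
}

/* A few hand-checked values. */
void its_selfcheck(void)
{
    assert(test_equiv(11, 1) == 17);   /* 11 - 1 + 10 = 20, then 20 - 3 */
    assert(test_equiv(0, 0) == -7);    /* 0 - 0 - 7, not positive */
}
```

The first if/else is what becomes either a four-instruction "itete" block or
the branchy version, depending on the max cond insns setting.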

[Bug target/52565] __builtin_va_arg(va, double); may fail on cortex-m3

2020-04-09 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52565

Christophe Lyon  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 CC||clyon at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Christophe Lyon  ---
(In reply to Ravaz from comment #0)
[...]
> The instruction at 0x810c forces the address used for the ldrd to be
> alligned to an 8 bytes boundary. The problem is that the double parameter is
> passed on register r2-r3 and pushed on the stack at the entry point of 
> vargTest function. Since the stack is aligned on 4 bytes boundary only the
> double value may be misaligned and as a consequence the
> __builtin_va_arg(vaList, double) function fails to retrive the correct
> value. 
> 
> Is this a bug?

AFAIC, the ARM ABI (AAPCS) mandates that the stack is aligned on 8 bytes, so
r2/r3 are pushed on an 8-byte boundary.

For the record, today's trunk generates (-O0):

vargTest:
@ args = 4, pretend = 16, frame = 16
@ frame_needed = 1, uses_anonymous_args = 1
@ link register save eliminated.
push    {r0, r1, r2, r3}
str fp, [sp, #-4]!
add fp, sp, #0
sub sp, sp, #20
add r3, fp, #8
str r3, [fp, #-16]
ldr r3, [fp, #-16]
add r3, r3, #7
bic r3, r3, #7
add r2, r3, #8
str r2, [fp, #-16]
ldrd    r2, [r3]
strd    r2, [fp, #-12]
ldrd    r2, [fp, #-12]
mov r0, r2
mov r1, r3
add sp, fp, #0
@ sp needed
ldr fp, [sp], #4
add sp, sp, #16
bx  lr

and -O2:
vargTest:
@ args = 4, pretend = 16, frame = 8
@ frame_needed = 0, uses_anonymous_args = 1
@ link register save eliminated.
push    {r0, r1, r2, r3}
sub sp, sp, #8
add r3, sp, #19
add r2, sp, #12
bic r3, r3, #7
ldrd    r0, [r3]
str r2, [sp, #4]
add sp, sp, #8
@ sp needed
add sp, sp, #16
bx  lr


So I think this is invalid.
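
For illustration, a minimal sketch of the two ingredients involved: the
round-up-to-8 that the "add r3, r3, #7 / bic r3, r3, #7" pair performs on the
va_list pointer, and a variadic function that fetches a double. The reporter's
vargTest source is not quoted here, so varg_test below is a hypothetical
stand-in with an assumed signature.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Round addr up to the next multiple of 8, as the quoted
   "add r3, r3, #7 ; bic r3, r3, #7" sequence does. */
uintptr_t align_up_8(uintptr_t addr)
{
    return (addr + 7u) & ~(uintptr_t)7u;
}

/* Hypothetical stand-in for the reporter's vargTest: one named
   argument followed by a double passed through the ellipsis. */
double varg_test(int first, ...)
{
    va_list ap;
    double d;

    (void)first;
    va_start(ap, first);
    d = va_arg(ap, double);
    va_end(ap);
    return d;
}

void varg_selfcheck(void)
{
    assert(align_up_8(0x1004u) == 0x1008u);  /* 4-byte aligned slot is skipped */
    assert(varg_test(0, 1.5) == 1.5);
}
```

With an 8-byte aligned stack and 16 bytes of pretend args, the rounded
address lands on the slot where r2/r3 were pushed.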

[Bug testsuite/94547] New: gcc.target/arm/acle/cde.c fails on armeb

2020-04-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94547

Bug ID: 94547
   Summary: gcc.target/arm/acle/cde.c fails on armeb
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

I've noticed that gcc.target/arm/acle/cde.c fails on
armeb-none-linux-gnueabihf.

gcc.target/arm/acle/cde.c   -O1   check-function-bodies test_cde_cx1da
gcc.target/arm/acle/cde.c   -O1   check-function-bodies test_cde_cx2da
gcc.target/arm/acle/cde.c   -O1   check-function-bodies test_cde_cx3da
gcc.target/arm/acle/cde.c   -O2   check-function-bodies test_cde_cx1da
gcc.target/arm/acle/cde.c   -O2   check-function-bodies test_cde_cx2da
gcc.target/arm/acle/cde.c   -O2   check-function-bodies test_cde_cx3da
gcc.target/arm/acle/cde.c   -O3 -g   check-function-bodies test_cde_cx1da
gcc.target/arm/acle/cde.c   -O3 -g   check-function-bodies test_cde_cx2da
gcc.target/arm/acle/cde.c   -O3 -g   check-function-bodies test_cde_cx3da
gcc.target/arm/acle/cde.c   -Os   check-function-bodies test_cde_cx1da
gcc.target/arm/acle/cde.c   -Os   check-function-bodies test_cde_cx2da
gcc.target/arm/acle/cde.c   -Os   check-function-bodies test_cde_cx3da
gcc.target/arm/acle/cde.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none -ffat-lto-objects  check-function-bodies test_cde_cx1da
gcc.target/arm/acle/cde.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none -ffat-lto-objects  check-function-bodies test_cde_cx2da
gcc.target/arm/acle/cde.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none -ffat-lto-objects  check-function-bodies test_cde_cx3da
gcc.target/arm/acle/cde.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects -ffat-lto-objects  check-function-bodies test_cde_cx1da
gcc.target/arm/acle/cde.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects -ffat-lto-objects  check-function-bodies test_cde_cx2da
gcc.target/arm/acle/cde.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects -ffat-lto-objects  check-function-bodies test_cde_cx3da


I don't know if that's supposed to work on big-endian, so maybe it's just a
matter of skipping the test in that config?

[Bug target/56550] cortex-m3: incorrect write to member of volatile packed structure

2020-04-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56550

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org
 Target||arm

--- Comment #1 from Christophe Lyon  ---
With today's trunk, the generated code with -mcpu=cortex-m3 -mthumb is:
main:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 1, uses_anonymous_args = 0
@ link register save eliminated.
push    {r7}
sub sp, sp, #12
add r7, sp, #0
mvn r3, #254
str r3, [r7, #4]
ldr r3, [r7, #4]
ldr r2, .L3
str r3, [r2, #7]@ unaligned
movs    r3, #0
mov r0, r3
adds    r7, r7, #12
mov sp, r7
@ sp needed
ldr r7, [sp], #4
bx  lr

Since cortex-m3 supports unaligned accesses, the use of
str r3, [r2, #7]@ unaligned
looks OK
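
For reference, a sketch of the kind of source that leads to such a store. The
original testcase is not quoted in this comment, so the structure layout and
the names regs, device and write_value are assumptions chosen only so that
the 32-bit member sits at offset 7, matching "[r2, #7]".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: seven leading bytes put the 32-bit member
   at offset 7 once the structure is packed. */
struct __attribute__((packed)) regs {
    uint8_t  pad[7];
    uint32_t value;
};

static_assert(offsetof(struct regs, value) == 7, "value must sit at offset 7");

/* Stand-in for the reporter's volatile object. */
volatile struct regs device;

void write_value(uint32_t v)
{
    /* On cores that allow unaligned word accesses this may become a
       single "str"; otherwise the compiler splits it into byte stores. */
    device.value = v;
}
```

The value 0xFFFFFF01 is what "mvn r3, #254" produces in the listing above.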

[Bug rtl-optimization/70164] [8/9/10 Regression] Code/performance regression due to poor register allocation on Cortex-M0

2020-04-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70164

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #24 from Christophe Lyon  ---
With today's trunk, I still see the same problem:

history_expand_line_internal:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 0, uses_anonymous_args = 0
push    {r0, r1, r2, r4, r5, r6, r7, lr}
ldr r3, .L3
ldr r6, .L3+4
ldr r7, [r3]
ldr r3, [r6]   ; <--- load of 'hist_verify' onto r3
movs    r5, r0
str r3, [sp, #4] ; <--- Spill
movs    r3, #0
str r3, [r6]
bl  pre_process_line
ldr r3, [sp, #4] ; <--- reload
movs    r4, r0
adds    r7, r7, r3
str r7, [r6]
cmp r5, r0
bne .L1
bl  str_len
adds    r0, r0, #1
bl  x_malloc
movs    r1, r4
bl  str_cpy
movs    r4, r0
.L1:
@ sp needed
movs    r0, r4
pop {r1, r2, r3, r4, r5, r6, r7, pc}
.L4:
.align  2
.L3:
.word   a1
.word   hist_verify

[Bug target/82038] Very poor optimization of constant multiply on ARM Cortex-M7

2020-04-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82038

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon  ---
With today's trunk, we get rid of the push/pop sequence:

_Z1fi:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
asrs    r1, r0, #31
mov r3, r0
lsls    r0, r0, #14
lsls    r1, r1, #14
orr r1, r1, r3, lsr #18
bx  lr

_Z1gi:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r1, r0
lsls    r0, r0, #14
asrs    r1, r1, #18
bx  lr

[Bug target/94576] Regression build newlib for Arm64

2020-04-14 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94576

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon  ---
The title and target mention arm64/aarch64 but your description uses
arm-none-eabi-gcc, so it looks like the problem occurs when building
arm/aarch32 code.

Could you clarify?

There was a regression last week on aarch64 while building newlib, but it
seems to have been a different error:
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543668.html
and it was fixed by:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a615ea71bc8fbf31b9bc71cb373a7ca5b9cca44a

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-14 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

--- Comment #8 from Christophe Lyon  ---

> Adding Christophe. I'm thinking the best approach right now is to revert
> given -mpure-code doesn't work at all on Thumb-1 targets - it still emits
> literal pools, switch tables etc. That's not pure code!

Do you have testcases that show these failures?

I did check some of the problematic testcases in the GCC testsuite when I
committed that patch. Did I miss some of them?

Can you point me to offending testcases and compiler options so that I can
reproduce them?

[Bug target/94576] Regression build newlib for Arm

2020-04-14 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94576

Christophe Lyon  changed:

   What|Removed |Added

 Target|aarch64 |arm
Summary|Regression build newlib for |Regression build newlib for
   |Arm64   |Arm

--- Comment #5 from Christophe Lyon  ---
(In reply to Trass3r from comment #4)

> It's -march=armv8.1-m.main+mve. arm-none-eabi is just the configured target
> name which hasn't been adapted.
> 

Yes, I saw that, so it's definitely arm/aarch32 mode, *not* arm64/aarch64.

You are using armv8.1-m architecture in aarch32 mode.

[Bug target/94595] New: gcc.target/arm/thumb2-cond-cmp-*.c fail for cortex-m

2020-04-14 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94595

Bug ID: 94595
   Summary: gcc.target/arm/thumb2-cond-cmp-*.c fail for cortex-m
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

I've noticed that gcc.target/arm/thumb2-cond-cmp-*.c fail when targeting
cortex-M CPUs.

For thumb2-cond-cmp-1.c, the code generated at svn r196196 for cortex-m3 was:
f:
cmp r1, #45
it  ne
cmpne   r0, #43
ite ne
movne   r0, #0
moveq   r0, #1
bx  lr

Since r204778, we generate:
f:
cmp r0, #43 @ 8 [c=4 l=2]  *arm_cmpsi_insn/0
ittte   ne
subne   r0, r1, #45 @ 66[c=8 l=6]  *p *arm_subsi3_insn/5
clzne   r0, r0  @ 67[c=8 l=6]  *p clzsi2
lsrne   r0, r0, #5  @ 68[c=8 l=6]  *p *arm_shiftsi3/1
moveq   r0, #1  @ 5 [c=8 l=6]  *p *thumb2_movsi_insn/1
bx  lr  @ 61[c=8 l=4]  *thumb2_return


but using -mcpu=cortex-a9 generates:
f:
cmp r1, #45 @ 26[c=20 l=6]  *cmp_ior/0
it  ne
cmpne   r0, #43
ite eq
moveq   r0, #1  @ 29[c=8 l=6]  *p *thumb2_movsi_insn/1
movne   r0, #0  @ 30[c=8 l=6]  *p *thumb2_movsi_insn/1
bx  lr  @ 33[c=8 l=4]  *thumb2_return


I'm not quite sure about the intent of the test (the comment says "Use
conditional compare"): strictly speaking, the cortex-m3 version does not use
conditional compares, but I'm wondering whether subne can be considered as a
conditional compare? If so, then only the testcases need an update in the
scan-assembler directive.

Or do we want to enforce the use of conditional compare instructions?

FWIW, these tests were introduced in 2011 (r178102) and started failing for
cortex-m in 2013 (r204778). Does anyone remember the context?
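
To show that the two code shapes compute the same result, here is a small C
model of each, reconstructed from the listings above (r0 = a, r1 = b). The
function names are made up, and clz32 is a portable stand-in that returns 32
for a zero input, like the Arm CLZ instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-leading-zeros; returns 32 when x == 0. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* What the cmp / cmpne (conditional compare) sequence computes. */
int f_cmp(int a, int b)
{
    return a == 43 || b == 45;
}

/* What the cmp / subne / clzne / lsrne / moveq sequence computes. */
int f_clz(int a, int b)
{
    if (a == 43)
        return 1;                                 /* moveq r0, #1 */
    return (int)(clz32((uint32_t)b - 45u) >> 5);  /* 1 only when b - 45 == 0 */
}

/* Compare both models on a small grid around the constants. */
void cond_cmp_selfcheck(void)
{
    for (int a = 40; a <= 46; a++)
        for (int b = 42; b <= 48; b++)
            assert(f_cmp(a, b) == f_clz(a, b));
}
```

The "clz then shift right by 5" idiom is a branch-free test for zero, which
is why no second compare instruction appears in the cortex-m3 output.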

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-14 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

--- Comment #11 from Christophe Lyon  ---
(In reply to Wilco from comment #10)
> 
> For example:
> 
> int x;
> int f1 (void) { return x; }
> 
> with eg. -O2 -mcpu=cortex-m0 -mpure-code I get:
> 
> movs    r3, #:upper8_15:#.LC1
> lsls    r3, #8
> adds    r3, #:upper0_7:#.LC1
> lsls    r3, #8
> adds    r3, #:lower8_15:#.LC1
> lsls    r3, #8
> adds    r3, #:lower0_7:#.LC1
> @ sp needed
> ldr r3, [r3]
> ldr r0, [r3, #40]
> bx  lr
> 
> That's an extra indirection through a literal... There should only be
> one ldr to read x.

Right, but the code is functional. I mentioned that problem when I submitted
the patch. I thought it was good to provide functionality and improve the
generated code later.
I wrote: "I haven't found yet how to make code for cortex-m0 apply upper/lower
relocations to "p" instead of .LC2. The current code looks functional, but
could be improved."
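
As an illustration of what the movs/lsls/adds chain with the four 8-bit
relocation operators does, the sketch below rebuilds a 32-bit value one byte
at a time, most significant byte first. build_from_bytes is a hypothetical
name, and the mapping of operators to byte positions in the comments
(upper8_15 = bits 31:24 down to lower0_7 = bits 7:0) is stated here as my
understanding of those operators.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: movs #:upper8_15: ; lsls #8 ; adds #:upper0_7: ;
             lsls #8 ; adds #:lower8_15: ; lsls #8 ; adds #:lower0_7: */
uint32_t build_from_bytes(uint32_t addr)
{
    uint32_t r;

    r = (addr >> 24) & 0xffu;                 /* movs: bits 31:24 */
    r = (r << 8) + ((addr >> 16) & 0xffu);    /* lsls, adds: bits 23:16 */
    r = (r << 8) + ((addr >> 8) & 0xffu);     /* lsls, adds: bits 15:8 */
    r = (r << 8) + (addr & 0xffu);            /* lsls, adds: bits 7:0 */

    assert(r == addr);   /* the chain reproduces the full 32-bit value */
    return r;
}
```

Applying the same chain to the symbol itself rather than to a literal-pool
label would remove the extra load discussed above.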

> 
> Big switch tables are produced for any Thumb-1 core, however I would expect
> Cortex-m0/m23 versions to look almost identical to the Cortex-m3 one, and
> use a sequence of comparisons instead of tables.
> 
> int f2 (int x, int y)
> {
>   switch (x)
>   {
> case 0: return y + 0;
> case 1: return y + 1;
> case 2: return y + 2;
> case 3: return y + 3;
> case 4: return y + 4;
> case 5: return y + 5;
>   }
>   return y;
> }
> 

I believe this is expected: as I wrote in my commit message
"CASE_VECTOR_PC_RELATIVE is now false with -mpure-code, to avoid generating
invalid assembly code with differences from symbols from two different sections
(the difference cannot be computed by the assembler)."

Maybe there's a possibility to tune this to detect cases where we can do
better?


> Immediate generation for common cases seems to be screwed up:
> 
> int f3 (void) { return 0x11000000; }
> 
> -O2 -mcpu=cortex-m0 -mpure-code:
> 
> movs    r0, #17
> lsls    r0, r0, #8
> lsls    r0, r0, #8
> lsls    r0, r0, #8
> bx  lr

This is not optimal, but functional, right?


> This also regressed Cortex-m23 which previously generated:
> 
> movs    r0, #136
> lsls    r0, r0, #21
> bx  lr
> Similar regressions happen with other immediates:
> 
> int f3 (void) { return 0x12345678; }
> 
> -O2 -mcpu=cortex-m23 -mpure-code:
> 
> movs    r0, #86
> lsls    r0, r0, #8
> adds    r0, r0, #120
> movt    r0, 4660
> bx  lr
> 
> Previously it was:
> 
> movw    r0, #22136
> movt    r0, 4660
> bx  lr
OK, I'll check how to fix that.
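
For the record, the arithmetic behind the quoted sequences can be checked
with a small C model. Both f3 sequences build 0x11000000, and both sequences
for the larger constant build 0x12345678; only the instruction count differs.
The function names below are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* cortex-m0, -mpure-code: movs #17, then three "lsls #8". */
uint32_t f3_shift_by_bytes(void)
{
    uint32_t r = 17;
    r <<= 8;
    r <<= 8;
    r <<= 8;
    return r;
}

/* cortex-m23, earlier output: movs #136 ; lsls #21. */
uint32_t f3_single_shift(void)
{
    return (uint32_t)136 << 21;
}

/* cortex-m23, -mpure-code: movs #86 ; lsls #8 ; adds #120 ; movt 4660. */
uint32_t big_movs_movt(void)
{
    uint32_t r = 86;
    r <<= 8;
    r += 120;
    return (r & 0xffffu) | ((uint32_t)4660 << 16);  /* movt replaces the top half */
}

/* cortex-m23, earlier output: movw #22136 ; movt 4660. */
uint32_t big_movw_movt(void)
{
    uint32_t r = 22136;                             /* movw zeroes the top half */
    return (r & 0xffffu) | ((uint32_t)4660 << 16);
}

static_assert(((uint32_t)17 << 24) == ((uint32_t)136 << 21),
              "both f3 sequences build the same constant");
```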


> Also relocations with a small offset should be handled within the
> relocation. I'd expect this to never generate an extra addition, let alone
> an extra literal pool entry:
> 
> int arr[10];
> int *f4 (void) { return &arr[1]; }
> 
> -O2 -mcpu=cortex-m3 -mpure-code generates the expected:
> 
> movw r0, #:lower16:.LANCHOR0+4
> movt r0, #:upper16:.LANCHOR0+4
> bx  lr
> 
> -O2 -mcpu=cortex-m23 -mpure-code generates this:
> 
> movw r0, #:lower16:.LANCHOR0
> movt r0, #:upper16:.LANCHOR0
> adds r0, r0, #4
> bx  lr

For cortex-m23, I get the same code with and without -mpure-code.

> 
> And cortex-m0 again inserts an extra literal load:
> 
> movs r3, #:upper8_15:#.LC0
> lsls r3, #8
> adds r3, #:upper0_7:#.LC0
> lsls r3, #8
> adds r3, #:lower8_15:#.LC0
> lsls r3, #8
> adds r3, #:lower0_7:#.LC0
> ldr r0, [r3]
> adds r0, r0, #4
> bx  lr
Yes, same problem as in f1()
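
To make the movs/lsls/adds pattern above easier to follow, here is a host-side sketch of how the four 8-bit relocations rebuild a 32-bit address one byte at a time. It assumes the usual meaning of the operators (upper8_15 = bits 31:24, upper0_7 = bits 23:16, lower8_15 = bits 15:8, lower0_7 = bits 7:0); build_address and check_build_address are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate the Thumb-1 sequence that materializes the address of SYM
   without a literal pool (hypothetical helper).  */
uint32_t build_address (uint32_t sym)
{
  uint32_t r3 = (sym >> 24) & 0xffu;   /* movs r3, #:upper8_15:sym */
  r3 <<= 8;                            /* lsls r3, #8 */
  r3 += (sym >> 16) & 0xffu;           /* adds r3, #:upper0_7:sym */
  r3 <<= 8;                            /* lsls r3, #8 */
  r3 += (sym >> 8) & 0xffu;            /* adds r3, #:lower8_15:sym */
  r3 <<= 8;                            /* lsls r3, #8 */
  r3 += sym & 0xffu;                   /* adds r3, #:lower0_7:sym */
  return r3;
}

/* The rebuilt value must equal the original address, with or without
   a small offset folded into the symbol value.  */
void check_build_address (void)
{
  assert (build_address (0x20001234u) == 0x20001234u);
  assert (build_address (0x20001234u + 4u) == 0x20001238u);
}
```

If the relocations were applied to the variable (or anchor plus offset) rather than to a .LC entry, these seven instructions would already yield the final address and the extra ldr would not be needed.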


So I think -mpure-code for v6m is not broken, but yes the generated code can be
improved. So this may not be relevant to this PR?

[Bug c++/94608] New: Fix for PR94426 causes a regression in g++.dg/lto/pr83720 on arm

2020-04-15 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94608

Bug ID: 94608
   Summary: Fix for PR94426 causes a regression in
g++.dg/lto/pr83720 on arm
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

The recent fix for PR94426 (g8d213cbbe1856e6088282aa0076646cec694b030)
causes regressions on arm:
g++.dg/lto/pr83720 cp_lto_pr83720_0.o assemble, -O0 -flto
-flto-partition=1to1 -fno-use-linker-plugin 
g++.dg/lto/pr83720 cp_lto_pr83720_0.o assemble, -O0 -flto
-flto-partition=none -fuse-linker-plugin
g++.dg/lto/pr83720 cp_lto_pr83720_0.o assemble, -O0 -flto
-fuse-linker-plugin -fno-fat-lto-objects 
g++.dg/lto/pr83720 cp_lto_pr83720_0.o assemble, -O2 -flto
-flto-partition=1to1 -fno-use-linker-plugin 
g++.dg/lto/pr83720 cp_lto_pr83720_0.o assemble, -O2 -flto
-flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects 
g++.dg/lto/pr83720 cp_lto_pr83720_0.o assemble, -O2 -flto
-fuse-linker-plugin

The logs say:
/gcc/testsuite/g++.dg/lto/pr83720_0.C:51:18: warning: 'virtual
void::be::f()' used but never defined

Does the testcase need an adjustment?

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-16 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

--- Comment #12 from Christophe Lyon  ---
I've posted a patch to fix the regression for your f3() examples:
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543993.html

[Bug target/94538] [10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-16 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

--- Comment #15 from Christophe Lyon  ---
(In reply to Wilco from comment #14)
> (In reply to Christophe Lyon from comment #11)
> > (In reply to Wilco from comment #10)
> 
> > Right, but the code is functional.
> 
> It doesn't avoid the literal load from flash which is exactly what pure-code
> and slow-flash-data is all about.

For f1 on M0, I can see:
.section .rodata.cst4,"aM",%progbits,4
.align  2
.LC0:
.word   .LANCHOR0
.section .text,"0x20000006",%progbits
[...]
f1:
movs r3, #:upper8_15:#.LC0
lsls r3, #8
adds r3, #:upper0_7:#.LC0
lsls r3, #8
adds r3, #:lower8_15:#.LC0
lsls r3, #8
adds r3, #:lower0_7:#.LC0
ldr r3, [r3]  @ 6 [c=10 l=2]  *thumb1_movsi_insn/8
ldr r0, [r3]  @ 7 [c=10 l=2]  *thumb1_movsi_insn/8
bx  lr
[...]
.bss
.align  2
.set .LANCHOR0,. + 0
.type   x, %object
.size   x, 4
x:
.space  4

So the 1st load is from .rodata.cst4 and the 2nd load is from bss, both of
which do not have the purecode bit set (unlike .text). Isn't that OK?

[Bug target/94604] support for the ETSI basic operations

2020-04-21 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94604

--- Comment #2 from Christophe Lyon  ---
I think this is provided by arm_acle.h:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/arm/arm_acle.h;h=6b08ffd4174c8d829fe5730f99cd8f28e300afab;hb=HEAD

You can see that saturating and DSP intrinsics have been recently added and
will be available with gcc-10.

You can get more details about ACLE (ARM C Language Extensions) from Arm:
https://developer.arm.com/architectures/system-architectures/software-standards/acle

There is a paragraph showing how to implement the ETSI basic operations using
the ACLE DSP intrinsics.

Does that cover your need?

[Bug rtl-optimization/70164] [8/9/10 Regression] Code/performance regression due to poor register allocation on Cortex-M0

2020-04-21 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70164

--- Comment #26 from Christophe Lyon  ---
For what CPU did you configure GCC?
With today's trunk I still see the same code as in comment #24.

I can get the same code as you have in comment #25 if I force -mcpu=cortex-a9.

The bug report is about cortex-m0, so you either need to configure GCC
--with-cpu=cortex-m0 or use -mcpu=cortex-m0

[Bug target/94743] New: IRQ handler implementation wrong when using __attribute__ ((interrupt("IRQ")))

2020-04-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

Bug ID: 94743
   Summary: IRQ handler implementation wrong when using
__attribute__ ((interrupt("IRQ")))
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

As described in https://bugs.linaro.org/show_bug.cgi?id=5614:

IRQ implementation when using __attribute__ ((interrupt("IRQ"))) by GCC does
not save/restore NEON scratch registers when using -mfloat-abi=hard
-mfpu=neon-fp16.

If the handler calls a function that uses Neon scratch registers, this will
corrupt the values seen by the interrupted program once the IRQ handler
completes.

An easy example uses memcpy in the handler:
=== irq_test.c ===

typedef struct {
int data[32];
} dummy_t ;

extern void foo(dummy_t d);


__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test(void) {

dummy_t d;
d.data[3] = 3;

foo(d);
}

===


Compile with arm-none-eabi-gcc  -mcpu=cortex-a9 -mtune=cortex-a9 -marm
-mfloat-abi=hard -mfpu=neon-fp16 -Ofast irq_test.c

The generated code looks like:

  21IRQ_HDLR_Test:
  22@ Interrupt Service Routine.
  23@ args = 0, pretend = 0, frame = 128
  24@ frame_needed = 0, uses_anonymous_args = 0
  25 0000 04E04EE2  sub lr, lr, #4
  26 0004 0F502DE9  push {r0, r1, r2, r3, ip, lr}
  27 0008 F0D04DE2  sub sp, sp, #240
  28 000c 0330A0E3  mov r3, #3
  29 0010 80108DE2  add r1, sp, #128
  30 0014 7020A0E3  mov r2, #112
  31 0018 0D00A0E1  mov r0, sp
  32 001c 7C308DE5  str r3, [sp, #124]
  33 0020 FEFFFFEB  bl  memcpy
  34 0024 70308DE2  add r3, sp, #112
  35 0028 0F0093E8  ldm r3, {r0, r1, r2, r3}
  36 002c FEFFFFEB  bl  foo
  37 0030 F0D08DE2  add sp, sp, #240
  38@ sp needed
  39 0034 0F90FDE8  ldmfd   sp!, {r0, r1, r2, r3, ip, pc}^


In this case, the newlib version of memcpy uses NEON code and will clobber NEON
registers.

To be safe, the IRQ handler should
push {d0-d7, d16-d31}

This can be costly in terms of CPU cycles for an IRQ handler, so maybe we could
think of adding another attribute, but it would be hard for end users to guess
that they might be using Neon registers implicitly (as here, where the compiler
calls memcpy).

[Bug testsuite/94763] UNRESOLVED scan assembler tests on arm-none-eabi

2020-04-26 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94763

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #1 from Christophe Lyon  ---
How do you configure GCC, and what flags do you use to run the tests?
They work for me, on several configurations of arm-none-eabi-gcc as a
cross-compiler.

[Bug target/94697] aarch64: bti j at function start instead of bti c

2020-04-27 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94697

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon  ---

> PR target/94697
> * gcc.target/aarch64/pr94697.c: New test.

The new test fails with -mabi=ilp32

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-04-27 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #1 from Christophe Lyon  ---
clang has implemented a warning for this case:
https://reviews.llvm.org/D28820

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-04-27 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #2 from Christophe Lyon  ---
I have a preliminary patch which generates:
vpush.64 {d0, d1, d2, d3, d4, d5, d6, d7}
vpush.64 {d16, d17, d18, d19, d20, d21, d22, d23, d24, d25, d26,
d27, d28, d29, d30, d31}

I'm not sure users would be happy with such long push sequences
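
To put a rough number on that cost, here is a back-of-the-envelope sketch. It assumes the AAPCS split where d0-d7 and d16-d31 are call-clobbered (d8-d15 are callee-saved), 8 bytes per D register, and it ignores FPSCR; the function names are made up.

```c
#include <assert.h>

enum { D_REG_BYTES = 8 };

/* Number of call-clobbered D registers an IRQ handler would have to
   save: d0-d7 always, plus d16-d31 when the FPU has 32 D registers.  */
unsigned int scratch_d_regs (int has_d32)
{
  return has_d32 ? 8u + 16u : 8u;
}

/* Extra stack space needed for those registers, in bytes.  */
unsigned int scratch_bytes (int has_d32)
{
  return scratch_d_regs (has_d32) * D_REG_BYTES;
}

/* Sanity-check the two configurations.  */
void check_scratch_cost (void)
{
  assert (scratch_bytes (1) == 192u);
  assert (scratch_bytes (0) == 64u);
}
```

So on a d32 FPU the two vpush instructions above store 24 registers, i.e. 192 bytes, on every entry to the handler, and the matching vpop loads them back on exit.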

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-04-28 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #3 from Christophe Lyon  ---
Maybe we could
- save the VFP registers as needed by default
- emit a warning "IRQ handler 'foo' saves VFP registers because it is not a
leaf function. If you know none of the callees will clobber the VFP registers
you can use the 'IRQ-dont-push-VFP-regs' attribute"
- implement this new attribute such that users can disable such long vpush
sequences

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-04-28 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

Christophe Lyon  changed:

   What|Removed |Added

 CC||ktkachov at gcc dot gnu.org

--- Comment #4 from Christophe Lyon  ---
Adding Kyrylo so that he can share Arm's thoughts.

[Bug target/94820] New: pr94780.c fails with ICE on aarch64

2020-04-28 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94820

Bug ID: 94820
   Summary: pr94780.c fails with ICE on aarch64
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

The new gcc.dg/pr94780.c test causes an ICE on aarch64:

/gcc/testsuite/gcc.dg/pr94780.c: In function 'foo':
/gcc/testsuite/gcc.dg/pr94780.c:8:1: internal compiler error: Segmentation
fault
0xd5f4af crash_signal
/gcc/toplev.c:328
0xe2b238 contains_struct_check
/gcc/tree.h:3400
0xe2b238 convert_nonlocal_reference_op
/gcc/tree-nested.c:1065
0x10cf01b walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*,
tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*),
void*, hash_set >*))
/gcc/tree.c:12000
0xa34d93 walk_gimple_op(gimple*, tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
/gcc/gimple-walk.c:268
0xa355e4 walk_gimple_stmt(gimple_stmt_iterator*, tree_node*
(*)(gimple_stmt_iterator*, bool*, walk_stmt_info*), tree_node* (*)(tree_node**,
int*, void*), walk_stmt_info*)
/gcc/gimple-walk.c:596
0xa357e8 walk_gimple_seq_mod(gimple**, tree_node* (*)(gimple_stmt_iterator*,
bool*, walk_stmt_info*), tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
/gcc/gimple-walk.c:51
0xa35682 walk_gimple_stmt(gimple_stmt_iterator*, tree_node*
(*)(gimple_stmt_iterator*, bool*, walk_stmt_info*), tree_node* (*)(tree_node**,
int*, void*), walk_stmt_info*)
/gcc/gimple-walk.c:606
0xa357e8 walk_gimple_seq_mod(gimple**, tree_node* (*)(gimple_stmt_iterator*,
bool*, walk_stmt_info*), tree_node* (*)(tree_node**, int*, void*),
walk_stmt_info*)
/gcc/gimple-walk.c:51
0xe23ac1 walk_body
/gcc/tree-nested.c:713
0xe24508 walk_function
/gcc/tree-nested.c:724
0xe24508 walk_all_functions
/gcc/tree-nested.c:789
0xe2a1af lower_nested_functions(tree_node*)
/gcc/tree-nested.c:3553
0x86f243 cgraph_node::analyze()
/gcc/cgraphunit.c:676
0x872ae3 analyze_functions
/gcc/cgraphunit.c:1227
0x8738a2 symbol_table::finalize_compilation_unit()
/gcc/cgraphunit.c:2974

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-04-28 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #6 from Christophe Lyon  ---
If we consider the initial testcase, it doesn't clobber any FP register
directly, but the compiler inserts a call to memcpy which does.

So IIUC your 1st suggestion, it would mean:
- save no FP register in the IRQ handler
- call a libgcc routine to save all FP registers+status registers (this routine
would have to decide about d16 vs d32 at runtime, unless we can rely on
multilibs -- it would mean defining mandatory d16 and d32 libgcc multilibs...)

Hence I like your simpler suggestion :-)
But I think we should help the user diagnose potential problems:

- maybe issue a warning when compiling an IRQ handler without
-mgeneral-regs-only. That might break some packages, but would force them to
check their code
- also emit a warning when calling a function defined in another unit (but we
should recurse)

However this does not help if the compiler inserts a call to memcpy which
happens to be using FPU code: the user would get a warning, but how could he
solve it? Avoid implicit use of memcpy?
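
For the initial testcase specifically, one thing a user could try is passing the structure by reference, so that no by-value copy (and hence no compiler-generated memcpy for it) is needed. This is only a sketch under assumptions: foo_by_ref is a hypothetical replacement for foo, last_seen exists only so the result can be observed, and the attribute is applied only when compiling for Arm so that the snippet also builds on a host. It does not guarantee that the compiler never emits a memcpy call elsewhere, nor that the callee avoids FP/Neon registers.

```c
#include <assert.h>

#if defined (__arm__)
#define IRQ_ATTR __attribute__ ((interrupt ("IRQ")))
#else
#define IRQ_ATTR /* Host build: plain function.  */
#endif

typedef struct {
  int data[32];
} dummy_t;

/* Records what the callee observed (illustration only).  */
int last_seen;

/* Hypothetical by-reference variant of foo().  */
void foo_by_ref (const dummy_t *d)
{
  last_seen = d->data[3];
}

IRQ_ATTR void IRQ_HDLR_Test (void)
{
  dummy_t d;
  d.data[3] = 3;

  /* Only a pointer is passed: no 128-byte copy of the structure.  */
  foo_by_ref (&d);
}
```

On a host build the handler is an ordinary function and can be called directly to check the behaviour; on Arm it must of course only be entered through the IRQ vector.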

[Bug target/57002] ARM back end has extra entries in attribute interrupt array.

2020-04-29 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57002

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon  ---
Patch sent: https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544871.html

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-04-29 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

Christophe Lyon  changed:

   What|Removed |Added

   Last reconfirmed||2020-04-29
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |clyon at gcc dot gnu.org

--- Comment #8 from Christophe Lyon  ---
Patch sent: https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544872.html

This is a simple improvement, hopefully simple enough for stage 4, yet useful
for the end-users.

[Bug target/57002] ARM back end has extra entries in attribute interrupt array.

2020-04-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57002

Christophe Lyon  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Christophe Lyon  ---
Fixed on trunk.

[Bug testsuite/94763] UNRESOLVED scan assembler tests on arm-none-eabi

2020-04-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94763

--- Comment #3 from Christophe Lyon  ---
(In reply to vvinayag from comment #2)

> Sorry for the late reply. 
> The tests appear to pass when I invoke them locally. They only failed as
> part of our buildbot run tests. It could be a glitch in our test system. But
> I was wondering whether there were any recent changes in the testsuites.

If you still have the .log file, you could check why the tests are UNRESOLVED,
there's probably an error message nearby.

[Bug target/94538] [9/10 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-04-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

Christophe Lyon  changed:

   What|Removed |Added

  Known to work||9.2.0

--- Comment #18 from Christophe Lyon  ---
I've just checked that the testcase passes with 9.2.0

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-04 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #9 from Christophe Lyon  ---

> My initial thoughts are along the lines of...
> Only try to save FP registers that this function directly clobbers.
What's the point of saving these if a callee clobbers other registers?

Shouldn't that be something like save-nothing vs save-all-FP-regs if there is a
callee?

Do you mean save direct clobbers only when the handler is a leaf function?

> Provide libgcc routines to save/restore the FP context.
Do you mean such routines should push all FP regs+status regs?

> Or we could say simply:
> interrupt routines should be compiled as if with -mgeneral-regs-only and if
> they want to call some routine that uses FP then they must take it upon
> themselves to save and restore the FP context.

[Bug c++/94896] [10/11 regression] ICE: canonical types differ for identical types

2020-05-04 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94896

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #6 from Christophe Lyon  ---
For the record, this introduces regressions on arm-linux-gnueabihf too:
g++.dg/cpp0x/alignas3.C  -std=c++14 (internal compiler error)
g++.dg/cpp0x/alignas3.C  -std=c++17 (internal compiler error)
g++.dg/cpp0x/alignas3.C  -std=c++2a (internal compiler error)
g++.dg/other/pr54300.C  -std=gnu++14 (internal compiler error)
g++.dg/other/pr54300.C  -std=gnu++17 (internal compiler error)
g++.dg/other/pr54300.C  -std=gnu++2a (internal compiler error)
g++.dg/other/pr54300.C  -std=gnu++98 (internal compiler error)

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-04 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #11 from Christophe Lyon  ---
(In reply to Richard Earnshaw from comment #10)
> (In reply to Christophe Lyon from comment #9)
> > > My initial thoughts are along the lines of...
> > > Only try to save FP registers that this function directly clobbers.
> > What's the point of saving these if a callee clobbers other registers?
> > 
> 
> They need to be done early enough to ensure that any code in *this* function
> does not clobber them.  Any additional registers would have to be saved by a
> library call that does that.
> 
Why do we need a library function for that? It would have to be special with
the stack: push FP registers, but do not restore SP, so that the dual restore
function can pop them and restore SP.

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-04 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #13 from Christophe Lyon  ---

> > Why do we need a library function for that? It would have to be special with
> > the stack: push FP registers, but do not restore SP, so that the dual
> > restore function can pop them and restore SP.
> 
> Because it's a lot of code to work out how many FP registers there are.  You
> can't assume that the FPU used to compile the interrupt handler is the same
> as that being used at run time.

Ha, I had missed that point in your comment #5. I'll re-iterate on my WIP
patch.

But, in general (non-interrupt) code, what is supposed to happen if you compile
for a d32 VFP and run on a d16 one (and the code uses the extra registers)?
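
Regarding the run-time decision between d16 and d32 discussed above, my understanding is that the size of the register bank can be read from the low four bits of MVFR0 (1 means 16 D registers, 2 means 32). The sketch below only decodes a value passed in as a parameter; how a library routine would obtain MVFR0 (a VMRS, in a privileged mode) is left out, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the register-bank field (bits 3:0) of an MVFR0 value.
   Returns the number of D registers, or 0 if the field holds a value
   this sketch does not know about.  */
unsigned int d_regs_from_mvfr0 (uint32_t mvfr0)
{
  switch (mvfr0 & 0xfu)
    {
    case 1: return 16;
    case 2: return 32;
    default: return 0;
    }
}

/* Check the two encodings assumed above and the fallback.  */
void check_mvfr0_decode (void)
{
  assert (d_regs_from_mvfr0 (0x1u) == 16u);
  assert (d_regs_from_mvfr0 (0x2u) == 32u);
  assert (d_regs_from_mvfr0 (0x0u) == 0u);
}
```

A save/restore routine in libgcc could branch on such a value to decide whether d16-d31 exist and need saving, which is the part that cannot be known when the handler is compiled.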

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-05 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #15 from Christophe Lyon  ---

> Well obviously that won't work.  But if you build the interrupt routine with
> a d16 system and then call a function from it that requires d32 then that
> should still work if running on a d32 CPU.

Thanks, I hadn't thought of this combination.

> 
> I think we can probably make that work, but it's probably a bit of a dance
> to get it all right.  Hence the suggestion that this be done in a library
> function.

I suspect I'll have to iterate a few times to get that function right: I
haven't yet checked the interactions with the secure/non-secure modes,
LSPEN/ASPEN (I've noticed CMSE code in GCC that takes care of FP registers).

So what about adding a simple warning along the lines of comment #5 and comment
#6, like the one I posted (comment #8, but maybe it should also make sure to
call __aeabi_memcpy instead of memcpy?)

A second step would then make it possible not to use -mgeneral-regs-only and to
save whatever is needed. I am wondering whether we could introduce other
attributes such as:
- "irq-nosave-fp-regs" basically saying the user does not want to save FP
registers; this would clear the warning
- "irq-save-fp-regs", asking the compiler to save all the needed regs despite
the penalty; this would also avoid the warning

At least that would make users think about their code, but we'd need to
document that properly :-)

I've noticed that several existing tests fail because of my new warning if the
target defaults to float-abi=hard, depending on the default cpu/mode:

gcc.misc-tests/arm-isr.c (test for excess errors)
gcc.target/arm/empty_fiq_handler.c (test for excess errors)
gcc.target/arm/interrupt-1.c (test for excess errors)
gcc.target/arm/interrupt-2.c (test for excess errors)
gcc.target/arm/pr70830.c (test for excess errors)

so I'd need to change their attribute or compile them with -mgeneral-regs-only

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-05 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #16 from Christophe Lyon  ---
Another potential issue just came to my mind: what if the IRQ handler is
compiled with -mfloat-abi=soft but calls a function compiled with
-mfloat-abi=softfp? We have no way to guess that the FP registers can be
clobbered when compiling the handler.

[Bug target/95055] New: gcc.dg/compat/scalar-by-value-3 fails on aarch64 after r11-165-geb72dc663e9070b281be83a80f6f838a3a878822

2020-05-11 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95055

Bug ID: 95055
   Summary: gcc.dg/compat/scalar-by-value-3 fails on aarch64 after
r11-165-geb72dc663e9070b281be83a80f6f838a3a878822
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

After r11-165-geb72dc663e9070b281be83a80f6f838a3a878822, I've noticed that
scalar-by-value-3 fails on aarch64:
gcc.dg/compat/scalar-by-value-3 c_compat_x_tst.o-c_compat_y_tst.o execute 
gcc.dg/compat/scalar-by-value-4 c_compat_x_tst.o-c_compat_y_tst.o execute 
gcc.dg/compat/scalar-by-value-5 c_compat_x_tst.o-c_compat_y_tst.o execute 
gcc.dg/compat/scalar-by-value-6 c_compat_x_tst.o-c_compat_y_tst.o execute 
gcc.dg/compat/scalar-return-3 c_compat_x_tst.o-c_compat_y_tst.o execute 
gcc.dg/compat/scalar-return-4 c_compat_x_tst.o-c_compat_y_tst.o execute

[Bug target/95056] New: slp-perm-9.c fails on aarch64 after gbc484e250990393e887f7239157cc85ce6fadcce

2020-05-11 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95056

Bug ID: 95056
   Summary: slp-perm-9.c fails on aarch64 after
gbc484e250990393e887f7239157cc85ce6fadcce
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

I've noticed that
FAIL: gcc.dg/vect/slp-perm-9.c -flto -ffat-lto-objects  scan-tree-dump-times
vect "permutation requires at least three vectors" 1
FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "permutation requires
at least three vectors" 1

on aarch64

since commit gbc484e250990393e887f7239157cc85ce6fadcce

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-13 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #17 from Christophe Lyon  ---
(In reply to Richard Earnshaw from comment #10)
> (In reply to Christophe Lyon from comment #9)
> > > My initial thoughts are along the lines of...
> > > Only try to save FP registers that this function directly clobbers.
> > What's the point of saving these if a callee clobbers other registers?
> > 
> 
> They need to be done early enough to ensure that any code in *this* function
> does not clobber them.  Any additional registers would have to be saved by a
> library call that does that.

So if this function clobbers, say d16-d17, but calls another function, do you
mean we should
vpush d16-d17
then call the new lib function which saves all the FP context (thus saving
d16-d17 twice)?

> 
> > Shouldn't that be something like save-nothing vs save-all-FP-regs if there
> > is a callee?
> > 
> > Do you mean save direct clobbers only when the handler is a leaf function?
> 
> Well, obviously if it's a leaf function, saving only the registers that are
> clobbered is enough, and the compiler can do the analysis to ensure that.

I'm working on this, and just realized that this also means saving FPSR. It
seems there's no support for that yet in arm.md (unlike aarch64.md), am I
missing something?

> 
> > 
> > > Provide libgcc routines to save/restore the FP context.
> > Do you mean such routines should push all FP regs+status regs?
> 
> All registers that are call clobbered.  There's no need to do the
> call-saved registers as the compiler can do that on an as-needed case
> already.

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-14 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #18 from Christophe Lyon  ---

> I'm working on this, and just realized that this also means saving FPSR. It
> seems there's no support for that yet in arm.md (unlike aarch64.md), am I
> missing something?
> 

Sorry, I see it's called FPSCR for arm.

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-05-14 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #19 from Christophe Lyon  ---
(In reply to Christophe Lyon from comment #8)
> Patch sent: https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544872.html
> 
> This is a simple improvement, hopefully simple enough for stage 4, yet
> useful for the end-users.

I have just sent an updated version of this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545747.html

Maybe that would be sufficient to consider this PR fixed?

Indeed I fear we open a can of worms, as I've also realized that at least some
Cortex-M cores save part of the FP registers when taking an interrupt; the
number depends on several parameters (e.g. secure/non-secure; see "Exception
entry" in the Cortex-M33 GUG, for instance:
https://static.docs.arm.com/100235/0002/arm_cortex_m33_dgug_100235_0002_00_en.pdf)

I haven't found such documentation for Cortex-A, so I'm not sure if they have
the same behaviour.

I have attached a WIP patch that demonstrates local saving of FP registers as
an attachment to https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545754.html

[Bug tree-optimization/95273] [11 regression] many ICEs after r11-564

2020-05-26 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95273

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org
 Target|powerpc64*-linux-gnu|powerpc64*-linux-gnu
   ||aarch64 arm

--- Comment #4 from Christophe Lyon  ---
If that helps:

on aarch64:
gcc.dg/vshift-5.c (internal compiler error)

on arm-linux-gnueabihf (--with-fpu neon-fp16):
gcc.dg/pr48616.c (internal compiler error)
gcc.dg/pr86179.c (internal compiler error)
gcc.dg/torture/pr66856-1.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
gcc.dg/torture/pr66856-1.c   -O3 -g  (internal compiler error)
gcc.dg/torture/pr66856-2.c   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
gcc.dg/torture/pr66856-2.c   -O3 -g  (internal compiler error)
gcc.dg/torture/pr93428.c   -O1  (internal compiler error)
gcc.dg/torture/pr93428.c   -O2  (internal compiler error)
gcc.dg/torture/pr93428.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
gcc.dg/torture/pr93428.c   -O3 -g  (internal compiler error)
gcc.dg/torture/pr93428.c   -Os  (internal compiler error)
gcc.dg/vect/bb-slp-over-widen-1.c (internal compiler error)
gcc.dg/vect/bb-slp-over-widen-1.c -flto -ffat-lto-objects (internal
compiler error)
gcc.dg/vect/bb-slp-over-widen-2.c (internal compiler error)
gcc.dg/vect/bb-slp-over-widen-2.c -flto -ffat-lto-objects (internal
compiler error)
gcc.dg/vect/pr33369.c (internal compiler error)
gcc.dg/vect/pr33369.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr33953.c (internal compiler error)
gcc.dg/vect/pr33953.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr46049.c (internal compiler error)
gcc.dg/vect/pr46049.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr46126.c (internal compiler error)
gcc.dg/vect/pr46126.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr51581-3.c (internal compiler error)
gcc.dg/vect/pr51581-3.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr51581-4.c (internal compiler error)
gcc.dg/vect/pr51581-4.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/slp-36.c (internal compiler error)
gcc.dg/vect/slp-36.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/slp-multitypes-13.c (internal compiler error)
gcc.dg/vect/slp-multitypes-13.c -flto -ffat-lto-objects (internal compiler
error)
gcc.dg/vect/vect-avg-15.c (internal compiler error)
gcc.dg/vect/vect-avg-15.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/vect-avg-16.c (internal compiler error)
gcc.dg/vect/vect-avg-16.c -flto -ffat-lto-objects (internal compiler error)
gcc.target/arm/pr53636.c (internal compiler error)

[Bug other/95362] New: [11 regression] pr34457-1.c fails on arm since ga746f952abb78af9db28a7f3bce442e113877c9c

2020-05-27 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95362

Bug ID: 95362
   Summary: [11 regression] pr34457-1.c fails on arm since
ga746f952abb78af9db28a7f3bce442e113877c9c
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

since ga746f952abb78af9db28a7f3bce442e113877c9c, I've noticed that
pr34457-1.c fails on arm and aarch64:

FAIL: gcc.dg/pr34457-1.c (internal compiler error)
FAIL: gcc.dg/pr34457-1.c (test for excess errors)
Excess errors:
during IPA pass: cp
lto1: internal compiler error: in operator[], at vec.h:867
0x99feed vec::operator[](unsigned int)
/gcc/vec.h:867
0x99feed vec::operator[](unsigned int)
/gcc/vec.h:1433
0x99feed lto_symtab_encoder_deref
/gcc/lto-streamer.h:1173
0x99feed ipa_prop_read_section
/gcc/ipa-prop.c:5060
0x99feed ipa_prop_read_jump_functions()
/gcc/ipa-prop.c:5089
0xaf2fb1 ipa_read_summaries_1
/gcc/passes.c:2837
0x64b9a5 read_cgraph_and_symbols(unsigned int, char const**)
/gcc/lto/lto-common.c:2921
0x62d432 lto_main()
/gcc/lto/lto.c:625
lto-wrapper: fatal error:
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/xgcc
returned 1 exit status
compilation terminated.
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/arm-none-linux-gnueabi/bin/ld: error:
lto-wrapper failed

[Bug other/95362] [11 regression] pr34457-1.c and pr92088-1.c fail on arm and aarch64 since ga746f952abb78af9db28a7f3bce442e113877c9c

2020-05-27 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95362

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11 regression] pr34457-1.c |[11 regression] pr34457-1.c
                   |fails on arm since          |and pr92088-1.c fail on arm
                   |ga746f952abb78af9db28a7f3bc |and aarch64 since
                   |e442e113877c9c              |ga746f952abb78af9db28a7f3bc
                   |                            |e442e113877c9c

--- Comment #1 from Christophe Lyon  ---
FAIL: gcc.dg/torture/pr92088-1.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
FAIL: gcc.dg/torture/pr92088-1.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (test for excess errors)
Excess errors:
during IPA pass: cp
lto1: internal compiler error: Segmentation fault
0xbd698f crash_signal
/gcc/toplev.c:328
0x8b9244 unshare_expr_without_location(tree_node*)
/gcc/gimplify.c:1039
0x9926eb ipa_set_jf_constant
/gcc/ipa-prop.c:539
0x99e0b9 ipa_read_jump_function
/gcc/ipa-prop.c:4629
0x99f357 ipa_read_edge_info
/gcc/ipa-prop.c:4909
0x99fb91 ipa_read_node_info
/gcc/ipa-prop.c:4978
0x99fb91 ipa_prop_read_section
/gcc/ipa-prop.c:5062
0x99fb91 ipa_prop_read_jump_functions()
/gcc/ipa-prop.c:5089
0xaf2fb1 ipa_read_summaries_1
/gcc/passes.c:2837
0x64b9a5 read_cgraph_and_symbols(unsigned int, char const**)
/gcc/lto/lto-common.c:2921
0x62d432 lto_main()
/gcc/lto/lto.c:625
lto-wrapper: fatal error:
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/xgcc
returned 1 exit status
compilation terminated.
collect2: fatal error: lto-wrapper returned 1 exit status
compilation terminated.
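
Both backtraces in this PR go through the same LTO summary-reading code. The following is only a sketch (Python, not part of GCC) of how the frame names of two ICE backtraces could be extracted and intersected; parse_frames and common_frames are hypothetical helper names, and the two inputs are abbreviated from the logs in this PR.

```python
import re

# Matches a frame line such as "0x99feed ipa_prop_read_section".
FRAME_RE = re.compile(r"^0x[0-9a-f]+ (.+)$")


def parse_frames(backtrace):
    """Return function names (argument lists stripped), innermost first."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append(match.group(1).split("(")[0])
    return frames


def common_frames(first, second):
    """Return the frames of `first` that also appear in `second`, in order."""
    seen = set(parse_frames(second))
    result = []
    for name in parse_frames(first):
        if name in seen and name not in result:
            result.append(name)
    return result


PR34457 = """\
0x99feed vec::operator[](unsigned int)
/gcc/vec.h:867
0x99feed lto_symtab_encoder_deref
/gcc/lto-streamer.h:1173
0x99feed ipa_prop_read_section
/gcc/ipa-prop.c:5060
0x99feed ipa_prop_read_jump_functions()
/gcc/ipa-prop.c:5089
0xaf2fb1 ipa_read_summaries_1
/gcc/passes.c:2837
"""

PR92088 = """\
0x8b9244 unshare_expr_without_location(tree_node*)
/gcc/gimplify.c:1039
0x9926eb ipa_set_jf_constant
/gcc/ipa-prop.c:539
0x99fb91 ipa_prop_read_section
/gcc/ipa-prop.c:5062
0x99fb91 ipa_prop_read_jump_functions()
/gcc/ipa-prop.c:5089
0xaf2fb1 ipa_read_summaries_1
/gcc/passes.c:2837
"""

print(common_frames(PR34457, PR92088))
```

With these inputs the shared frames are ipa_prop_read_section, ipa_prop_read_jump_functions and ipa_read_summaries_1, which is consistent with both failures being reported "during IPA pass: cp" while reading summaries.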

[Bug tree-optimization/95363] New: [11 regression] bb-slp-pr95271.c fails on arm since gc0e27f72358794692e367363940c6383e9ad1e45

2020-05-27 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95363

Bug ID: 95363
   Summary: [11 regression] bb-slp-pr95271.c fails on arm since
gc0e27f72358794692e367363940c6383e9ad1e45
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Since gc0e27f72358794692e367363940c6383e9ad1e45, I've noticed that
gcc.dg/vect/bb-slp-pr95271.c fails on arm (that's when it was introduced):

FAIL: gcc.dg/vect/bb-slp-pr95271.c (test for excess errors)
Excess errors:
/gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c:13:21: warning: left shift count >=
width of type [-Wshift-count-overflow]
/gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c:13:60: warning: left shift count >=
width of type [-Wshift-count-overflow]
/gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c:14:38: warning: left shift count >=
width of type [-Wshift-count-overflow]
/gcc/testsuite/gcc.dg/vect/bb-slp-pr95271.c:15:38: warning: left shift count >=
width of type [-Wshift-count-overflow]

Do you want to use 'long long' instead of 'long' for variable 'd'?
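
For context, a sketch of why the warning can show up on arm but not on LP64 targets: -Wshift-count-overflow is emitted when a constant shift count is greater than or equal to the width of the (promoted) left operand, and 'long' is 32 bits wide on 32-bit arm but 64 bits wide on aarch64-linux. The Python below only models that rule; the table, the function name and the shift count of 32 are illustrative assumptions, not values taken from the testcase.

```python
# Bit widths of the relevant C types under two common data models.
TYPE_BITS = {
    "ilp32": {"long": 32, "long long": 64},  # e.g. 32-bit arm
    "lp64": {"long": 64, "long long": 64},   # e.g. aarch64-linux
}


def shift_count_overflows(count, type_name, data_model):
    """True if shifting a value of `type_name` left by `count` would warn."""
    return count >= TYPE_BITS[data_model][type_name]


# Hypothetical shift count of 32 applied to 'long' and to 'long long'.
for model in ("ilp32", "lp64"):
    for type_name in ("long", "long long"):
        print(model, type_name, shift_count_overflows(32, type_name, model))
```

Under this model only the ILP32 'long' case overflows, which is why switching the variable to 'long long' would silence the warning on arm.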

[Bug tree-optimization/95406] New: [11 regression] vshift-5.c fails since g:e31cd607e999ca6ab47b7e65a7045b1594e4fba4

2020-05-29 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95406

Bug ID: 95406
   Summary: [11 regression] vshift-5.c fails since
g:e31cd607e999ca6ab47b7e65a7045b1594e4fba4
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Since g:e31cd607e999ca6ab47b7e65a7045b1594e4fba4
I've noticed
gcc.dg/vshift-5.c (internal compiler error)
on aarch64

The logs say:
Excess errors:
during GIMPLE pass: slp
/gcc/testsuite/gcc.dg/vshift-5.c:8:1: internal compiler error: in
vect_create_constant_vectors, at tree-vect-slp.c:3674
0x107f8be vect_create_constant_vectors
/gcc/tree-vect-slp.c:3674
0x107f8be vect_schedule_slp_instance
/gcc/tree-vect-slp.c:4066
0x107ebc7 vect_schedule_slp_instance
/gcc/tree-vect-slp.c:4071
0x107ebc7 vect_schedule_slp_instance
/gcc/tree-vect-slp.c:4071
0x1086a44 vect_schedule_slp(vec_info*)
/gcc/tree-vect-slp.c:4303
0x108a07e vect_slp_bb_region
/gcc/tree-vect-slp.c:3342
0x108b0f7 vect_slp_bb(basic_block_def*)
/gcc/tree-vect-slp.c:3465
0x108c24c execute
/gcc/tree-vectorizer.c:1320

[Bug fortran/95373] [9/10/11 Regression] ICE in build_reference_type, at tree.c:7942

2020-05-29 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95373

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #7 from Christophe Lyon  ---
(In reply to CVS Commits from comment #5)
> The master branch has been updated by Harald Anlauf :
> 
> https://gcc.gnu.org/g:5c715e6a2990cfb6c15acc1ee14219523534ec69
> 
> commit r11-705-g5c715e6a2990cfb6c15acc1ee14219523534ec69
> Author: Harald Anlauf 
> Date:   Thu May 28 22:28:08 2020 +0200
> 
> PR fortran/95373 - ICE in build_reference_type, at tree.c:7942
> 
> The use of KIND, LEN, RE, and IM inquiry references for applicable
> intrinsic
> types is valid only for sufficiently new Fortran standards.  Add
> appropriate
> check.
> 
> 2020-05-28  Harald Anlauf  
> 
> gcc/fortran/
> PR fortran/95373
> * primary.c (is_inquiry_ref): Check validity of inquiry
> references against selected Fortran standard.
> 
> gcc/testsuite/
> PR fortran/95373
> * gfortran.dg/pr95373_1.f90: New test.
> * gfortran.dg/pr95373_2.f90: New test.



This causes regressions on arm and aarch64:
FAIL: gfortran.dg/inquiry_type_ref_2.f90   -O   (test for errors, line 13)
FAIL: gfortran.dg/inquiry_type_ref_2.f90   -O   (test for errors, line 14)
FAIL: gfortran.dg/inquiry_type_ref_2.f90   -O   (test for errors, line 15)
FAIL: gfortran.dg/inquiry_type_ref_2.f90   -O   (test for errors, line 16)
FAIL: gfortran.dg/inquiry_type_ref_2.f90   -O  (test for excess errors)
Excess errors:
/gcc/testsuite/gfortran.dg/inquiry_type_ref_2.f90:13:6: Error: Unexpected '%'
for nonderived-type variable 'a' at (1)
/gcc/testsuite/gfortran.dg/inquiry_type_ref_2.f90:14:10: Error: Unexpected '%'
for nonderived-type variable 'a' at (1)
/gcc/testsuite/gfortran.dg/inquiry_type_ref_2.f90:15:15: Error: Unexpected '%'
for nonderived-type variable 'z' at (1)
/gcc/testsuite/gfortran.dg/inquiry_type_ref_2.f90:16:15: Error: Unexpected '%'
for nonderived-type variable 'z' at (1)

[Bug target/95421] [AArch64] Missing NEON functions documented on ARM's web site

2020-06-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95421

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #2 from Christophe Lyon  ---
See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70369

[Bug target/95399] [ARM] 32/64-bit vcvtnq_* functions are missing

2020-06-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95399

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #5 from Christophe Lyon  ---
See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71233
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70369

[Bug ipa/95600] New: [11 regression] tree-prof/indir-call-prof-2.c fails on armeb-linux-gnueabihf since r11-830

2020-06-09 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95600

Bug ID: 95600
   Summary: [11 regression] tree-prof/indir-call-prof-2.c fails on
armeb-linux-gnueabihf since r11-830
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Hi,

Since r11-830 (g:85bce484d37fdda9c7eadb9bdcdb1ded891462bb), I've noticed
regressions on armeb-linux-gnueabihf:
gcc.dg/tree-prof/indir-call-prof-2.c compilation,  -fprofile-use -D_PROFILE_USE
gcc.dg/tree-prof/indir-call-prof.c compilation,  -fprofile-use -D_PROFILE_USE
gcc.dg/tree-prof/pr59003.c compilation,  -fprofile-use -D_PROFILE_USE
gcc.dg/tree-prof/val-prof-2.c scan-ipa-dump profile "Transformation done:
div/mod by constant 256"
gcc.dg/tree-prof/val-prof-6.c compilation,  -fprofile-use -D_PROFILE_USE

gcc.log says:
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c: In function 'sub1':
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c:10:1: warning: profile for
function 'sub1' not found in profile data [-Wmissing-profile]
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c: In function 'add1':
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c:4:1: warning: profile for
function 'add1' not found in profile data [-Wmissing-profile]
FAIL: gcc.dg/tree-prof/indir-call-prof-2.c compilation,  -fprofile-use
-D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/indir-call-prof-2.c

/gcc/testsuite/gcc.dg/tree-prof/pr59003.c: In function 'foo':
/gcc/testsuite/gcc.dg/tree-prof/pr59003.c:9:10: error: corrupted value profile:
stringops profile counter (28246016 out of 11) inconsistent with
basic-block count (11)
compiler exited with status 1
FAIL: gcc.dg/tree-prof/pr59003.c compilation,  -fprofile-use -D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/pr59003.c execution,    -fprofile-use
-D_PROFILE_USE


/gcc/testsuite/gcc.dg/tree-prof/val-prof-6.c: In function 't':
/gcc/testsuite/gcc.dg/tree-prof/val-prof-6.c:8:3: error: corrupted value
profile: stringops profile counter (28246016 out of 1000) inconsistent with
basic-block count (1000)
compiler exited with status 1
FAIL: gcc.dg/tree-prof/val-prof-6.c compilation,  -fprofile-use -D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/val-prof-6.c execution,    -fprofile-use
-D_PROFILE_USE


/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c: In function 'setp':
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c:17:6: warning: profile for
function 'setp' not found in profile data [-Wmissing-profile]
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c: In function 'a2':
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c:8:12: warning: profile for
function 'a2' not found in profile data [-Wmissing-profile]
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c: In function 'a1':
/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c:3:12: warning: profile for
function 'a1' not found in profile data [-Wmissing-profile]
FAIL: gcc.dg/tree-prof/indir-call-prof.c compilation,  -fprofile-use
-D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/indir-call-prof.c execution,    -fprofile-use
-D_PROFILE_USE
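
When comparing gcc.log files between revisions it can help to tally the DejaGnu result lines. The following is a minimal sketch (Python, not part of the GCC testsuite); tally_results is a hypothetical helper, and the sample is abbreviated from the log above with the wrapped lines joined.

```python
from collections import Counter

# Result prefixes that DejaGnu prints at the start of a result line.
RESULT_KINDS = ("PASS", "FAIL", "XFAIL", "XPASS", "UNRESOLVED", "UNSUPPORTED")


def tally_results(log_text):
    """Count result lines of the form 'KIND: test name' in a log."""
    counts = Counter()
    for line in log_text.splitlines():
        kind, sep, _rest = line.partition(": ")
        if sep and kind in RESULT_KINDS:
            counts[kind] += 1
    return counts


SAMPLE = """\
/gcc/testsuite/gcc.dg/tree-prof/pr59003.c: In function 'foo':
compiler exited with status 1
FAIL: gcc.dg/tree-prof/pr59003.c compilation,  -fprofile-use -D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/pr59003.c execution,    -fprofile-use -D_PROFILE_USE
"""

counts = tally_results(SAMPLE)
print(counts["FAIL"], counts["UNRESOLVED"])
```

Diagnostic lines that happen to contain ": " (such as the "In function" line) are ignored because their prefix is not one of the known result kinds.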

[Bug tree-optimization/95633] New: [11 regression] ICEs since r11-1143-gb05d5563f4be13b4a0d0951375a82adf483973c0

2020-06-11 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95633

Bug ID: 95633
   Summary: [11 regression] ICEs since
r11-1143-gb05d5563f4be13b4a0d0951375a82adf483973c0
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

I've noticed regressions since
r11-1143-gb05d5563f4be13b4a0d0951375a82adf483973c0:

on aarch64:
FAIL: gcc.target/aarch64/sve/clastb_5.c -march=armv8.2-a+sve (internal compiler
error)
FAIL: gcc.target/aarch64/sve/clastb_5.c -march=armv8.2-a+sve (test for excess
errors)
Excess errors:
during GIMPLE pass: vect
dump file: clastb_5.c.163t.vect
/gcc/testsuite/gcc.target/aarch64/sve/clastb_2.c:15:1: internal compiler error:
in operator[], at vec.h:867
0x649e5a vec::operator[](unsigned int)
/gcc/vec.h:867
0x649e5a vec::operator[](unsigned int)
/gcc/vec.h:1433
0x104773d vec::operator[](unsigned int)
/gcc/vec.h:998
0x104773d vectorizable_condition
/gcc/tree-vect-stmts.c:9986
0x105df8e vect_transform_stmt(vec_info*, _stmt_vec_info*,
gimple_stmt_iterator*, _slp_tree*, _slp_instance*)
/gcc/tree-vect-stmts.c:10735
0x1060fa7 vect_transform_loop_stmt
/gcc/tree-vect-loop.c:8310
0x1078f61 vect_transform_loop(_loop_vec_info*, gimple*)
/gcc/tree-vect-loop.c:8711
0x109eccc try_vectorize_loop_1
/gcc/tree-vectorizer.c:991
0x109f7a9 vectorize_loops()
/gcc/tree-vectorizer.c:1128




on arm-none-linux-gnueabihf --with-cpu cortex-a9 --with-fpu neon-fp1:
FAIL: gcc.dg/pr86179.c (internal compiler error)
FAIL: gcc.dg/pr86179.c (test for excess errors)
Excess errors:
during GIMPLE pass: vect
/gcc/testsuite/gcc.dg/pr86179.c:7:6: internal compiler error: in operator[], at
vec.h:867
0xfba61e vec::operator[](unsigned int)
/gcc/vec.h:867
0xfba61e vec::operator[](unsigned int)
/gcc/vec.h:1433
0xfba61e vect_create_vectorized_promotion_stmts
/gcc/tree-vect-stmts.c:4466
0xfba61e vectorizable_conversion
/gcc/tree-vect-stmts.c:4934
0xfd9906 vect_transform_stmt(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*,
_slp_tree*, _slp_instance*)
/gcc/tree-vect-stmts.c:10680
0x100912d vect_schedule_slp_instance
/gcc/tree-vect-slp.c:4052
0x100900f vect_schedule_slp_instance
/gcc/tree-vect-slp.c:3953
0x100900f vect_schedule_slp_instance
/gcc/tree-vect-slp.c:3953
0x100900f vect_schedule_slp_instance
/gcc/tree-vect-slp.c:3953
0x100900f vect_schedule_slp_instance
/gcc/tree-vect-slp.c:3953
0x100900f vect_schedule_slp_instance
/gcc/tree-vect-slp.c:3953
0x1010774 vect_schedule_slp(vec_info*)
/gcc/tree-vect-slp.c:4167
0xff34d2 vect_transform_loop(_loop_vec_info*, gimple*)
/gcc/tree-vect-loop.c:8623
0x10188aa try_vectorize_loop_1
/gcc/tree-vectorizer.c:991
0x1019349 vectorize_loops()
/gcc/tree-vectorizer.c:1128

[Bug tree-optimization/95633] [11 regression] ICEs since r11-1143-gb05d5563f4be13b4a0d0951375a82adf483973c0

2020-06-12 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95633

--- Comment #6 from Christophe Lyon  ---
(In reply to Richard Biener from comment #3)
> I cannot reproduce the arm failure, neon-fp1 doesn't seem to exist and any
> combo of -mcpu=cortex-a9 and -mfpu=... does not ICE for me.

Sorry, that was a cut & paste error: did you try neon-fp16 ?

[Bug testsuite/95706] New test case gfortran.dg/pr95690.f90 fails

2020-06-18 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95706

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #4 from Christophe Lyon  ---
Seen on arm and aarch64 too.

[Bug tree-optimization/95745] New: [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-18 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

Bug ID: 95745
   Summary: [11 regression] O3-pr85794.c fails since
r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

Since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f I have noticed that 
O3-pr85794.c fails on arm:
FAIL: gcc.dg/vect/O3-pr85794.c (internal compiler error)
FAIL: gcc.dg/vect/O3-pr85794.c (test for excess errors)
Excess errors:
during RTL pass: expand
/gcc/testsuite/gcc.dg/vect/O3-pr85794.c:7:1: internal compiler error: in
do_store_flag, at expr.c:12247
0x8fe346 do_store_flag
/gcc/expr.c:12247
0x8ff3c1 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/gcc/expr.c:9610
0x7bd97a expand_gimple_stmt_1
/gcc/cfgexpand.c:3787
0x7bd97a expand_gimple_stmt
/gcc/cfgexpand.c:3847
0x7bfadd expand_gimple_basic_block
/gcc/cfgexpand.c:5888
0x7c1c50 execute
/gcc/cfgexpand.c:6572


Many other ICEs appeared between r11-1409 and r11-1457 which are probably
caused by the same commit:
gcc.dg/vect/O3-pr85794.c (internal compiler error)
gcc.dg/vect/bb-slp-43.c (internal compiler error)
gcc.dg/vect/bb-slp-43.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/bb-slp-cond-1.c (internal compiler error)
gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects (internal compiler
error)
gcc.dg/vect/bb-slp-pattern-2.c (internal compiler error)
gcc.dg/vect/bb-slp-pattern-2.c -flto -ffat-lto-objects (internal compiler
error)
gcc.dg/vect/bb-slp-pr92596.c (internal compiler error)
gcc.dg/vect/bb-slp-pr92596.c -flto -ffat-lto-objects (internal compiler
error)
gcc.dg/vect/pr18308.c (internal compiler error)
gcc.dg/vect/pr18308.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr24059.c (internal compiler error)
gcc.dg/vect/pr24059.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr51000.c (internal compiler error)
gcc.dg/vect/pr51000.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr51581-3.c (internal compiler error)
gcc.dg/vect/pr51581-3.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr51581-4.c (internal compiler error)
gcc.dg/vect/pr51581-4.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr56625.c (internal compiler error)
gcc.dg/vect/pr56625.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr59519-2.c (internal compiler error)
gcc.dg/vect/pr59519-2.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr59591-1.c (internal compiler error)
gcc.dg/vect/pr59591-1.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr62075.c (internal compiler error)
gcc.dg/vect/pr62075.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr63605.c (internal compiler error)
gcc.dg/vect/pr63605.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-1.c (internal compiler error)
gcc.dg/vect/pr65947-1.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-12.c (internal compiler error)
gcc.dg/vect/pr65947-12.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-13.c (internal compiler error)
gcc.dg/vect/pr65947-13.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-14.c (internal compiler error)
gcc.dg/vect/pr65947-14.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-2.c (internal compiler error)
gcc.dg/vect/pr65947-2.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-3.c (internal compiler error)
gcc.dg/vect/pr65947-3.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-4.c (internal compiler error)
gcc.dg/vect/pr65947-4.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr65947-6.c (internal compiler error)
gcc.dg/vect/pr65947-6.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr68305.c (internal compiler error)
gcc.dg/vect/pr68305.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr69820.c (internal compiler error)
gcc.dg/vect/pr69820.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr71259.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr72866.c (internal compiler error)
gcc.dg/vect/pr72866.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr80631-1.c (internal compiler error)
gcc.dg/vect/pr80631-1.c -flto -ffat-lto-objects (internal compiler error)
gcc.dg/vect/pr80631-2.c (internal compiler error)
gcc.dg/vect

[Bug tree-optimization/95745] [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-19 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

--- Comment #3 from Christophe Lyon  ---
I still see it with r11-1521-gaae80e833d2826fc0afe7ff1704d2ab0f4607c5a

[Bug middle-end/95757] [11 regression] missing warning in gcc.dg/Wstringop-overflow-25.c since r11-1517

2020-06-19 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95757

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|powerpc64*-linux-gnu        |powerpc64*-linux-gnu arm
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #1 from Christophe Lyon  ---
I see the same thing on some arm targets:
arm-none-linux-gnueabihf --with-cpu=cortex-a5
arm-none-eabi -mcpu=cortex-m[034]

but for instance arm-none-linux-gnueabihf --with-cpu=cortex-a9 works.

[Bug tree-optimization/95745] [11 regression] O3-pr85794.c fails since r11-1445-g502d63b6d6141597bb18fd23c87736a1b384cf8f

2020-06-19 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95745

--- Comment #6 from Christophe Lyon  ---
(In reply to Martin Liška from comment #4)
> Ok, can I test it with a x86_64-linux-gnu cross compiler?

Yes, that's what I am using.

Target: arm-none-linux-gnueabi
Configured with: /configure --target=arm-none-linux-gnueabi
--prefix=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools
--with-sysroot=/aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-arm-none-linux-gnueabi
--disable-nls --disable-libgomp --disable-libmudflap --disable-libcilkrts
--enable-checking --enable-languages=c,c++,fortran --with-float=soft
--enable-build-with-cxx --with-mode=arm --with-cpu=cortex-a9


> Can you please provide exact command line for some of the problematic
> test-cases?

/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/xgcc
-B/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/
/gcc/testsuite/gcc.dg/vect/O3-pr85794.c -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-fdiagnostics-urls=never -mfloat-abi=softfp -ffast-math -ftree-vectorize
-fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2
-fdump-tree-vect-details -O3 -fno-ipa-cp-clone -S -o O3-pr85794.s
during RTL pass: expand
/gcc/testsuite/gcc.dg/vect/O3-pr85794.c: In function 'foo':
/gcc/testsuite/gcc.dg/vect/O3-pr85794.c:7:1: internal compiler error: in
do_store_flag, at expr.c:12247

[Bug fortran/95858] [11 Regression] gcc/testsuite/gfortran.fortran-torture/execute/forall_5.f90 fails since r11-1595-gabcde0a658e17dbb

2020-06-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95858

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #1 from Christophe Lyon  ---
Seen on arm and aarch64 too.

[Bug testsuite/95720] [11 Regression] New dump output filename strategy invalidates tests

2020-06-25 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95720

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #8 from Christophe Lyon  ---
(In reply to Alexandre Oliva from comment #5)
> that's because of the second input gcc_tg.o
> 
> can you tell where that comes from?

I guess that's the "testglue" object file added by Dejagnu when
needs_status_wrapper is set in the .exp file.

[Bug fortran/95893] New: pr95690.f90 fails

2020-06-25 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95893

Bug ID: 95893
   Summary: pr95690.f90 fails
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

The new testcase pr95690.f90 fails on arm and aarch64 (and powerpc, s390
according to gcc-testresults).
compiler exited with status 1
FAIL: gfortran.dg/pr95690.f90   -O   (test for errors, line 5)
FAIL: gfortran.dg/pr95690.f90   -O  (test for excess errors)
Excess errors:
/gcc/testsuite/gfortran.dg/pr95690.f90:6:0: Error: initializer for floating
value is not a floating constant

[Bug tree-optimization/95896] New: [11 regression] ICE in mask_load_slp_1 since r11-1621-gd32708e796504eaeaad7d19990909204d74f9ba3

2020-06-25 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95896

Bug ID: 95896
   Summary: [11 regression] ICE in mask_load_slp_1 since
r11-1621-gd32708e796504eaeaad7d19990909204d74f9ba3
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

Since r11-1621-gd32708e796504eaeaad7d19990909204d74f9ba3
I have noticed:
FAIL: gcc.target/aarch64/sve/mask_load_slp_1.c -march=armv8.2-a+sve (internal
compiler error)
FAIL: gcc.target/aarch64/sve/mask_load_slp_1.c -march=armv8.2-a+sve (test for
excess errors)
Excess errors:
/gcc/testsuite/gcc.target/aarch64/sve/mask_load_slp_1.c:8:1: error: definition
in block 3 follows the use
for SSA_NAME: mask_patt_58.126_186 in statement:
vec_mask_and_190 = mask_patt_58.126_186 & loop_mask_189;
during GIMPLE pass: vect
/gcc/testsuite/gcc.target/aarch64/sve/mask_load_slp_1.c:8:1: internal compiler
error: verify_ssa failed
0x10144f3 verify_ssa(bool, bool)
/gcc/tree-ssa.c:1208
0xc8e653 execute_function_todo
/gcc/passes.c:1992
0xc8ef35 execute_todo
/gcc/passes.c:2039

[Bug testsuite/95900] [11 Regression] New test case gcc.dg/vect/bb-slp-pr95866.c in r11-1647 fails

2020-06-26 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95900

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|powerpc64*-linux-gnu        |powerpc64*-linux-gnu
                   |                            |arm*-linux-gnueabihf
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon  ---
I see it on arm-none-linux-gnueabihf too
(--with-cpu cortex-a9 --with-fpu neon-fp16 for instance)

[Bug target/94743] IRQ handler doesn't save scratch VFP registers

2020-06-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94743

--- Comment #22 from Christophe Lyon  ---
Not sure if we can close this PR: I have only implemented a part of what we
discussed here. GCC now emits a warning so the user can take action to make
sure his code is correct/correctly generated, but GCC does not handle
saving/restoring all of the FP registers automatically.

[Bug middle-end/96136] New: [11 regression] ICE in reduce_to_bit_field_precision

2020-07-09 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96136

Bug ID: 96136
   Summary: [11 regression] ICE in reduce_to_bit_field_precision
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

Since r11-1914-g760df6d296b8fc59796f42dca5eb14012fbfa28b, I've noticed an ICE
while building glibc-2.29 when GCC is configured --target
arm-none-linux-gnueabihf  --with-cpu cortex-a15 --with-mode thumb --with-fpu
neon-vfpv4

$ arm-none-linux-gnueabihf-gcc -c -O2 iso646.i
during RTL pass: expand
In file included from iso646.c:901:
../iconv/skeleton.c: In function 'gconv':
../iconv/skeleton.c:390:1: internal compiler error: in
reduce_to_bit_field_precision, at expr.c:11530
  390 | FUNCTION_NAME (struct __gconv_step *step, struct __gconv_step_data
*data,
  | ^
0x92fdc3 reduce_to_bit_field_precision
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/expr.c:11530
0x9397b9 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/expr.c:9276
0x9254c0 expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/expr.c:10152
0x92fbbe expand_expr
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/expr.h:282
0x92fbbe expand_operands(tree_node*, tree_node*, rtx_def*, rtx_def**,
rtx_def**, expand_modifier)
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/expr.c:8065
0x939972 expand_cond_expr_using_cmove
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/expr.c:8519
0x939972 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/expr.c:9869
0x7ead60 expand_gimple_stmt_1
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/cfgexpand.c:3787
0x7ead60 expand_gimple_stmt
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/cfgexpand.c:3847
0x7f20cb expand_gimple_basic_block
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/cfgexpand.c:5888
0x7f47bb execute
/home/christophe.lyon/src/GCC/sources/gcc-fsf-git/trunk/gcc/cfgexpand.c:6572

[Bug testsuite/96109] gcc.dg/vect/slp-47.c etc. FAIL

2020-07-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96109

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clyon at gcc dot gnu.org
             Target|sparc-sun-solaris2.11,      |sparc-sun-solaris2.11,
                   |arm*-*-*,                   |arm*-*-*,
                   |ia64-suse-linux-gnu         |ia64-suse-linux-gnu,
                   |                            |aarch64*elf

--- Comment #4 from Christophe Lyon  ---
Also seen on aarch64-elf (aarch64-linux-gnu is OK)

[Bug testsuite/96149] New: gcc.dg/vect/slp-46.c on aarch64

2020-07-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96149

Bug ID: 96149
   Summary: gcc.dg/vect/slp-46.c on aarch64
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

gcc.dg/vect/slp-46.c fails on aarch64 since it was introduced.

In the logs I can see:
PASS: gcc.dg/vect/slp-46.c execution test
gcc.dg/vect/slp-46.c: pattern found 0 times
FAIL: gcc.dg/vect/slp-46.c scan-tree-dump-times vect "vectorizing stmts using
SLP" 2
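
To illustrate what the failing directive checks: scan-tree-dump-times passes only when the pattern occurs exactly the expected number of times in the dump file. The following is a rough Python model of that check, not the DejaGnu implementation (which uses Tcl regexps); scan_dump_times is a hypothetical name and the dump text is invented for the example.

```python
import re


def scan_dump_times(dump_text, pattern, expected):
    """Return (status, found) for a scan-tree-dump-times style check."""
    found = len(re.findall(pattern, dump_text))
    status = "PASS" if found == expected else "FAIL"
    return status, found


# Invented dump contents; a real .vect dump is much longer.
DUMP = """\
note: vectorizing stmts using SLP
note: some other message
note: vectorizing stmts using SLP
"""

print(scan_dump_times(DUMP, r"vectorizing stmts using SLP", 2))
print(scan_dump_times("", r"vectorizing stmts using SLP", 2))
```

The second call models the situation in the log above: the pattern is found 0 times, so a check expecting 2 occurrences fails even though the execution test passes.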

[Bug middle-end/96136] [11 regression] ICE in reduce_to_bit_field_precision

2020-07-12 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96136

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Christophe Lyon  ---
This was fixed by the fix for PR96151, thanks.

[Bug target/96372] New: [11 regression] arm/ivopts.c fails since r11-2012

2020-07-29 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96372

Bug ID: 96372
   Summary: [11 regression] arm/ivopts.c fails since r11-2012
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Since r11-2012-gd2ed233cb940aa3eecc163d98b47979dd81dbc0a, I've noticed that
FAIL: gcc.target/arm/ivopts.c object-size text <= 20

depending on how GCC is configured.

For instance:
* target arm-none-eabi with target-board=-mcpu=cortex-a7/-mfloat-abi=hard
or
* target arm-none-linux-gnueabi --with-mode arm --with-cpu cortex-a9

The log says:
spawn -ignore SIGHUP arm-none-linux-gnueabi-size ivopts.o
   text    data     bss     dec     hex filename
     32       0       0      32      20 ivopts.o
text size is 32
FAIL: gcc.target/arm/ivopts.c object-size text <= 20
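
As an illustration of the check that fails here, the sketch below (Python, not the DejaGnu code) reads the text column from Berkeley-style 'size' output and compares it with the decimal limit from the directive; text_size and object_size_ok are hypothetical names.

```python
def text_size(size_output):
    """Return the 'text' column (decimal bytes) from Berkeley 'size' output."""
    header, values = size_output.strip().splitlines()[:2]
    columns = header.split()
    fields = values.split()
    return int(fields[columns.index("text")])


def object_size_ok(size_output, limit):
    """Model of an 'object-size text <= limit' check."""
    return text_size(size_output) <= limit


SIZE_OUTPUT = """\
   text    data     bss     dec     hex filename
     32       0       0      32      20 ivopts.o
"""

print(text_size(SIZE_OUTPUT))
print(object_size_ok(SIZE_OUTPUT, 20))
```

Note that the 20 in the hex column is just 32 written in hexadecimal; the limit in the directive is decimal 20, so a 32-byte text section exceeds it.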

[Bug target/96375] New: [11 regression] arm/lob[2-5].c fail on some configurations

2020-07-29 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96375

Bug ID: 96375
   Summary: [11 regression] arm/lob[2-5].c fail on some
configurations
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

Hi,

Since these new tests were introduced, I've noticed that they fail on some
configurations.

For instance, with target arm-none-linux-gnueabi --with-mode arm --with-cpu
cortex-a9:
spawn -ignore SIGHUP
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/xgcc
-B/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/gcc/
/gcc/testsuite/gcc.target/arm/lob2.c -fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-fdiagnostics-urls=never -march=armv8.1-m.main -O3 --save-temps
-ffat-lto-objects -fno-ident -S -o lob2.s
cc1: error: target CPU does not support ARM mode
compiler exited with status 1
FAIL: gcc.target/arm/lob2.c (test for excess errors)
Excess errors:
cc1: error: target CPU does not support ARM mode

gcc.target/arm/lob2.c: output file does not exist

The current dg-skip-if is not sufficient.

Note that lob1.c is UNSUPPORTED in this case, because arm_v8_1_lob_hw_available
fails to compile:
cc1: error: target CPU does not support ARM mode

You probably want to make sure that -mthumb is also used when compiling these
tests.

Sadly, all these new tests are skipped in all my arm-eabi configurations
because I always override -mcpu :-(

[Bug tree-optimization/96376] New: [11 regression] vect/vect-alias-check.c and vect/vect-live-5.c fail on armeb

2020-07-29 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96376

Bug ID: 96376
   Summary: [11 regression] vect/vect-alias-check.c and
vect/vect-live-5.c fail on armeb
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

I've noticed regressions on target armeb-none-linux-gnueabihf --with-mode arm
--with-cpu cortex-a9 --with-fpu neon-fp16:
gcc.dg/vect/vect-alias-check.c -flto -ffat-lto-objects  scan-tree-dump-times
vect "vectorized 1 loops" 1
gcc.dg/vect/vect-alias-check.c scan-tree-dump-times vect "vectorized 1 loops" 1
gcc.dg/vect/vect-live-5.c -flto -ffat-lto-objects  scan-tree-dump-times vect
"vectorized 1 loops" 1
gcc.dg/vect/vect-live-5.c scan-tree-dump-times vect "vectorized 1 loops" 1

In my logs I can see:
PASS: gcc.dg/vect/vect-live-5.c execution test
gcc.dg/vect/vect-live-5.c: pattern found 0 times
FAIL: gcc.dg/vect/vect-live-5.c scan-tree-dump-times vect "vectorized 1 loops"
1

PASS: gcc.dg/vect/vect-alias-check.c (test for excess errors)
gcc.dg/vect/vect-alias-check.c: pattern found 0 times
FAIL: gcc.dg/vect/vect-alias-check.c scan-tree-dump-times vect "vectorized 1
loops" 1

This appeared between r11-1908 and r11-1952.

[Bug tree-optimization/96376] [11 regression] vect/vect-alias-check.c and vect/vect-live-5.c fail on armeb

2020-07-30 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96376

--- Comment #1 from Christophe Lyon  ---
Bisect identified commit g30fdaead5b7880c4e9f140618e26ad1c545642d5

[Bug target/96375] [11 regression] arm/lob[2-5].c fail on some configurations

2020-08-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96375

--- Comment #2 from Christophe Lyon  ---
(In reply to akrl from comment #1)
> Created attachment 48968 [details]
> pr96375 lob tests patch
> 
> Hi Christophe,
> 
> The following patch does the job for me.  Would you double check is
> effective for you too?
> 
> Thanks
>   Andrea

Hi,

It does fix the FAIL, thanks.
I suspect you also want to add -mthumb to
check_effective_target_arm_v8_1_lob_ok for consistency.

In practice, how do you exercise these tests given that with the arm-eabi
configurations I'm testing they are unsupported because I override -mcpu?

[Bug ipa/96431] New: [11 regression] ipa-clone-2.c fails since r13cdbb6a97c3d853cd380e5a03be8e0d35966c1e

2020-08-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96431

Bug ID: 96431
   Summary: [11 regression] ipa-clone-2.c fails since
r13cdbb6a97c3d853cd380e5a03be8e0d35966c1e
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ipa
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Hi,

Since r13cdbb6a97c3d853cd380e5a03be8e0d35966c1e, I've noticed that
FAIL: gcc.dg/ipa/ipa-clone-2.c scan-ipa-dump-times cp "Creating a specialized
node of recur_fn/[0-9]*\\." 12

Seen on arm and aarch64, but also on several other targets according to
gcc-testresults.

Occurs since:
commit 13cdbb6a97c3d853cd380e5a03be8e0d35966c1e
Author: Jan Hubicka 
Date:   Sat Aug 1 17:02:24 2020 +0200

Cap frequency of recursive calls by 90%

* predict.c (estimate_bb_frequencies): Cap recursive calls by 90%.

[Bug target/96375] [11 regression] arm/lob[2-5].c fail on some configurations

2020-08-03 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96375

--- Comment #4 from Christophe Lyon  ---
(In reply to Andrea Corallo from comment #3)
> "clyon at gcc dot gnu.org"  writes:
> > Hi,
> 
> Hi,
> 
> > It does fix the FAIL, thanks.
> 
> Thanks for testing it!
> 
> > I suspect you also want to add -mthumb to
> > check_effective_target_arm_v8_1_lob_ok for consistency.
> 
> isn't the patch already doing this?

Hmmm right, I probably missed that part when testing manually, sorry for the
noise.


> 
> > In practice, how do you exercise these tests given that with the arm-eabi
> > configurations I'm testing they are unsupported because I override -mcpu?
> 
> Not sure I understand, what is the problem if the test is not supported
> with a specific -mcpu?
> 

Just that I have no config where lob[16].c are supported because when running
arm-none-eabi tests I override -mcpu. When I don't override -mcpu, I don't
enable multilibs, which in turn causes some effective-target checks to fail.
Hence I'm wondering what toolchain settings you are using to run these new
tests?

[Bug testsuite/96519] [11 regression] new test case gcc.dg/ia64-sync-5.c fails

2020-08-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96519

Christophe Lyon  changed:

   What|Removed |Added

 Target|powerpc64*-linux-gnu|powerpc64*-linux-gnu
   ||aarch64 arm
 CC||clyon at gcc dot gnu.org

--- Comment #1 from Christophe Lyon  ---
Seen also on aarch64 and arm

[Bug libstdc++/94681] filesystem::sysmlink_status using stat instead of lstat when --disable-libstdcxx-filesystem-ts

2020-08-10 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94681

Christophe Lyon  changed:

   What|Removed |Added

 CC||clyon at gcc dot gnu.org

--- Comment #5 from Christophe Lyon  ---
The commit r11-2633 broke the build of libstdc++ on aarch64-none-elf. My build
logs say:
/tmp/7968837_9.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/src/c++17/fs_ops.cc:
In function 'std::filesystem::__cxx11::path std::filesystem::read_symlink(const
std::filesystem::__cxx11::path&, std::error_code&)':
/tmp/7968837_9.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/src/c++17/fs_ops.cc:1178:9:
error: '::lstat' has not been declared; did you mean
'std::filesystem::__gnu_posix::lstat'?
 1178 |   if (::lstat(p.c_str(), &st))
  | ^
  | std::filesystem::__gnu_posix::lstat
In file included from
/tmp/7968837_9.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/src/c++17/fs_ops.cc:58:
/tmp/7968837_9.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/src/c++17/../filesystem/ops-common.h:131:14:
note: 'std::filesystem::__gnu_posix::lstat' declared here
  131 |   inline int lstat(const char* path, stat_type* buffer)
  |  ^
make[5]: *** [Makefile:572: fs_ops.lo] Error 1
make[5]: Leaving directory
'/tmp/7968837_9.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64_be-none-elf/gcc3/aarch64_be-none-elf/libstdc++-v3/src/c++17'
make[4]: *** [Makefile:732: all-recursive] Error 1

[Bug target/94531] gcc.target/arm/its.c fails for cortex-m3

2020-08-20 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94531

--- Comment #1 from Christophe Lyon  ---
(In reply to Christophe Lyon from comment #0)
> I've noticed that gcc.target/arm/its.c fails when targetting
> cortex-m3 or m33, but that's probably true with all cortex-m versions.
> 
Since I have extended testing to more CPUs, I've noticed that the test fails
for M3, M4 and M33, but passes for M7 (which has '1' as max_insns_skipped
tuning, while other v7m CPUs have '2').

If these are the expected results, that's not easy to describe in the testcase
:-)

[Bug target/96767] New: -mpure-code produces indirect loads for thumb-1

2020-08-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96767

Bug ID: 96767
   Summary: -mpure-code produces indirect loads for thumb-1
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

As described in PR94538, -mpure-code produces suboptimal code for thumb-1 CPUs.

int x;
int f1 (void) { return x; }
Compiled with -O2 -mpure-code,
-mcpu=cortex-m0:
        movs    r3, #:upper8_15:#.LC0
        lsls    r3, #8
        adds    r3, #:upper0_7:#.LC0
        lsls    r3, #8
        adds    r3, #:lower8_15:#.LC0
        lsls    r3, #8
        adds    r3, #:lower0_7:#.LC0
        @ sp needed
        ldr     r3, [r3]
        ldr     r0, [r3]
        bx      lr
-> extra indirection, there should be only one ldr

For reference, -mcpu=cortex-m[347] and m23 produce:
        movw    r3, #:lower16:.LANCHOR0
        movt    r3, #:upper16:.LANCHOR0
        ldr     r0, [r3]
        bx      lr
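
To make the extra indirection concrete, here is a small C model of the two
cortex-m0 variants. It is only a sketch: the toy memory layout, the addresses
and the helper names (materialize_bytewise, load_word, set_x, f1_current,
f1_expected) are assumptions made for illustration, not GCC internals. The
byte-wise sequence rebuilds a 32-bit address from its four bytes, most
significant first; the current code builds the address of the .LC0 entry and
needs two loads, whereas building the address of x directly would need one.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild a 32-bit address one byte at a time, as the movs/lsls/adds
   sequence does with the upper8_15, upper0_7, lower8_15 and lower0_7
   relocation operators.  */
uint32_t materialize_bytewise(uint32_t addr)
{
    uint32_t r3 = (addr >> 24) & 0xff;   /* movs r3, #:upper8_15:sym */
    r3 <<= 8;                            /* lsls r3, #8 */
    r3 += (addr >> 16) & 0xff;           /* adds r3, #:upper0_7:sym */
    r3 <<= 8;                            /* lsls r3, #8 */
    r3 += (addr >> 8) & 0xff;            /* adds r3, #:lower8_15:sym */
    r3 <<= 8;                            /* lsls r3, #8 */
    r3 += addr & 0xff;                   /* adds r3, #:lower0_7:sym */
    assert(r3 == addr);                  /* the four bytes give back addr */
    return r3;
}

/* Toy memory (hypothetical layout): the word at byte address 0 plays the
   role of .LC0 and holds the address of x; the word at address 4 is x.  */
enum { ADDR_LC0 = 0, ADDR_X = 4 };
static uint32_t memory[2] = { ADDR_X, 0 };

uint32_t load_word(uint32_t addr) { return memory[addr / 4]; }   /* ldr */
void set_x(uint32_t value) { memory[ADDR_X / 4] = value; }

/* What the cortex-m0 output above does: two dependent loads.  */
uint32_t f1_current(void)
{
    uint32_t r3 = materialize_bytewise(ADDR_LC0);
    r3 = load_word(r3);                  /* ldr r3, [r3]: fetch &x */
    return load_word(r3);                /* ldr r0, [r3]: fetch x */
}

/* What the report asks for: build &x directly, then a single load.  */
uint32_t f1_expected(void)
{
    uint32_t r3 = materialize_bytewise(ADDR_X);
    return load_word(r3);                /* ldr r0, [r3] */
}
```

Both functions return the same value; the difference is only the number of
loads on the way there.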

[Bug target/96768] New: -mpure-code produces switch tables for thumb-1

2020-08-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96768

Bug ID: 96768
   Summary: -mpure-code produces switch tables for thumb-1
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

As discussed in PR94538, -mpure-code produces switch tables for thumb-1.

int f2 (int x, int y)
{
  switch (x)
  {
case 0: return y + 0;
case 1: return y + 1;
case 2: return y + 2;
case 3: return y + 3;
case 4: return y + 4;
case 5: return y + 5;
  }
  return y;
}

Compiled with -O2 -mpure-code,
-mcpu=cortex-m0:
f2:
        cmp     r0, #5
        bhi     .L9
        movs    r2, #:upper8_15:#.LC0
        lsls    r2, #8
        adds    r2, #:upper0_7:#.LC0
        lsls    r2, #8
        adds    r2, #:lower8_15:#.LC0
        lsls    r2, #8
        adds    r2, #:lower0_7:#.LC0
        ldr     r2, [r2]
        lsls    r0, r0, #2
        ldr     r3, [r2, r0]
        mov     pc, r3
        .section        .rodata
        .align  2
.L4:
        .word   .L9
        .word   .L8
        .word   .L7
        .word   .L6
        .word   .L5
        .word   .L3
        .section        .text,"0x2006",%progbits
.L3:
        adds    r0, r1, #5
.L1:
        @ sp needed
        bx      lr
.L8:
        adds    r0, r1, #1
        b       .L1
.L7:
        adds    r0, r1, #2
        b       .L1
.L6:
        adds    r0, r1, #3
        b       .L1
.L5:
        adds    r0, r1, #4
        b       .L1
.L9:
        movs    r0, r1
        b       .L1


For cortex-m23:
f2:
        cmp     r0, #5
        bhi     .L9
        movw    r2, #:lower16:.LC0
        movt    r2, #:upper16:.LC0
        ldr     r2, [r2]
        lsls    r0, r0, #2
        ldr     r3, [r2, r0]
        mov     pc, r3


For reference, for cortex-m3:
f2:
        cmp     r0, #3
        beq     .L2
        ble     .L11
        cmp     r0, #4
        beq     .L7
        cmp     r0, #5
        bne     .L9
        adds    r0, r1, #5
        bx      lr
.L11:
        cmp     r0, #1
        beq     .L4
        cmp     r0, #2
        bne     .L9
        adds    r0, r1, #2
        bx      lr
.L2:
        adds    r0, r1, #3
        bx      lr
.L7:
        adds    r0, r1, #4
        bx      lr
.L4:
        adds    r0, r1, #1
        bx      lr
.L9:
        mov     r0, r1
        bx      lr

[Bug target/96769] New: -mpure-code produces suboptimal code for immediate generation for thumb-1

2020-08-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96769

Bug ID: 96769
   Summary: -mpure-code produces suboptimal code for immediate
generation for thumb-1
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

As discussed in PR94538, -mpure-code produces suboptimal code for immediate
generation for thumb-1.

int f3 (void) { return 0x11000000; }
int f3_2 (void) { return 0x12345678; }

Compiled with -O2 -mpure-code,
-mcpu=cortex-m0:
f3:
        movs    r0, #17
        @ sp needed
        lsls    r0, r0, #8
        lsls    r0, r0, #8
        lsls    r0, r0, #8
        bx      lr
f3_2:
        movs    r0, #18
        @ sp needed
        lsls    r0, r0, #8
        adds    r0, r0, #52
        lsls    r0, r0, #8
        adds    r0, r0, #86
        lsls    r0, r0, #8
        adds    r0, r0, #120
        bx      lr

-mcpu=cortex-m23:
f3:
        movs    r0, #136
        @ sp needed
        lsls    r0, r0, #21
        bx      lr
f3_2:
        movw    r0, #22136
        @ sp needed
        movt    r0, 4660
        bx      lr

Code for cortex-m23 is OK, but the code for cortex-m0 could be improved for
f3(). For f3_2(), code for cortex-m0 looks OK since that CPU does not have
movw/movt instructions.
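
As a rough illustration of the possible improvement for f3(), the sketch below
counts Thumb-1 instructions for two ways of building a 32-bit constant with
movs/lsls/adds. It is only a model with made-up function names
(count_bytewise, count_merged_shifts), not what GCC implements. The first
function mimics the cortex-m0 output above; the second assumes a hypothetical
strategy that starts at the most significant non-zero byte and folds runs of
zero bytes into a single larger shift.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the sequence shown for cortex-m0: movs for the top byte, then
   an lsls #8 for each of the three remaining bytes, plus an adds only
   when that byte is non-zero.  */
unsigned count_bytewise(uint32_t value)
{
    unsigned count = 1;                       /* movs r0, #top_byte */
    for (int shift = 16; shift >= 0; shift -= 8) {
        count++;                              /* lsls r0, r0, #8 */
        if ((value >> shift) & 0xff)
            count++;                          /* adds r0, r0, #byte */
    }
    return count;
}

/* Hypothetical improvement: skip leading zero bytes and merge consecutive
   shifts.  The register value is emulated alongside the count so the
   model can check that the sequence really rebuilds the constant.  */
unsigned count_merged_shifts(uint32_t value)
{
    int shift = 24;
    while (shift > 0 && ((value >> shift) & 0xff) == 0)
        shift -= 8;                           /* skip leading zero bytes */

    uint32_t reg = (value >> shift) & 0xff;   /* movs r0, #first_byte */
    unsigned count = 1;
    int pending = 0;                          /* shift amount not yet emitted */

    while (shift > 0) {
        shift -= 8;
        pending += 8;
        uint32_t byte = (value >> shift) & 0xff;
        if (byte) {
            reg = (reg << pending) + byte;    /* lsls #pending; adds #byte */
            count += 2;
            pending = 0;
        }
    }
    if (pending) {
        reg <<= pending;                      /* final lsls #pending */
        count++;
    }
    assert(reg == value);
    return count;
}
```

Under this model, 0x11000000 takes 4 instructions byte by byte and 2 with
merged shifts (movs #17 followed by lsls #24), while 0x12345678 takes 7 either
way, which matches the observation that only f3() leaves room for improvement.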

[Bug target/96770] New: -mpure-code produces suboptimal code for relocations with small offset for thumb-1

2020-08-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96770

Bug ID: 96770
   Summary: -mpure-code produces suboptimal code for relocations
with small offset for thumb-1
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

As discussed in PR94538, -mpure-code produces suboptimal code for relocations
with small offset for thumb-1.

int arr[10];
int *f4 (void) { return &arr[1]; }

Compiled with -O2 -mpure-code,
-mcpu=cortex-m0:
f4:
        movs    r3, #:upper8_15:#.LC0
        lsls    r3, #8
        adds    r3, #:upper0_7:#.LC0
        lsls    r3, #8
        adds    r3, #:lower8_15:#.LC0
        lsls    r3, #8
        adds    r3, #:lower0_7:#.LC0
        @ sp needed
        ldr     r0, [r3]
        adds    r0, r0, #4
        bx      lr

We should avoid the extra load from the literal pool (related to PR96767), and
the 'adds r0,r0,4'.

-mcpu=cortex-m23:
f4:
        movw    r0, #:lower16:.LANCHOR0
        @ sp needed
        movt    r0, #:upper16:.LANCHOR0
        adds    r0, r0, #4
        bx      lr

For reference, -mcpu=cortex-m3 produces:
f4:
        movw    r0, #:lower16:.LANCHOR0+4
        movt    r0, #:upper16:.LANCHOR0+4
        bx      lr

We should generate the same code for cortex-m23.

[Bug target/94538] [9/10/11 Regression] ICE: in extract_constrain_insn_cached, at recog.c:2223 (insn does not satisfy its constraints) with -mcpu=cortex-m23 -mslow-flash-data

2020-08-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94538

--- Comment #21 from Christophe Lyon  ---
I filed PR96767, PR96768, PR96769, PR96770 to track the enhancements discussed
here.

The ICE is now fixed in trunk.

[Bug middle-end/96771] New: arm/pr32920-2.c fails since svn r228175 / f11a7b6d57f6fcba1bf2e5a0403dc49120195320

2020-08-24 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96771

Bug ID: 96771
   Summary: arm/pr32920-2.c fails since svn r228175 /
f11a7b6d57f6fcba1bf2e5a0403dc49120195320
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: clyon at gcc dot gnu.org
  Target Milestone: ---

The gcc.target/arm/pr43920-2.c testcase fails since svn r228175 / git
f11a7b6d57f6fcba1bf2e5a0403dc49120195320 (r6-3529).

That commit from 2015 says:
revert to assign_parms assignments using default defs

Revert the fragile and complicated changes to assign_parms designed to
enable it to use RTL assignments chosen by cfgexpand, and instead have
cfgexpand use the RTL assignments by assign_parms, keying them off of
the default defs that are now necessarily introduced for each parm and
result.  The possible lack of a default def was already a problem, and
the fallbacks in place were not enough, as shown by PR67312.  We now
have checking asserts in set_rtl that verify that we're assigning to
each var a piece of RTL that matches the expectations set forth by
use_register_for_decl.


Looking at the generated code, before the patch we had (-mcpu=cortex-m3):
getFileStartAndLength:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r3, r4, r5, r6, r7, lr}
        mov     r6, r1
        mov     r5, r2
        movs    r1, #0
        movs    r2, #1
        mov     r7, r0
        bl      lseek
        mov     r4, r0
        movs    r2, #2
        movs    r1, #0
        mov     r0, r7
        bl      lseek
        adds    r2, r4, #1
        beq     .L4
        adds    r3, r0, #1
        beq     .L2
        subs    r0, r0, r4
        beq     .L4
        str     r4, [r6]
        str     r0, [r5]
        movs    r0, #0
        pop     {r3, r4, r5, r6, r7, pc}
.L4:
        mov     r0, #-1
.L2:
        pop     {r3, r4, r5, r6, r7, pc}

and now we have:
getFileStartAndLength:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        push    {r3, r4, r5, r6, r7, lr}
        mov     r6, r1
        mov     r5, r2
        movs    r1, #0
        movs    r2, #1
        mov     r7, r0
        bl      lseek
        mov     r4, r0
        movs    r2, #2
        movs    r1, #0
        mov     r0, r7
        bl      lseek
        adds    r2, r4, #1
        beq     .L1
        adds    r3, r0, #1
        beq     .L4
        subs    r0, r0, r4
        beq     .L4
        str     r4, [r6]
        str     r0, [r5]
        movs    r4, #0
        b       .L1
.L4:
        mov     r4, #-1
.L1:
        mov     r0, r4
        pop     {r3, r4, r5, r6, r7, pc}


The testcase fails because we now generate only one 'pop' instruction while we
expect two.

But the exit code sequence is actually longer now.
Before that patch we had either:
        pop     {r3, r4, r5, r6, r7, pc}
or
        mov     r0, #-1
        pop     {r3, r4, r5, r6, r7, pc}

We now have either:
        mov     r0, r4
        pop     {r3, r4, r5, r6, r7, pc}
or
        mov     r4, #-1
        mov     r0, r4
        pop     {r3, r4, r5, r6, r7, pc}
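
For readers who prefer C to assembly, the function below is one reading of
what both listings compute. It is reconstructed from the assembly above rather
than copied from the testcase, so the parameter names, the local variable
names and the use of int are assumptions; the leading assert is a sketch-only
precondition that has no counterpart in the assembly.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

int getFileStartAndLength(int fd, int *start_, int *length_)
{
    assert(start_ && length_);              /* sketch-only precondition */

    /* First lseek call (r2 = 1, i.e. SEEK_CUR): current offset, kept in r4. */
    int start = (int) lseek(fd, 0L, SEEK_CUR);
    /* Second lseek call (r2 = 2, i.e. SEEK_END): end offset, returned in r0. */
    int end = (int) lseek(fd, 0L, SEEK_END);

    /* adds rX, rY, #1 followed by beq is a comparison against -1. */
    if (start == -1 || end == -1)
        return -1;

    /* subs r0, r0, r4 followed by beq: an empty range is also an error. */
    int length = end - start;
    if (length == 0)
        return -1;

    *start_ = start;                        /* str r4, [r6] */
    *length_ = length;                      /* str r0, [r5] */
    return 0;
}
```

The function has three error exits and one success exit. The older code
shares the error exits through one 'pop' and gives the success path its own,
while the newer code routes every path through the single '.L1' epilogue.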
