[Bug target/69532] FAIL: gcc.target/arm/{vect-,}fmaxmin.c execution test on armv7

2016-02-02 Thread david.sherwood at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69532

--- Comment #4 from david.sherwood at arm dot com ---
(In reply to vries from comment #3)
> Also for the non-vect version:
> ...
> FAIL: gcc.target/arm/fmaxmin.c execution test
> ...

Hi, if you are not already fixing this, I can take a look if you want?

[Bug c++/63424] New: Octave -O3 build: internal compiler error: in prepare_cmp_insn, at optabs.c:4237

2014-10-01 Thread david.sherwood at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63424

    Bug ID: 63424
   Summary: Octave -O3 build: internal compiler error: in prepare_cmp_insn, at optabs.c:4237
   Product: gcc
   Version: 5.0
    Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: david.sherwood at arm dot com

Created attachment 33634
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33634&action=edit
A reduced test case from the Octave build failure

Whilst building Octave with CFLAGS="-O3 -pipe" on target aarch64-linux-gnu I
got the following compiler error:

internal compiler error: in prepare_cmp_insn, at optabs.c:4237
 ival = truncate_int (static_cast (ival)   
 ^
0xb2b205 prepare_cmp_insn
        /work/davshe01/oban-work/src/gcc/gcc/optabs.c:4237
0xb2b288 emit_cmp_and_jump_insns(rtx_def*, rtx_def*, rtx_code, rtx_def*, machine_mode, int, rtx_def*, int)
        /work/davshe01/oban-work/src/gcc/gcc/optabs.c:4381
0x8d3019 do_compare_rtx_and_jump(rtx_def*, rtx_def*, rtx_code, int, machine_mode, rtx_def*, rtx_def*, rtx_def*, int)
        /work/davshe01/oban-work/src/gcc/gcc/dojump.c:1135
0x95c9bf expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier)
        /work/davshe01/oban-work/src/gcc/gcc/expr.c:8855
0x94ff1f expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        /work/davshe01/oban-work/src/gcc/gcc/expr.c:9428
0x958a00 expand_expr
        /work/davshe01/oban-work/src/gcc/gcc/expr.h:451
0x958a00 expand_operands
        /work/davshe01/oban-work/src/gcc/gcc/expr.c:7541
0x95bc92 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier)
        /work/davshe01/oban-work/src/gcc/gcc/expr.c:9233
0x94ff1f expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        /work/davshe01/oban-work/src/gcc/gcc/expr.c:9428
0x959945 store_expr(tree_node*, rtx_def*, int, bool)
        /work/davshe01/oban-work/src/gcc/gcc/expr.c:5337
0x960cbf expand_assignment(tree_node*, tree_node*, bool)
        /work/davshe01/oban-work/src/gcc/gcc/expr.c:5123
0x876a0b expand_gimple_stmt_1
        /work/davshe01/oban-work/src/gcc/gcc/cfgexpand.c:3274
0x876a0b expand_gimple_stmt
        /work/davshe01/oban-work/src/gcc/gcc/cfgexpand.c:3370
0x87ca33 expand_gimple_basic_block
        /work/davshe01/oban-work/src/gcc/gcc/cfgexpand.c:5209
0x87e5d6 execute
        /work/davshe01/oban-work/src/gcc/gcc/cfgexpand.c:5815
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

I have attached a reduced test case.

Thanks,
David Sherwood.


[Bug tree-optimization/66623] New: Unsafe FP math reduction used in strict math mode

2015-06-22 Thread david.sherwood at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66623

          Bug ID: 66623
         Summary: Unsafe FP math reduction used in strict math mode
         Product: gcc
         Version: 6.0
          Status: UNCONFIRMED
        Severity: normal
        Priority: P3
       Component: tree-optimization
        Assignee: unassigned at gcc dot gnu.org
        Reporter: david.sherwood at arm dot com
Target Milestone: ---

Created attachment 35825
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35825&action=edit
Unsafe FP math reduction example

I've found a bug with reductions for Neon whereby we change the ordering
of FP computation in strict math mode. The example looks like this:

float foo (float *__restrict__ i)
{
  float l = 0;

  for (int a = 0; a < 4; a++)
    for (int b = 0; b < 4; b++)
      l += i[b];
  return l;
}

when compiled with the flags

-O2 -ftree-vectorize -fno-inline -march=armv8-a

we generate the asm:

        movi    v0.4s, 0
        mov     x1, x0
        mov     w0, 0
.L2:
        ldr     s1, [x1, w0, sxtw 2]
        add     w0, w0, 1
        cmp     w0, 4
        dup     v1.4s, v1.s[0]
        fadd    v0.4s, v0.4s, v1.4s
        bne     .L2
        faddp   v0.4s, v0.4s, v0.4s
        faddp   v0.4s, v0.4s, v0.4s

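which is (i[0] + i[1] + ...) + (i[0] + i[1] + ...) + ...: each vector lane
accumulates its own copy of the inner sum, and the lanes are only added
together at the end. We know that in general "(a + b) + (a + b)" is not
guaranteed to be the same as "((a + b) + a) + b", so this reordering is
unsafe in strict math mode.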
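
To make the difference concrete, here is a minimal sketch in C that models the two evaluation orders by hand. It is not the attached test case: the function names sum_in_order and sum_by_lanes are hypothetical. The accumulators are declared volatile so that every partial sum is rounded to float and the compiler cannot reassociate the additions itself. It assumes IEEE 754 single precision and no -ffast-math.

```c
/* Source order: a single accumulator, 16 additions in sequence.
   This is what strict FP semantics require for foo() above. */
float sum_in_order(const float *i)
{
    volatile float l = 0.0f;   /* volatile: round to float after every add */

    for (int a = 0; a < 4; a++)
        for (int b = 0; b < 4; b++)
            l += i[b];
    return l;
}

/* Lane order, modelled on the generated asm: four independent
   accumulators each sum i[0..3] once (dup + fadd), then the lanes
   are combined pairwise (the two faddp instructions). */
float sum_by_lanes(const float *i)
{
    volatile float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f };

    for (int b = 0; b < 4; b++)
        for (int k = 0; k < 4; k++)
            lane[k] += i[b];
    return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

For the illustrative input { 1e8f, 1.0f, -1e8f, 1.0f }, sum_in_order returns 1 and sum_by_lanes returns 4 (the exact real-number sum is 8), because 1e8f + 1.0f rounds back to 1e8f in single precision and the two orderings lose different terms.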