[Bug tree-optimization/69336] Constant value not detected

2016-01-26 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69336

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #10 from Dominik Vogt  ---
The new test fails on s390x; what should I do about it?
(see https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01359.html )

[Bug go/69511] New: G.gcstack_size uses uintptr instead of size_t

2016-01-27 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69511

Bug ID: 69511
   Summary: G.gcstack_size uses uintptr instead of size_t
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: vogt at linux dot vnet.ibm.com
CC: cmang at google dot com, krebbel at gcc dot gnu.org
  Target Milestone: ---
Target: s390 s390x

Created attachment 37488
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37488&action=edit
Proposed fix.

The field gcstack_size in the G structure in libgo/runtime/runtime.h has
"uintptr" as its type, but &G.gcstack_size is passed to a function expecting
"size_t *".  On S/390 this results in a warning and hence a bootstrap failure
with the split stack patches we're working on:

  error: passing argument 3 of
  ‘__splitstack_find’ from incompatible pointer type
[-Werror=incompatible-pointer-types]
  g->gcstack = __splitstack_find(nil, nil, &g->gcstack_size,

I believe it's safe to change the type to size_t, which it should have been in
the first place.  But theoretically it's possible that size_t and uintptr are
of different bit size.  What do you think about the attached patch?
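
The mismatch can be shown in isolation.  What follows is only a minimal
sketch, not libgo code: find_segment() is a hypothetical stand-in for
__splitstack_find(), and struct g_fixed is made up for illustration.  It
declares the field as size_t so that its address has exactly the type the
callee expects, and it offers a run-time check of the "same bit size"
assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a function that reports a length through
   a "size_t *" out-parameter, as __splitstack_find does.  */
void *find_segment(size_t *len)
{
  static char segment[128];
  *len = sizeof segment;
  return segment;
}

/* Illustrative struct with the field declared as size_t, so that
   &g->gcstack_size has exactly the type the callee expects.  */
struct g_fixed
{
  void *gcstack;
  size_t gcstack_size;
};

/* Returns 1 if size_t and uintptr_t have the same width on this target.  */
int same_width(void)
{
  return sizeof(size_t) == sizeof(uintptr_t);
}

void record_stack(struct g_fixed *g)
{
  /* With a uintptr_t field this call would need a cast or a temporary
     on any target where uintptr_t and size_t are distinct types.  */
  g->gcstack = find_segment(&g->gcstack_size);
}
```

If same_width() returned 0 on some target, a plain cast of the pointer would
not be safe there, and a temporary size_t variable would be the more careful
choice.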

[Bug tree-optimization/69336] Constant value not detected

2016-01-27 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69336

--- Comment #12 from Dominik Vogt  ---
The test works now on s390x.  Thanks.

[Bug c++/69462] FLT_EVAL_METHOD and DECIMAL_DIG missing in float.h

2016-01-27 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69462

--- Comment #3 from Dominik Vogt  ---
Is this change fit to be posted on gcc-patches?  (I have a patch for that
anyway and can post it for you if you like.)

[Bug c++/69462] FLT_EVAL_METHOD and DECIMAL_DIG missing in float.h

2016-01-27 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69462

Dominik Vogt  changed:

   What|Removed |Added

Summary|stack overflow detected |FLT_EVAL_METHOD and
   ||DECIMAL_DIG missing in
   ||float.h

--- Comment #5 from Dominik Vogt  ---
(Sorry, accidentally typed a search string into the wrong field.)

[Bug c++/69528] New: s/s390: ext/special_functions/hyperg lots of failures

2016-01-28 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69528

Bug ID: 69528
   Summary: s/s390: ext/special_functions/hyperg lots of failures
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: krebbel at gcc dot gnu.org
  Target Milestone: ---
Target: s390x

The hyperg function fails to stay inside the tolerance allowed by the new
test.  Either the function's precision is not good enough on s390x, or the
allowed tolerance is too small (or both).

(:  )

test(data167, toler167)
2: 4.82864e-13 2.5e-13
3: 2.72942e-13 2.5e-13
test(data171, toler171)
1: 5.15741e-12 2.5e-13
2: 5.87911e-13 2.5e-13
3: 2.05545e-12 2.5e-13
4: 2.78641e-13 2.5e-13
5: 2.78806e-13 2.5e-13
test(data172, toler172)
0: 3.10473e-11 2.5e-13
1: 1.28729e-11 2.5e-13
2: 5.93412e-12 2.5e-13
3: 1.25024e-12 2.5e-13
test(data173, toler173)
0: 1.09304e-12 2.5e-13
1: 8.62418e-13 2.5e-13
test(data197, toler197)
2: 4.82864e-13 2.5e-13
3: 2.72942e-13 2.5e-13
test(data201, toler201)
1: 1.86001e-12 2.5e-13
5: 1.79261e-12 2.5e-13
test(data202, toler202)
0: 2.33009e-12 2.5e-13
1: 3.21576e-12 2.5e-13
3: 5.41507e-13 2.5e-13
4: 4.36366e-13 2.5e-13
5: 4.40273e-13 2.5e-13
test(data203, toler203)
0: 2.15453e-12 2.5e-13
1: 1.90262e-12 2.5e-13
2: 7.14356e-13 2.5e-13
3: 2.58658e-12 2.5e-13
test(data204, toler204)
0: 6.15743e-13 2.5e-13
test(data206, toler206)
0: 1.87073e-10 2.5e-13
1: 6.94984e-12 2.5e-13
2: 9.47298e-12 2.5e-13
3: 3.09248e-12 2.5e-13
4: 5.35958e-13 2.5e-13
6: 5.9891e-13 2.5e-13
test(data207, toler207)
0: 4.38856e-10 2.5e-13
1: 7.63877e-11 2.5e-13
2: 7.72796e-10 2.5e-13
3: 1.09366e-12 2.5e-13
4: 6.68933e-13 2.5e-13
5: 3.71824e-12 2.5e-13
6: 9.15105e-13 2.5e-13
test(data208, toler208)
0: 5.19491e-09 2.5e-13
1: 2.6238e-09 2.5e-13
2: 6.29129e-10 2.5e-13
3: 7.664e-11 2.5e-13
4: 2.08562e-12 2.5e-13
5: 1.79497e-11 2.5e-13
6: 9.40163e-13 2.5e-13
7: 7.14083e-13 2.5e-13
test(data209, toler209)
0: 2.15517e-10 2.5e-13
1: 1.60923e-10 2.5e-13
2: 2.69645e-12 2.5e-13
3: 2.35945e-11 2.5e-13
4: 1.00825e-12 2.5e-13
5: 1.54649e-12 2.5e-13
test(data231, toler231)
1: 5.15741e-12 2.5e-13
2: 5.87911e-13 2.5e-13
3: 2.05545e-12 2.5e-13
4: 2.78641e-13 2.5e-13
5: 2.78806e-13 2.5e-13
test(data232, toler232)
0: 3.10473e-11 2.5e-13
1: 1.28729e-11 2.5e-13
2: 5.93412e-12 2.5e-13

[Bug c++/69528] s/s390: ext/special_functions/hyperg lots of failures

2016-01-28 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69528

--- Comment #1 from Dominik Vogt  ---
3: 1.25024e-12 2.5e-13
test(data233, toler233)
0: 1.09304e-12 2.5e-13
1: 8.62418e-13 2.5e-13
test(data236, toler236)
0: 1.87073e-10 2.5e-13
1: 6.94984e-12 2.5e-13
2: 9.47298e-12 2.5e-13
3: 3.09248e-12 2.5e-13
4: 5.35958e-13 2.5e-13
6: 5.9891e-13 2.5e-13
test(data237, toler237)
0: 4.38856e-10 2.5e-13
1: 7.63877e-11 2.5e-13
2: 7.72796e-10 2.5e-13
3: 1.09366e-12 2.5e-13
4: 6.68933e-13 2.5e-13
5: 3.71824e-12 2.5e-13
6: 9.15105e-13 2.5e-13
test(data238, toler238)
0: 5.19491e-09 2.5e-13
1: 2.6238e-09 2.5e-13
2: 6.29129e-10 2.5e-13
3: 7.664e-11 2.5e-13
4: 2.08562e-12 2.5e-13
5: 1.79497e-11 2.5e-13
6: 9.40163e-13 2.5e-13
7: 7.14083e-13 2.5e-13
test(data239, toler239)
0: 2.15517e-10 2.5e-13
1: 1.60923e-10 2.5e-13
2: 2.69645e-12 2.5e-13
3: 2.35945e-11 2.5e-13
4: 1.00825e-12 2.5e-13
5: 1.54649e-12 2.5e-13
test(data241, toler241)
0: 1.68813e-09 2.5e-13
1: 4.9753e-10 2.5e-13
2: 5.28903e-10 2.5e-13
3: 2.29304e-11 2.5e-13
4: 1.49182e-11 2.5e-13
5: 9.41266e-12 2.5e-13
6: 1.00424e-12 2.5e-13
7: 2.98427e-13 2.5e-13
test(data242, toler242)
0: 2.04779e-08 2.5e-13
1: 2.64594e-08 2.5e-13
2: 5.22149e-10 2.5e-13
3: 1.3217e-09 2.5e-13
4: 1.35025e-10 2.5e-13
5: 4.39245e-11 2.5e-13
6: 7.65459e-12 2.5e-13
7: 3.73768e-13 2.5e-13
test(data243, toler243)
0: 3.02697e-07 2.5e-13
1: 2.69153e-07 2.5e-13
2: 3.29237e-08 2.5e-13
3: 7.43965e-09 2.5e-13
4: 1.01678e-10 2.5e-13
5: 2.14138e-09 2.5e-13
6: 2.7467e-11 2.5e-13
7: 5.62027e-12 2.5e-13
test(data244, toler244)
0: 4.34529e-07 2.5e-13
1: 3.35718e-07 2.5e-13
2: 5.23978e-08 2.5e-13
3: 1.60894e-08 2.5e-13
4: 4.29353e-12 2.5e-13
5: 6.54579e-12 2.5e-13
6: 6.73518e-11 2.5e-13
7: 2.6793e-12 2.5e-13
test(data245, toler245)
0: 1.6676e-07 2.5e-13
1: 2.02176e-08 2.5e-13
2: 2.36511e-07 2.5e-13
3: 1.73515e-08 2.5e-13
4: 6.49563e-10 2.5e-13
5: 6.27143e-11 2.5e-13
6: 7.79656e-12 2.5e-13
7: 7.01158e-13 2.5e-13

[Bug c++/69529] New: s/390: special_functions/02_assoc_legendre failure

2016-01-28 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69529

Bug ID: 69529
   Summary: s/390: special_functions/02_assoc_legendre failure
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: krebbel at gcc dot gnu.org
  Target Milestone: ---
Target: s390x

The assoc_legendre function exceeds the allowed tolerance on s390x for
data033[19]:

  { 2.5643395957697341e+17, 100, 10, 
  0.89991 },

The actual deviation is 2.75283e-13 while the allowed tolerance is 2.5e-13. 
What should I do with this?
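
For reference, the kind of check that trips here can be sketched as follows.
This is an assumption about the shape of the test, not the testsuite's actual
code: the report does not say how the deviation is computed, so the sketch
assumes a relative deviation compared with a strict "less than".  All function
names are made up.

```c
/* Absolute value without <math.h>, to keep the sketch dependency-free.  */
double abs_double(double x)
{
  return x < 0.0 ? -x : x;
}

/* Relative deviation of a computed value from its reference value
   (assumed formula; the reference value must be non-zero).  */
double relative_deviation(double computed, double expected)
{
  return abs_double(computed - expected) / abs_double(expected);
}

/* Returns 1 if the deviation is within the tolerance, 0 otherwise.  */
int within_tolerance(double deviation, double tolerance)
{
  return deviation < tolerance;
}
```

With the numbers from this report, within_tolerance(2.75283e-13, 2.5e-13)
yields 0, i.e. the data point is rejected although it misses the limit by
only about ten percent.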

[Bug libstdc++/69528] s/390: ext/special_functions/hyperg lots of failures

2016-01-28 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69528

Dominik Vogt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Dominik Vogt  ---
That was r232867.  With r232917 the test failures are gone.  Thanks.

[Bug c++/69529] s/390: special_functions/02_assoc_legendre failure

2016-01-28 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69529

Dominik Vogt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Dominik Vogt  ---
With r232917 the test failure is gone.  Thanks.

[Bug libgomp/69555] New: libgomp.c++/target-6.C fails because of undefined behaviour

2016-01-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555

Bug ID: 69555
   Summary: libgomp.c++/target-6.C fails because of undefined
behaviour
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: jakub at gcc dot gnu.org, krebbel at gcc dot gnu.org
  Target Milestone: ---
Target: s390x

The test case libgomp.c++/target-6.C fails on s390x, and I think that's because
it uses a reference type variable in a "private" construct:

-- snip --
...
  int a[y - 2], b[y - 2]; 
  int (&c)[y - 2] = a, (&d)[y - 2] = b;
  ^^^
  ...
  #pragma omp target private (x, u, s, c, i) firstprivate (y, v, t, d) map(from:err)
  ^^^
  { 
...
for (i = 0; i < y - 2; i++) 
  c[i] = d[i];
...
  }
  ...
-- snip --

Depending on optimisations and the rest of the code, this leads to either
incorrect values in the array "a" or accessing a pointer to random memory.

As far as I understand it, the "OpenMP Application Program Interface, Version
4.0 - July 2013" explicitly forbids this on page 161:

  • A variable that appears in a private clause must not have an incomplete
    type or a reference type.

[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour

2016-01-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555

--- Comment #2 from Dominik Vogt  ---
Does it work on other platforms?

[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour

2016-01-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555

--- Comment #4 from Dominik Vogt  ---
Sure.  Can I provide any debug information or another kind of help?

[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour

2016-01-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555

--- Comment #5 from Dominik Vogt  ---
Hm, actually the chapter about "private" says nothing about how to actually
*handle* a reference type whereas it states that for "firstprivate" and
"lastprivate" the reference must bind to the same object for all threads.  To
me it still looks as if using references in "private" is undefined.

[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour

2016-01-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555

--- Comment #6 from Dominik Vogt  ---
Example:

-- snip --
#include <stdio.h>
int main () 
{ 
  int a; 
  int &c = a; 
  printf("a %p\n", &a); 
  printf("g %p\n", &c); 
  #pragma omp target private (c) 
  { 
printf("t %p\n", &c); 
  } 
  return 0; 
} 
-- snip --

prints

  a 0x3a0edb4
  g 0x3a0edb4
  t 0x3a0ea24  <--- c in the loop points to different memory

[Bug c++/69089] C++11: alignas(0) causes an error

2016-02-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69089

--- Comment #5 from Dominik Vogt  ---
No, up to now you're the only one who commented on it.  I keep pinging it once
in a while.


[Bug libgomp/69625] New: deadlock in libgomp.c/doacross-1.c test

2016-02-02 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69625

Bug ID: 69625
   Summary: deadlock in libgomp.c/doacross-1.c test
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: jakub at gcc dot gnu.org
  Target Milestone: ---
Target: s390x

Created attachment 37554
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37554&action=edit
.s file of test program

On s390x with -march=z196 -O2/-O3 the test hangs with a deadlock (and also
doacross-[2.3].c and doacross-1.C, but I haven't looked at them yet).  I've
stripped down the test to this:

-- snip --
#include <stdio.h>
#define N 64
int b[N / 16][8][4];

int
main ()
{
  int i, j, k, l;
  (void)l;
  #pragma omp parallel
  {
printf("+++\n");
#pragma omp for schedule(static, 0) ordered (3) nowait
for (i = 2; i < N / 16 - 1; i++)
  for (j = 0; j < 8; j += 2)
for (k = 1; k <= 3; k++)
  {
#pragma omp atomic write
b[i][j][k] = 11;
#pragma omp ordered depend(sink: i, j - 2, k - 1) \
depend(sink: i - 2, j - 2, k + 1)
#pragma omp ordered depend(sink: i - 3, j + 2, k - 2)
if (j >= 2 && k > 1)
  {
#pragma omp atomic read
l = b[i][j - 2][k - 1];
  }
#pragma omp atomic write
b[i][j][k] = 22;
if (i >= 4 && j >= 2 && k < 3)
  {
#pragma omp atomic read
l = b[i - 2][j - 2][k + 1];
  }
#pragma omp ordered depend(source)
#pragma omp atomic write
b[i][j][k] = 33;
  }
printf("---\n");
  }
printf("done\n");
  return 0;
}
-- snip --

(See attachment for full .s file.)
(Running on an LPAR with 17 cores inside gdb.)

The function GOMP_parallel starts threads 2 to 17, which enter and leave the
parallel region (they print both "+++" and "---") and then hang in a
team_barrier_wait_final() call in gomp_thread_start.  Only then does thread 1
run the thread function.

  gomp_team_start (fn, data, num_threads, flags, gomp_new_team (num_threads));
  fn (data);

Thread 1 comes across

   0x8b7a <+522>:   brasl   %r14,0x87b0


with %r10 == 2 (which presumably contains k), then continues through 

   0x8cf6 <+902>:   brasl   %r14,0x86f0


and finally comes back to

   0x8b7a <+522>:   brasl   %r14,0x87b0


with %r10 == 3.  In GOMP_doacross_wait() it ends up calling doacross_spin() and
never gets out of that again:

   doacross_spin (array, flattened, cur);

   0x03fff7ef5562 <+282>:   lg      %r1,0(%r5)
   0x03fff7ef5568 <+288>:   clgr    %r1,%r2
   0x03fff7ef556c <+292>:   jle     0x3fff7ef5562

The value of r1 (= *r5 (= *array?)) remains 6 (since there's no other thread
left that could modify it) while the value of r2 is 0xfffb4a1.  To me this
looks as if doacross_spin() compares an integer value with an address or
rubbish.

Any ideas what's going on?
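
The three instructions above amount to "reload the counter and loop while it
is less than or equal to the expected value".  A bounded model of that loop,
written as a sketch and not taken from libgomp (bounded_spin and max_polls are
made-up names), shows why it cannot terminate once no other thread is left to
advance the counter:

```c
/* Bounded model of a spin-wait: keep polling *counter while it is
   <= expected.  Returns 1 if the wait finished, 0 if it gave up after
   max_polls polls (standing in for "hangs forever").  */
int bounded_spin(const volatile unsigned long *counter,
                 unsigned long expected, unsigned long max_polls)
{
  unsigned long polls;

  for (polls = 0; polls < max_polls; polls++)
    if (*counter > expected)
      return 1;
  return 0;
}
```

With the counter stuck at 6 and the expected value 0xfffb4a1 as observed in
the debugger, bounded_spin() returns 0 for any poll limit, whereas a sane
expected value such as 5 lets it return 1 on the first poll.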

[Bug libgomp/69625] deadlock in libgomp.c/doacross-1.c test

2016-02-03 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69625

--- Comment #1 from Dominik Vogt  ---
It's a bug in the S/390 backend that sometimes trashes r6 in vararg functions. 
We're working on a fix.

[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.

2016-02-09 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #8 from Dominik Vogt  ---
gfortran.dg/coarray_allocate_3.f08 crashed with an invalid free() on s390 and
s390x.

(gdb) run
Starting program: .../gcc/build/gcc/testsuite/coarray_allocate_3.exe 

Program received signal SIGSEGV, Segmentation fault.
0x03fff7cb6814 in free () from /lib64/libc.so.6
(gdb) bt
#0  0x03fff7cb6814 in free () from /lib64/libc.so.6
#1  0x8cae in MAIN__ ()
at .../gcc/testsuite/gfortran.dg/coarray_allocate_3.f08:26
#2  main (argc=, argv=)
at .../gcc/testsuite/gfortran.dg/coarray_allocate_3.f08:27
#3  0x03fff7c4e0a2 in __libc_start_main () from /lib64/libc.so.6
#4  0x8866 in _start ()

[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.

2016-02-09 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451

--- Comment #9 from Dominik Vogt  ---
I.e. free(0x1) is called:

Load foobar.1497 to r12

   0x8998 <+40>:    larl    %r12,0x80002408

   (gdb) p /x $r12
   0x80002408

First malloc call, store mem pointer in foobar.1497

   0x89c6 <+86>:brasl   %r14,0x8788 
   0x89cc <+92>:stg %r2,0(%r12)

Second malloc call, store mem pointer in some_local_object.1511

   0x8ae8 <+376>:   brasl   %r14,0x8788 
   0x8aee <+382>:   stgrl   %r2,0x800023d0 

Load address of some_local_object.1511 to r1

   0x8afa <+394>:   larl    %r1,0x800023d0

Write something to r1 + 16, r1 + 32, r1 + 40, r1 + 24

   0x8b00 <+400>:   mvghi   16(%r1),297
   0x8b06 <+406>:   stg %r11,32(%r1)
   0x8b0c <+412>:   stg %r8,40(%r1)
   0x8b12 <+418>:   mvghi   24(%r1),1

This overwrites foobar.1497 with the value 1:

   0x8b18 <+424>:   mvghi   56(%r1),1

   (gdb) p /x $r1 + 56
   0x80002408   <-- address of foobar.1497
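
The arithmetic behind this can be checked with a few lines of C.  This is
only a worked example using the two addresses printed above; the macro and
function names are made up for illustration.

```c
#include <stdint.h>

/* Addresses as shown in the debugger session above.  */
#define LOCAL_OBJECT_ADDR 0x800023d0ULL  /* some_local_object.1511 */
#define FOOBAR_ADDR       0x80002408ULL  /* foobar.1497 */

/* Distance in bytes from some_local_object.1511 to foobar.1497.  */
uint64_t offset_of_foobar(void)
{
  return FOOBAR_ADDR - LOCAL_OBJECT_ADDR;
}

/* Returns 1 if a store at "store_offset" bytes past the start of
   some_local_object.1511 lands on the first byte of foobar.1497.  */
int store_hits_foobar(uint64_t store_offset)
{
  return LOCAL_OBJECT_ADDR + store_offset == FOOBAR_ADDR;
}
```

offset_of_foobar() yields 56, so of the five stores only the one at 56(%r1)
lands on foobar.1497; the stores at offsets 16, 24, 32 and 40 stay below it.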

[Bug libgomp/69625] S/390 deadlock in libgomp.c/doacross-1.c test (vararg function trashes r6)

2016-02-09 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69625

Dominik Vogt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Dominik Vogt  ---
Fixed with above patch.

[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.

2016-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451

--- Comment #12 from Dominik Vogt  ---
The patch works on s390x.

[Bug go/69766] New: go.test/test/env.go fails on biarch

2016-02-11 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69766

Bug ID: 69766
   Summary: go.test/test/env.go fails on biarch
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: vogt at linux dot vnet.ibm.com
CC: cmang at google dot com
  Target Milestone: ---
  Host: s390x

When testing with

  make -k check-go RUNTESTFLAGS="--target_board=unix\{-m31,-m64\}"

the test go.test/test/env.go fails with -m31 because runtime.GOARCH and the
GOARCH environment variable disagree:

  $GOARCH=s390x!= runtime.GOARCH=s390
  ^  
  FAIL: go.test/test/env.go execution,  -O2 -g

The compile command was

  $ .../gcc/build/gcc/testsuite/go/../../gccgo
-B.../gcc/build/gcc/testsuite/go/../../
.../gcc/gcc/testsuite/go.test/test/env.go -fno-diagnostics-show-caret
-fdiagnostics-color=never -I.../gcc/build/s390x-ibm-linux-gnu/32/libgo -w -O2
-g -L.../gcc/build/s390x-ibm-linux-gnu/32/libgo
-L.../gcc/build/s390x-ibm-linux-gnu/32/libgo/.libs -lm -m31 -o
.../gcc/build/gcc/testsuite/go/env.x

Unfortunately, the test does not keep the failing executable, and if I run this
command manually, $GOARCH is not set at all in the resulting executable.

[Bug go/69766] go.test/test/env.go fails on biarch

2016-02-11 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69766

--- Comment #1 from Dominik Vogt  ---
If I understand the GOARCH environment variable right, its value is just the
architecture of the build system.  So, this test is bound to fail for any
multiarch target with the non-standard architecture, and for cross compilation?

[Bug go/69766] go.test/test/env.go fails on biarch

2016-02-11 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69766

--- Comment #2 from Dominik Vogt  ---
Created attachment 37663
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37663&action=edit
Experimental patch

Is the attached patch the right way to deal with this?

[Bug regression/69838] New: [regression] Lra deletes EH_REGION

2016-02-16 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

Bug ID: 69838
   Summary: [regression] Lra deletes EH_REGION
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: regression
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
  Target Milestone: ---
  Host: s390x
Target: s390x

Created attachment 37704
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37704&action=edit
Ira dump (ok)

It looks like LRA does not handle EH_REGION notes correctly.  There is at least
one GNAT testcase where LRA wrongly deletes all exception handling code
(gcc/testsuite/gnat.dg/null_pointer_deref1.adb):

-- snip --
procedure Null_Pointer_Deref1 is 
   type Int_Ptr is access all Integer; 

   function Ident return Int_Ptr is 
   begin 
 return null; 
   end; 

   Data : Int_Ptr := Ident; 
begin 
   Data.all := 1; 
exception 
   when Constraint_Error | Storage_Error => null; 
end; 
-- snip --

The exception handling code vanishes in the reload pass (see attached rtl
dumps).  As a consequence, the exception is not caught by the function and the
program terminates with an error.  With -mno-lra the test case works fine, and
the code in reload1.c seems to have special treatment for EH_REGION notes that
is missing in ira.c.

[Bug regression/69838] [regression] Lra deletes EH_REGION

2016-02-16 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

--- Comment #1 from Dominik Vogt  ---
Created attachment 37705
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37705&action=edit
Reload dump (broken)

[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194

2016-02-17 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #3 from Dominik Vogt  ---
Building the source RPM you sent us (to build the Ada compiler) has the same
problem with profiledbootstrap, building on Fedora 20 (maybe other distros
too).  I'll try to isolate the problem.

[Bug regression/69838] [4.9/5/6 Regression] Lra deletes EH_REGION

2016-02-17 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

Dominik Vogt  changed:

   What|Removed |Added

  Component|rtl-optimization|regression

--- Comment #3 from Dominik Vogt  ---
Bisecting fails to identify the exact commit.  It's broken in this commit, and
the commits before fail to build Ada executables because they don't find the
Ada library.  It probably does not matter; the problem has been present since
at least 21st of February, 2013.

PR bootstrap/56258
* doc/invoke.texi (-fdump-rtl-pro_and_epilogue): Use @item
instead of @itemx.

* gnat-style.texi (@title): Remove @hfill.
* projects.texi: Avoid line wrapping inside of @pxref or
@xref.

* doc/cp-tools.texinfo (Virtual Machine Options): Use just
one @gccoptlist instead of 3 separate ones.


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@196196
138bc75d-0d04-0410-96

[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194

2016-02-17 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709

--- Comment #5 from Dominik Vogt  ---
@Matthias: So far it only happens for me when building a gcc rpm from source on
a (very slow) VM, but not when compiling the same sources.  Is there anything
special about your build machine or environment on it?

[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION

2016-02-19 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

--- Comment #7 from Dominik Vogt  ---
With the patch I get an ICE with -m31:

spawn -ignore SIGHUP .../build/gcc/xgcc -B.../build/gcc/
.../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c -fno-diagnostics-show-caret
-fdiagnostics-color=never -O2 -fgraphite-identity -ffast-math -S -m31 -o
id-pr45230-1.s
.../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c: In function 'main':
/home/vogt/src/git/gcc/gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c:45:1: internal compiler error: Segmentation fault
0x806199b9 crash_signal
../../gcc/toplev.c:335
0x80a55d06 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
../../gcc/graphite-isl-ast-to-gimple.c:1408
0x80a55d51 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a55d51 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a55d51 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a5748b translate_isl_ast_to_gimple::rename_all_uses(tree_node*, basic_block_def*, basic_block_def*)
../../gcc/graphite-isl-ast-to-gimple.c:1569
0x80a57631 translate_isl_ast_to_gimple::get_rename_from_scev(tree_node*, gimple**, loop*, basic_block_def*, basic_block_def*, vec)
../../gcc/graphite-isl-ast-to-gimple.c:1623
0x80a597a1 translate_isl_ast_to_gimple::rename_uses(gimple*, gimple_stmt_iterator*, basic_block_def*, loop*, vec)
../../gcc/graphite-isl-ast-to-gimple.c:1730
0x80a5b06d translate_isl_ast_to_gimple::graphite_copy_stmts_from_block(basic_block_def*, basic_block_def*, vec)
../../gcc/graphite-isl-ast-to-gimple.c:2596
0x80a5b5eb translate_isl_ast_to_gimple::copy_bb_and_scalar_dependences(basic_block_def*, edge_def*, vec)
../../gcc/graphite-isl-ast-to-gimple.c:2809
0x80a5bbf5 translate_isl_ast_to_gimple::translate_isl_ast_node_user(isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:935
0x80a5bf95 translate_isl_ast_to_gimple::translate_isl_ast_for_loop(loop*, isl_ast_node*, edge_def*, tree_node*, tree_node*, tree_node*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:685
0x80a5c217 translate_isl_ast_to_gimple::translate_isl_ast_node_for(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:854
0x80a5beb1 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:1032
0x80a5c359 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a5be69 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:1043
0x80a5c359 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a5be69 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:1043
0x80a5c359 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a5be69 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
../../gcc/graphite-isl-ast-to-gimple.c:1043

[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION

2016-02-19 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

--- Comment #9 from Dominik Vogt  ---
I think I've already tested this commit without the patch and did not get that
ICE, but maybe my memory fails me.  I'm just running the test suite again with
the commit reverted to make sure ...

[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION

2016-02-19 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

--- Comment #11 from Dominik Vogt  ---
If that is unrelated, the patch does not cause any regressions on a biarch
build.  Should I also test it in a 31-bit changeroot?

[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION

2016-02-19 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

--- Comment #12 from Dominik Vogt  ---
(The test just finished; the ICE is present without the patch too.)

[Bug middle-end/69838] [4.9/5 Regression] Lra deletes EH_REGION

2016-02-19 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838

Dominik Vogt  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #15 from Dominik Vogt  ---
Successfully tested and bootstrapped trunk on s390x (biarch).

[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194

2016-02-24 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709

--- Comment #7 from Dominik Vogt  ---
The stage1 compiler does something wrong when compiling gcc/real.c (with
-fprofile-generate).  The function div_significands() (inlined into
do_divide()) returns a wrong result due to bad register usage in this loop:

-- snip --
  do
{
  msb = u.sig[SIGSZ-1] & SIG_MSB;
  lshift_significand_1 (&u, &u);
start:
  if (msb || cmp_significands (&u, b) >= 0)
{
  sub_significands (&u, &u, b, 0);
  set_significand_bit (r, bit);
}
}
  while (--bit >= 0);
-- snip --

At loop entry ("start" label), r1 holds the highest 64 bits of the significand.
The first pass through the loop seems to be correct; sub_significands() and
set_significand_bit() do the correct operations.  After that, r1 is decremented
by one as if it contained the variable "bit".  Later on r1 gets (eventually)
overwritten with zero.  After that, the loop always thinks that the remaining
significand is smaller than b because it's always zero.

In the end, the "result" of the division is one in the highest significand bit
and all other bits zero, eventually causing the observed assertion failure.
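
The loop is a classic shift-and-subtract division.  The same pattern on plain
integers, as a simplified sketch that is not taken from real.c (the function
name and the 32-bit operand width are choices made for this example):

```c
#include <stdint.h>

/* Shift-and-subtract division, one quotient bit per iteration.
   "divisor" must be non-zero.  The partial remainder is kept in a
   64-bit variable so that the shift cannot overflow.  */
uint32_t shift_subtract_divide(uint32_t dividend, uint32_t divisor,
                               uint32_t *remainder)
{
  uint64_t rem = 0;
  uint32_t quot = 0;
  int bit;

  for (bit = 31; bit >= 0; bit--)
    {
      /* Shift the next dividend bit into the partial remainder.  */
      rem = (rem << 1) | ((dividend >> bit) & 1u);
      if (rem >= divisor)
        {
          /* Corresponds to sub_significands + set_significand_bit.  */
          rem -= divisor;
          quot |= (uint32_t) 1 << bit;
        }
    }
  *remainder = (uint32_t) rem;
  return quot;
}
```

If the partial remainder were clobbered with zero after the first quotient
bit had been set, the comparison would never succeed again and exactly one
bit would remain set in the quotient, which matches the symptom described
above.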

With a broken compiler (from stageprofile), the test program for triggering the
ICE is simply

-- snip --
int x = __DBL_MAX__;
-- snip --

All of this only happens in a Fedora 20 chroot for me.  I've tried to add
"-save-temps -dA -dP -fdump-rtl-all" to the OPT_FLAGS in the rpm spec file, but
then the package doesn't build at all.  Any hints on how to get debug
information from the rpmbuild run are welcome.

[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194

2016-02-25 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709

--- Comment #8 from Dominik Vogt  ---
Created attachment 37790
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37790&action=edit
Test case

The option -fpeel-loops triggers the bug.  The attached program has a different
result with -fpeel-loops than without it.

 $ gcc -O2 -march=z10 -fpeel-loops pr69709.c && ./a.out
 1 bits set in result
 $ gcc -O2 -march=z10 pr69709.c && ./a.out
 2 bits set in result

(2 is the correct result).

[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194

2016-02-25 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709

--- Comment #9 from Dominik Vogt  ---
(-fpeel-loops is activated by -fprofile-use, so this is the connection to
profiledbootstrap.)

[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194

2016-02-25 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709

--- Comment #10 from Dominik Vogt  ---
We've located the bug in the s390 backend.  No further help is needed.

[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.

2016-02-26 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451

--- Comment #15 from Dominik Vogt  ---
The problem is gone on today's trunk for s390 and s390x.

[Bug middle-end/69920] [6 Regression] FAIL: g++.dg/torture/pr42704.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)

2016-02-26 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #8 from Dominik Vogt  ---
Also failing on s390 and s390x; the same bug possibly causes several other test
failures:

FAIL: gcc.dg/graphite/id-pr45230-1.c (internal compiler error) 
FAIL: gcc.dg/tree-ssa/pr69666.c (internal compiler error)

Maybe these too:

FAIL: gfortran.dg/reassoc_6.f   -O   scan-tree-dump-not optimized "~"
FAIL: gcc.dg/graphite/scop-sor.c scan-tree-dump-times graphite "number of SCoPs: 1" 1

[Bug middle-end/69920] [6 Regression] FAIL: g++.dg/torture/pr42704.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)

2016-02-26 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920

--- Comment #9 from Dominik Vogt  ---
(Fails only with -m31.)

[Bug middle-end/69920] [6 Regression] FAIL: g++.dg/torture/pr42704.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)

2016-02-26 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920

--- Comment #12 from Dominik Vogt  ---
The ICE in pr42704.C is gone on s390[x] with trunk (but not the other FAILs).  Is
the ICE below related to this bug report or is it something totally different?

.../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c: In function 'main':
.../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c:45:1: internal compiler error: Segmentation fault
0x8061ac19 crash_signal
        ../../gcc/toplev.c:335
0x80a6305e translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1408
0x80a630a9 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a630a9 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a630a9 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a647e3 translate_isl_ast_to_gimple::rename_all_uses(tree_node*, basic_block_def*, basic_block_def*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1569
0x80a64989 translate_isl_ast_to_gimple::get_rename_from_scev(tree_node*, gimple**, loop*, basic_block_def*, basic_block_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:1623
0x80a66af9 translate_isl_ast_to_gimple::rename_uses(gimple*, gimple_stmt_iterator*, basic_block_def*, loop*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:1730
0x80a683c5 translate_isl_ast_to_gimple::graphite_copy_stmts_from_block(basic_block_def*, basic_block_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:2596
0x80a68943 translate_isl_ast_to_gimple::copy_bb_and_scalar_dependences(basic_block_def*, edge_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:2809
0x80a68f4d translate_isl_ast_to_gimple::translate_isl_ast_node_user(isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:935
0x80a692ed translate_isl_ast_to_gimple::translate_isl_ast_for_loop(loop*, isl_ast_node*, edge_def*, tree_node*, tree_node*, tree_node*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:685
0x80a6956f translate_isl_ast_to_gimple::translate_isl_ast_node_for(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:854
0x80a69209 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:1032
0x80a696b1 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a691c1 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >

[Bug middle-end/69983] [6 Regression] FAIL: gcc.dg/graphite/scop-sor.c scan-tree-dump-times graphite "number of SCoPs: 1" 1

2016-02-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69983

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #3 from Dominik Vogt  ---
Also fails on s390x with -m64 and -m31.

[Bug tree-optimization/68659] [6 regression] FAIL: gcc.dg/graphite/id-pr45230-1.c (internal compiler error)

2016-02-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68659

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #18 from Dominik Vogt  ---
I've no opinion on whether the patch is good or not, but it does make the test
failure go away on s390x.

[Bug target/70009] test case libgomp.oacc-c-c++-common/vprop.c fails starting with its introduction in r233607

2016-02-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70009

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #1 from Dominik Vogt  ---
Also fails on s390x with -m64 and -m31.

[Bug tree-optimization/69760] [4.9/5 Regression] Wrong 64-bit memory address caused by an unneeded overflowing 32-bit integer multiplication on x86_64 under -O2 and -O3 code optimization

2016-02-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69760

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #13 from Dominik Vogt  ---
Created attachment 37824
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37824&action=edit
Dump file of reassoc_6.f

This commit introduces a regression on s390x (-m64):

FAIL: gfortran.dg/reassoc_6.f   -O   scan-tree-dump-not optimized "~" 

(Dump file of the test attached.)

[Bug ada/70017] New: Ada: c52103x test failure on s390x

2016-02-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017

Bug ID: 70017
   Summary: Ada: c52103x test failure on s390x
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: ada
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: ebotcazou at gcc dot gnu.org, krebbel at gcc dot gnu.org
  Target Milestone: ---
  Host: s390x
Target: s390x

My knowledge of Ada is practically zero, but I'm debugging a few Ada test
failures on s390x (gcc-4.7 or earlier).

-- snip --
,.,. C52103X ACATS 2.5 16-02-29 16:03:21
 C52103X CHECK THAT IN ARRAY ASSIGNMENTS AND IN SLICE ASSIGNMENTS,
THE LENGTHS MUST MATCH; ALSO CHECK WHETHER
CONSTRAINT_ERROR OR STORAGE_ERROR ARE RAISED FOR LARGE
ARRAYS.
   - C52103X NO CONSTRAINT_ERROR FOR TYPE WITH 'LENGTH = INTEGER'LAST + 3.
raised STORAGE_ERROR : System.Stack_Checking.Operations.Stack_Check: stack overflow detected
FAIL:   c52103x 
-- snip --

This happens here:

-- snip --
   TYPE  TA42  IS  ARRAY( 
INTEGER RANGE IDENT_INT(-2)..IDENT_INT(INTEGER'LAST) 
)  OF BOOLEAN ;
...
OBJ_DCL:   DECLARE   -- THIS BLOCK DECLARES TWO BOOLEAN ARRAYS THAT 
 -- HAVE INTEGER'LAST + 3 COMPONENTS; 
 -- STORAGE_ERROR MAY BE RAISED. 
ARR41  :  TA41 ; 
ARR42  :  TA42 ; 
-- snip --

This is a reduced test (that fails with ulimit -s 131072):
-- snip --
procedure c52103x is
begin
   declare
      type T is array(integer range -2..1000) of boolean;
   begin
      declare
         A : T;
      begin
         null;
      end;
   end;
end;
-- snip --

As far as I understand this code, it assumes that only the first few memory
pages of the array are allocated in the stack initially and the rest is
allocated when actually accessed.  However, on s390x first a snippet of three
pages is allocated and checked, followed immediately by the rest of the array
plus another check that fails because the stack is too small for that:

-- snip --
_ada_c52103x:
        stmg    %r11,%r15,88(%r15)
        larl    %r13,.L4
        aghi    %r15,-168
        lgr     %r11,%r15
        lgr     %r1,%r15
        aghi    %r1,-12280              # <-- first three pages
        lgr     %r2,%r1
        brasl   %r14,_gnat_stack_check  # <-- OK
        lgr     %r1,%r15
        lgr     %r12,%r1
        lg      %r1,.L5-.L4(%r13)
        agr     %r1,%r15                # <-- rest of array
        lgr     %r2,%r1
        brasl   %r14,_gnat_stack_check  # <-- FAIL
        ...

        .section        .rodata
        .align  8
.L4:
.L6:
        .quad   -1008
.L5:
        .quad   -10012288
-- snip --

(Stack on s390x grows down.)

I've no idea whether this is the intended behaviour (i.e. the test case has a
bug) or not, and if not whether I should look for the bug in the s390x backend
or somewhere else.

[Bug ada/70017] c52103x and c52104x test failure on s390x

2016-02-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017

Dominik Vogt  changed:

   What|Removed |Added

Summary|Ada: c52103x test failure   |c52103x and c52104x test
   |on s390x|failure on s390x

--- Comment #1 from Dominik Vogt  ---
c52104x has similar code and fails too.

[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025

--- Comment #2 from Dominik Vogt  ---
This is related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #35 from Dominik Vogt  ---
Looks like the extra condition in that patch is still not good enough:

--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -945,6 +945,12 @@ match_reload (signed char out, signed char *ins, enum
reg_c
= (ins[1] < 0 && REG_P (in_rtx)
   && (int) REGNO (in_rtx) < lra_new_regno_start
   && find_regno_note (curr_insn, REG_DEAD, REGNO (in_rtx))
+  /* We can not use the same value if the pseudo is mentioned
+ in the output, e.g. as an address part in memory,
+ becuase output reload will actually extend the pseudo
+ liveness.  We don't care about eliminable hard regs here
+ as we are interesting only in pseudos.  */
+  && (out < 0 || regno_use_in (REGNO (in_rtx), out_rtx) == NULL_RTX)
   ? lra_create_new_reg (inmode, in_rtx, goal_class, "")
   : lra_create_new_reg_with_unique_value (outmode, out_rtx,
   goal_class, ""));

[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578

--- Comment #36 from Dominik Vogt  ---
(Sorry, comment 35 belongs to the follow-up report
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025 )

[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025

--- Comment #3 from Dominik Vogt  ---
Looks like the extra condition in that patch is still not good enough:

--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -945,6 +945,12 @@ match_reload (signed char out, signed char *ins, enum
reg_c
= (ins[1] < 0 && REG_P (in_rtx)
   && (int) REGNO (in_rtx) < lra_new_regno_start
   && find_regno_note (curr_insn, REG_DEAD, REGNO (in_rtx))
+  /* We can not use the same value if the pseudo is mentioned
+ in the output, e.g. as an address part in memory,
+ becuase output reload will actually extend the pseudo
+ liveness.  We don't care about eliminable hard regs here
+ as we are interesting only in pseudos.  */
+  && (out < 0 || regno_use_in (REGNO (in_rtx), out_rtx) == NULL_RTX)
   ? lra_create_new_reg (inmode, in_rtx, goal_class, "")
   : lra_create_new_reg_with_unique_value (outmode, out_rtx,
   goal_class, ""));

[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025

--- Comment #5 from Dominik Vogt  ---
Yup.

  debug_rtx(out_rtx) = (mem/f:DI (plus:DI (reg/v/f:DI 164 [orig:129 p ] [129])
  (const_int 16 [0x10])) [4 p_8(D)->d3+0 S8 A64])

  debug_rtx(in_rtx) = (reg/v/f:DI 151 [orig:129 p ] [129])

Because in_rtx doesn't appear in out_rtx the condition "regno_use_in (REGNO
(in_rtx), out_rtx) == 0" misses its mark.
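
To make the miss concrete, here is a toy C model.  It is only a sketch and does
not use any GCC internals: struct toy_reg, struct toy_rtx and both test
functions are made-up names, and treating "same orig: number" as "same value"
is an assumption for illustration, not a claim about what the right fix is.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a pseudo register: its own number plus the number of
   the original pseudo it was created from ("orig:129" in the dumps).  */
struct toy_reg
{
  int regno;
  int orig_regno;
};

/* Toy stand-in for an rtx: only the registers mentioned in it.  */
struct toy_rtx
{
  const struct toy_reg *regs;
  size_t nregs;
};

/* Number-based test, in the spirit of regno_use_in () != NULL_RTX.  */
static int
mentions_regno (const struct toy_rtx *x, int regno)
{
  assert (x != NULL);
  for (size_t i = 0; i < x->nregs; i++)
    if (x->regs[i].regno == regno)
      return 1;
  return 0;
}

/* Hypothetical origin-based test: does X mention any register that was
   created from the same original pseudo as R?  */
static int
mentions_same_origin (const struct toy_rtx *x, const struct toy_reg *r)
{
  assert (x != NULL && r != NULL);
  for (size_t i = 0; i < x->nregs; i++)
    if (x->regs[i].orig_regno == r->orig_regno)
      return 1;
  return 0;
}

/* The situation from the dumps: out_rtx mentions reg 164 and in_rtx is
   reg 151, both with orig:129.  Returns the number-based verdict.  */
static int
number_test_on_dump (void)
{
  static const struct toy_reg out_regs[] = { { 164, 129 } };
  const struct toy_rtx out_rtx = { out_regs, 1 };
  return mentions_regno (&out_rtx, 151);
}

/* Same situation, origin-based verdict.  */
static int
origin_test_on_dump (void)
{
  static const struct toy_reg out_regs[] = { { 164, 129 } };
  const struct toy_rtx out_rtx = { out_regs, 1 };
  const struct toy_reg in_reg = { 151, 129 };
  return mentions_same_origin (&out_rtx, &in_reg);
}
```

In this model number_test_on_dump () reports no overlap because 151 != 164,
while origin_test_on_dump () does report one, which is the gap described above.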

[Bug ada/70017] c52103x and c52104x test failure on s390x

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017

--- Comment #3 from Dominik Vogt  ---
It looks like no more than activating Stack_Check_Probes is required.  Thanks!

[Bug ada/70017] c52103x and c52104x test failure on s390x

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017

--- Comment #5 from Dominik Vogt  ---
We have zero test failures with the patched code.  Is that good enough or
should I still take a closer look?

[Bug ada/70017] c52103x and c52104x test failure on s390x

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017

--- Comment #6 from Dominik Vogt  ---
S390 does have stack checking support, so the question is really just whether
Ada has extra requirements.

[Bug ada/70017] c52103x and c52104x test failure on s390x

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017

--- Comment #7 from Dominik Vogt  ---
Sorry, comment 6 is wrong, I was thinking about stack *guard* support.

[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025

--- Comment #6 from Dominik Vogt  ---
Shouldn't this rather check whether the *value* of the register in in_rtx
appears in out_rtx?

[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025

--- Comment #10 from Dominik Vogt  ---
Successfully bootstrapped and regression tested on s390x (-m31 and -m64).

[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #15 from Dominik Vogt  ---
The new test fails on s390x:

.../build/gcc/xgcc -B.../build/gcc/
.../gcc/testsuite/gcc.dg/tree-ssa/pr69196-1.c -fno-diagnostics-show-caret
-fdiagnostics-color=never -O2 -fdump-tree-vrp1-details -S -m31 -o pr69196-1.s
PASS: gcc.dg/tree-ssa/pr69196-1.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/pr69196-1.c scan-tree-dump vrp1 "FSM did not thread
around loop and would copy too many statements"

(same with -m64 instead of -m31).

[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196

--- Comment #16 from Dominik Vogt  ---
(In the ChangeLog entry, the "-1" is missing from the name of the new
testfile.)

[Bug middle-end/69983] [6 Regression] FAIL: gcc.dg/graphite/scop-sor.c scan-tree-dump-times graphite "number of SCoPs: 1" 1

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69983

--- Comment #8 from Dominik Vogt  ---
Successfully bootstrapped and regression tested on s390x (biarch).

[Bug tree-optimization/69760] [4.9/5 Regression] Wrong 64-bit memory address caused by an unneeded overflowing 32-bit integer multiplication on x86_64 under -O2 and -O3 code optimization

2016-03-01 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69760

--- Comment #14 from Dominik Vogt  ---
The regression is fixed with the latest patch for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69983

[Bug middle-end/69987] [6 Regression] internal compiler error: in verify_loop_structure, at cfgloop.c:1639

2016-03-02 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69987

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #5 from Dominik Vogt  ---
The new test fails on s390x with -m31 (but works with -m64).  (I have not
tried it, but I assume it also fails on s390.)

-- snip --
FAIL: gfortran.dg/pr69987.f90   -O  (test for excess errors)
Excess errors:
f951: Warning: -fprefetch-loop-arrays not supported for this target (try -march
switches)
-- snip --

[Bug ada/70017] c52103x and c52104x test failure on s390x

2016-03-02 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017

Dominik Vogt  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Dominik Vogt  ---
Fixed.

[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2

2016-03-02 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196

--- Comment #18 from Dominik Vogt  ---
Which dumps do you need?

[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour

2016-03-03 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555

--- Comment #11 from Dominik Vogt  ---
Successfully bootstrapped and regression tested on s390x biarch.
Thanks.

[Bug tree-optimization/68659] [6 regression] FAIL: gcc.dg/graphite/id-pr45230-1.c (internal compiler error)

2016-03-03 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68659

--- Comment #22 from Dominik Vogt  ---
Successfully bootstrapped and regression tested on s390x biarch.
Thanks.

[Bug middle-end/69987] [6 Regression] internal compiler error: in verify_loop_structure, at cfgloop.c:1639

2016-03-03 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69987

--- Comment #7 from Dominik Vogt  ---
Fixed on s390x.
Thanks.

[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2

2016-03-03 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196

--- Comment #20 from Dominik Vogt  ---
Created attachment 37860
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37860&action=edit
vrp1 dump for s390x (-m64)

vrp1 dump for s390x attached (-m64, give me a shout if you need the -m31 dump).

[Bug other/70078] New: gccint: define_split "not" allowed to create pseudos

2016-03-04 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70078

Bug ID: 70078
   Summary: gccint: define_split "not" allowed to create pseudos
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
  Target Milestone: ---

The section "Defining How to Split Instructions" in the gccint manual claims

  The preparation-statements are similar to those statements that are
  specified for define_expand.
  ...
  Unlike those in define_expand, however, these statements must not
  generate any new pseudo-registers.  Once reload has completed, they
  also must not allocate any space in the stack frame.

Splitters seem to be allowed to generate new pseudos under certain
circumstances (some splitters call can_create_pseudo_p()).  So, is this correct
instead?

  ...
  Unlike those in define_expand, however, once reload has completed
  these statements must neither generate any new pseudo-registers nor
  allocate any space in the stack frame.  This can be checked by calling
  can_create_pseudo_p.

[Bug other/70078] gccint: define_split "not" allowed to create pseudos

2016-03-04 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70078

--- Comment #1 from Dominik Vogt  ---
Hijacking this bug report for more unclear documentation in that section;
proposed changes are marked with <...>.

Apart from the bad grammar, the meaning of this sentence is a mystery:

  Splitting of jump instruction into sequence that over by another jump 
  instruction is always valid, as compiler expect identical behavior of
  new jump.

=>

  Splitting of jump instruction into  sequence that 
  another jump instruction is always valid, as  compiler
  expect .

Anybody able to fill in the gaps?

[Bug other/70078] gccint: define_split "not" allowed to create pseudos

2016-03-04 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70078

--- Comment #2 from Dominik Vogt  ---
(I'll make a patch with these and some more corrections once it's clear how the
wording should be.)

[Bug middle-end/70236] New: Register allocation and loop unrolling lead to waste of registers

2016-03-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70236

Bug ID: 70236
   Summary: Register allocation and loop unrolling lead to waste
of registers
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: vmakarov at gcc dot gnu.org
  Target Milestone: ---

Created attachment 37966
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37966&action=edit
ira dump

A new s390x pattern for shift-and-xor does not yield a satisfying result with
this code when compiled with "-O3 -funroll-loops":

-- snip --
unsigned long hash(unsigned long l)
{
  unsigned long v = 0;
  unsigned long i;

  for (i = 0; i < 8; i++)
    {
      v <<= 1;
      v ^= l;
    }
  return v;
}
-- snip --

=>

  lgr %r1,%r2
  lgr %r3,%r2
  rxsbg   %r1,%r2,0,62,1   # (shift r2 by one bit left and xor with r1)
  rxsbg   %r3,%r1,0,62,1
  lgr %r1,%r2
  rxsbg   %r1,%r3,0,62,1
  lgr %r4,%r1  <- unnecessary
  lgr %r1,%r2
  rxsbg   %r1,%r4,0,62,1
  lgr %r5,%r1  <- unnecessary
  lgr %r1,%r2
  rxsbg   %r1,%r5,0,62,1
  lgr %r0,%r1  <- unnecessary
  lgr %r1,%r2
  rxsbg   %r1,%r0,0,62,1
  rxsbg   %r2,%r1,0,62,1
  br  %r14

("%r1,%r2,0,62,1" means "r1 := r1 ^ (r2 << 1)"; the ",0,62,1" part of the
instruction effectively means "shift left by one".)

(gets worse with more loop passes).  The code got unrolled in tree:

  v_16 = l_4(D) << 1;
  v_17 = l_4(D) ^ v_16;
  v_21 = v_17 << 1;
  v_22 = l_4(D) ^ v_21;
  v_26 = v_22 << 1;
  v_27 = l_4(D) ^ v_26;
  v_31 = v_27 << 1;
  v_32 = l_4(D) ^ v_31;
  v_36 = v_32 << 1;
  v_37 = l_4(D) ^ v_36;
  v_41 = v_37 << 1;
  v_42 = l_4(D) ^ v_41;
  v_3 = v_42 << 1;
  v_5 = v_3 ^ l_4(D);
  return v_5;

Register allocation insists on having the value of "l" in r1.  As the result of
the previous pass through the loop is in r1, it's necessary to move that value
out of the way first.  Later on, regrename fails to clean up this situation,
probably because the problem is too complex with many sequential overlapping
register use chains.

This:

  lgr %r1,%r2
  rxsbg   %r1,%r3,0,62,1
  lgr %r4,%r1
  lgr %r1,%r2
  rxsbg   %r1,%r4,0,62,1
  lgr %r5,%r1
  lgr %r1,%r2
  rxsbg   %r1,%r5,0,62,1
  lgr %r0,%r1
  lgr %r1,%r2
  rxsbg   %r1,%r0,0,62,1

could be rewritten to

  lgr %r1,%r2
  rxsbg   %r1,%r3,0,62,1
  lgr %r3,%r2
  rxsbg   %r3,%r1,0,62,1
  lgr %r1,%r2
  rxsbg   %r1,%r3,0,62,1
  lgr %r3,%r2
  rxsbg   %r3,%r1,0,62,1

just using three registers.
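
The two sequences can be compared with a small C model.  This is a sketch
under assumptions: rxsbg is modelled exactly as described above
("ra := ra ^ (rb << 1)"), registers are plain 64-bit values, and all function
names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the loop from hash (), run for PASSES iterations.  */
static uint64_t
hash_ref (uint64_t l, int passes)
{
  uint64_t v = 0;
  for (int i = 0; i < passes; i++)
    {
      v <<= 1;
      v ^= l;
    }
  return v;
}

/* Model of "lgr %ra,%rb": ra := rb.  */
static void
lgr (uint64_t *r, int a, int b)
{
  assert (a >= 0 && a < 16 && b >= 0 && b < 16);
  r[a] = r[b];
}

/* Model of "rxsbg %ra,%rb,0,62,1" as described in this report:
   ra := ra ^ (rb << 1).  */
static void
rxsbg1 (uint64_t *r, int a, int b)
{
  assert (a >= 0 && a < 16 && b >= 0 && b < 16);
  r[a] ^= r[b] << 1;
}

/* Generated sequence: l in r2, previous v in r3, result in r1.  */
static uint64_t
seq_generated (uint64_t l, uint64_t v)
{
  uint64_t r[16] = { 0 };
  r[2] = l;
  r[3] = v;
  lgr (r, 1, 2); rxsbg1 (r, 1, 3);
  lgr (r, 4, 1); lgr (r, 1, 2); rxsbg1 (r, 1, 4);
  lgr (r, 5, 1); lgr (r, 1, 2); rxsbg1 (r, 1, 5);
  lgr (r, 0, 1); lgr (r, 1, 2); rxsbg1 (r, 1, 0);
  return r[1];
}

/* Proposed sequence: same inputs, but the result ends up in r3.  */
static uint64_t
seq_proposed (uint64_t l, uint64_t v)
{
  uint64_t r[16] = { 0 };
  r[2] = l;
  r[3] = v;
  lgr (r, 1, 2); rxsbg1 (r, 1, 3);
  lgr (r, 3, 2); rxsbg1 (r, 3, 1);
  lgr (r, 1, 2); rxsbg1 (r, 1, 3);
  lgr (r, 3, 2); rxsbg1 (r, 3, 1);
  return r[3];
}

/* Returns 1 if both sequences turn the value after three loop passes
   into the value after seven loop passes.  */
static int
sequences_agree (uint64_t l)
{
  uint64_t v3 = hash_ref (l, 3);
  uint64_t v7 = hash_ref (l, 7);
  return seq_generated (l, v3) == v7 && seq_proposed (l, v3) == v7;
}
```

In this model both sequences compute the same value.  The only difference is
that the shorter one leaves it in r3 instead of r1, so the final rxsbg of the
function would have to read r3.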

The question is whether this situation can be improved, either in the register
allocator, or regrename, or in the pattern.

[Bug middle-end/70236] Register allocation and loop unrolling lead to waste of registers

2016-03-15 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70236

--- Comment #1 from Dominik Vogt  ---
Created attachment 37967
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37967&action=edit
rnreg dump

[Bug target/70404] New: pr71074.c fails on s390x

2016-03-24 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404

Bug ID: 70404
   Summary: pr71074.c fails on s390x
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: krebbel at gcc dot gnu.org
  Target Milestone: ---
  Host: s390x
Target: s390x

The new test case from #70174 triggers an ICE on s390x (svn rev 234414):

.../build/gcc/xgcc -B...//gcc/ .../gcc/testsuite/gcc.dg/pr70174.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -S -m64 -o pr70174.s
.../gcc/testsuite/gcc.dg/pr70174.c: In function 'foo':
.../gcc/testsuite/gcc.dg/pr70174.c:10:7: warning: assignment makes integer from
pointer without a cast [-Wint-conversion]
/home/vogt/src/git/gcc/gcc/testsuite/gcc.dg/pr70174.c:11:1: error: unrecognizable insn:
(insn 9 8 10 2 (set (zero_extract:DI (subreg:DI (reg:QI 66) 0)
(const_int 4 [0x4])
(const_int 56 [0x38]))
(symbol_ref:DI ("foo") [flags 0x3] )
.../gcc/testsuite/gcc.dg/pr70174.c:10 -1
 (nil))
.../gcc/testsuite/gcc.dg/pr70174.c:11:1: internal compiler error: in
extract_insn, at recog.c:2287
0x805b40dd _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
.../gcc/rtl-error.c:108
0x805b411d _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
.../gcc/rtl-error.c:116
0x80582a2d extract_insn(rtx_insn*)
.../gcc/recog.c:2287
0x803b6af3 instantiate_virtual_regs_in_insn
.../gcc/function.c:1582
0x803b6af3 instantiate_virtual_regs
.../gcc/function.c:1950
0x803b6af3 execute
.../gcc/function.c:1999

[Bug rtl-optimization/70174] [6 Regression] ICE at -O1 and above on x86_64-linux-gnu in gen_lowpart_general, at rtlhooks.c:63

2016-03-24 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70174

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #12 from Dominik Vogt  ---
The new test case triggers an ICE on s390x.  See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404

[Bug target/70404] pr70174.c fails on s390x

2016-03-30 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404

--- Comment #1 from Dominik Vogt  ---
Configured with --with-arch=zEC12

[Bug target/70404] pr70174.c fails on s390x

2016-03-31 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404

--- Comment #3 from Dominik Vogt  ---
Andreas is already working on the issue, so before anybody spends any more work
on this, you should probably coordinate your efforts.

[Bug middle-end/70561] New: Crash in recog_for_combine_1

2016-04-06 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561

Bug ID: 70561
   Summary: Crash in recog_for_combine_1
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vogt at linux dot vnet.ibm.com
CC: krebbel at gcc dot gnu.org
  Target Milestone: ---
  Host: s390x
Target: s390x

This code in recog_for_combine_1 doesn't look right:

--
  if (num_clobbers_to_add)
    {
      rtx newpat = gen_rtx_PARALLEL (VOIDmode,
                                     rtvec_alloc (GET_CODE (pat) == PARALLEL
                                                  ? (XVECLEN (pat, 0)
                                                     + num_clobbers_to_add)
                                                  : num_clobbers_to_add + 1));

      if (GET_CODE (pat) == PARALLEL)
        for (i = 0; i < XVECLEN (pat, 0); i++)
          XVECEXP (newpat, 0, i) = XVECEXP (pat, 0, i);
      else
        XVECEXP (newpat, 0, 0) = pat;

      add_clobbers (newpat, insn_code_number);

      for (i = XVECLEN (newpat, 0) - num_clobbers_to_add;
           i < XVECLEN (newpat, 0); i++)
        {
          if (REG_P (XEXP (XVECEXP (newpat, 0, i), 0))  <=== crash
              && ! reg_dead_at_p (XEXP (XVECEXP (newpat, 0, i), 0), insn))
            return -1;
          ...
--

For me, there is a crash in the marked line (for some pattern I'm working on)
with "i == 1" because "XVECEXP (newpat, 0, 1)" is "(nil)".  If
"num_clobbers_to_add" is > 0, and the original "pat" is not a parallel, only
the first element of newpat is initialised, but the remaining elements are
still accessed.  There probably should be something like this in the for loop?

  for (...)
    {
      if (XVECEXP (newpat, 0, i) == NULL_RTX)
        /* generate clobber from scratch and store it in
           XVECEXP (newpat, 0, i) */

--
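
A stand-alone C sketch of the situation, with nothing taken from GCC itself:
struct toy_vec, the fill callbacks and first_unfilled_clobber () are made-up
names, and a NULL pointer plays the role of "(nil)".  It only illustrates why
the loop over the clobber slots needs a guard when the filler leaves a slot
empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the PARALLEL vector; NULL plays the role of "(nil)".  */
struct toy_vec
{
  size_t len;
  const void **slot;
};

/* Hypothetical callback type standing in for add_clobbers ().  */
typedef void (*fill_fn) (struct toy_vec *v, size_t first, size_t count);

/* A filler that adds nothing, leaving the clobber slots empty.  */
static void
fill_nothing (struct toy_vec *v, size_t first, size_t count)
{
  (void) v;
  (void) first;
  (void) count;
}

/* A filler that stores a dummy clobber in every requested slot.  */
static void
fill_all (struct toy_vec *v, size_t first, size_t count)
{
  static const char clobber[] = "clobber";
  for (size_t i = first; i < first + count; i++)
    v->slot[i] = clobber;
}

/* Put PAT in slot 0, reserve COUNT clobber slots, let FILL populate them
   and return the index of the first slot that is still NULL, or 0 if all
   clobber slots were filled.  */
static size_t
first_unfilled_clobber (const void *pat, size_t count, fill_fn fill)
{
  struct toy_vec v;
  size_t bad = 0;

  assert (pat != NULL && fill != NULL);
  v.len = count + 1;
  v.slot = malloc (v.len * sizeof *v.slot);
  if (v.slot == NULL)
    abort ();
  for (size_t i = 0; i < v.len; i++)
    v.slot[i] = NULL;
  v.slot[0] = pat;
  fill (&v, 1, count);

  for (size_t i = v.len - count; i < v.len; i++)
    if (v.slot[i] == NULL)      /* guard before looking into the slot */
      {
        bad = i;
        break;
      }
  free (v.slot);
  return bad;
}
```

With fill_nothing and one requested clobber the function reports slot 1,
which matches the "i == 1" case above; with fill_all it reports 0.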

Probably triggered by this splitter:

  [(parallel
[(set (match_operand:GPR 0 "nonimmediate_operand" "")
  (and:GPR (not:GPR (match_operand:GPR 1 "nonimmediate_operand" ""))
   (match_operand:GPR 2 "nonimmediate_operand" "")))
(clobber (reg:CC CC_REGNUM))])]

==>

  [
  (parallel
   [(set (match_dup 3) (and:GPR (match_dup 1) (match_dup 2)))
   (clobber (reg:CC CC_REGNUM))])
  (parallel
   [(set (match_dup 0) (xor:GPR (match_dup 3) (match_dup 2)))
   (clobber (reg:CC CC_REGNUM))])]

[Bug middle-end/70561] Crash in recog_for_combine_1

2016-04-06 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561

--- Comment #1 from Dominik Vogt  ---
P.S.:

(gdb) p debug_rtx(pat)
(set (reg:SI 67 [+4 ])
(and:SI (not:SI (subreg:SI (reg/v:DI 65 [ b+-4 ]) 4))
(mem:SI (plus:DI (reg:DI 2 %r2 [ a ])
(const_int 4 [0x4])) [1 *a_2(D)+4 S4 A32])))
$13 = void
(gdb) p debug_rtx(newpat)
(parallel [
(set (reg:SI 67 [+4 ])
(and:SI (not:SI (subreg:SI (reg/v:DI 65 [ b+-4 ]) 4))
(mem:SI (plus:DI (reg:DI 2 %r2 [ a ])
(const_int 4 [0x4])) [1 *a_2(D)+4 S4 A32])))
(nil)
])
$14 = void

[Bug middle-end/70561] Crash in recog_for_combine_1

2016-04-06 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561

--- Comment #2 from Dominik Vogt  ---
(Ah, probably add_clobbers should have added the clobber, but it hasn't.  It
doesn't have any code for that pattern.)

[Bug middle-end/70561] Crash in recog_for_combine_1

2016-04-06 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561

Dominik Vogt  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Dominik Vogt  ---
Solved with Uli's help by removing the "parallel" from the
define_insn_and_split.

[Bug target/69148] [5 Regression] ICE (floating point exception) on s390x-linux-gnu

2016-04-18 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69148

Dominik Vogt  changed:

   What|Removed |Added

 CC||vogt at linux dot vnet.ibm.com

--- Comment #7 from Dominik Vogt  ---
(Need to backport this to 5.3 for Ubuntu.)

[Bug go/70787] New: No time and child info with -pg and gccgo

2016-04-25 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70787

Bug ID: 70787
   Summary: No time and child info with -pg and gccgo
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: go
  Assignee: ian at airs dot com
  Reporter: vogt at linux dot vnet.ibm.com
CC: cmang at google dot com, krebbel at gcc dot gnu.org
  Target Milestone: ---

It looks like the -pg option does something wrong for Go programs.  Example:

This program just wastes time in sub functions:
-- main.go --
package main
func foo () {
        var i int
        i = 0
        for (i < 1000) { i++ }
}
func bar () {
        var i int
        i = 0
        for (i < 1000) { i++ }
}
func main () {
        var i int
        i = 0
        for (i < 100) { foo(); foo(); bar(); i++ }
}
-- snip --

  $ gccgo -pg -O0 main.go
  $ ./a.out
  $ gprof ./a.out gmon.out

=>

  index % time    self  children    called     name
                  0.00    0.00     300/300         main.main [8]
  [1]      0.0    0.00    0.00     300         frame_dummy [1]
   ^^^

(actual run time was about 5 seconds)

Even for this very simple program without Go library dependencies, no timing
information seems to be dumped into the gmon.out file.  Function calls have all
been counted in the "frame_dummy" bucket (double checked that functions have not
been inlined).

My vague first guess is that maybe the timing information is written to some
place in memory but is read from a different place when generating gmon.out
because the profiling code is not aware of Gccgo's threading model(?).

[Bug go/70787] No time and child info with -pg and gccgo

2016-04-25 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70787

--- Comment #1 from Dominik Vogt  ---
(I've also tried setting GMON_OUT_PREFIX so that the gmon.out file does not get
overwritten by different threads, but in either case only one dump file is
created.)

[Bug go/70787] No time and child info with -pg and gccgo

2016-04-26 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70787

--- Comment #2 from Dominik Vogt  ---
The Go runtime seems to register a handler for SIGPROF even if it does not want
to profile.  So it always uninstalls the handler installed by Glibc on behalf
of the -pg option.  To me it looks like -pg actually enables the profiling from
libgo instead.  Some ways to circumvent this:

1) Don't install a SIGPROF handler in the Go runtime if another is already
installed (possibly emit a warning or a fatal error if the program attempts to
enable the Go profiling).
=> Simple to implement.

2) Install the SIGPROF handler on the fly when it's needed instead of
unconditionally at Go runtime startup.  Possibly emit a warning if an existing
signal handler is uninstalled in the process.
=> Cleanest solution.

3) Store the previous signal handler and call it at the start of the Go runtime
signal handler.  However, this introduces several problems (the Go
runtime won't notice if the original profiling code wants to uninstall the
handler or install a new one or it might overwrite the Go runtime handler;
also, the two profiling systems will probably not agree on a common timing
interval).
=> May allow running Glibc and libgo profiling in parallel but probably has some
unfixable issues.

[Bug debug/68860] [6/7 regression] FAIL: gcc.dg/guality/pr36728-1.c -flto -O3 -g line 16/7 arg1 == 1

2016-04-28 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68860

--- Comment #12 from Dominik Vogt  ---
We've just been looking at this today for s390x which fails these tests for
various reasons too (actually we've located at least four different Gcc bugs by
looking at this test case).  Some of the calculations in
allocate_dynamic_stack_space are weird, but that isn't the issue at hand (I'm
currently working on that).

We were planning to create a new bug report for this, but if it's already being
discussed ... S390x fails the checks "y == 2" probably because the
cprop_hardreg pass does something wrong with the var_location information. 
We've only debugged this for "x" yet, but it's probably the same cause for "y".
 After reload we have (s390x, -O3 -m64):

-- snip --
(insn 27 26 99 2 (parallel [
            (set (reg/f:DI 15 %r15)
                (minus:DI (reg/f:DI 15 %r15)
                    (reg:DI 2 %r2 [73])))
            (clobber (reg:CC 33 %cc))
        ]) pr36728-1.c:12 1409 {*subdi3}
     (expr_list:REG_DEAD (reg:DI 2 %r2 [73])
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (nil))))
(insn 99 27 57 2 (set (reg/f:DI 1 %r1 [65])
        (plus:DI (reg/f:DI 11 %r11)
            (const_int 191 [0xbf]))) pr36728-1.c:10 1075 {*la_64}
     (nil))
(insn 57 99 33 2 (set (reg/f:DI 3 %r3 [77])
        (reg/f:DI 15 %r15)) pr36728-1.c:12 1073 {*movdi_64}
     (nil))
(debug_insn 33 57 6 2 (var_location:DI x (plus:DI (reg/f:DI 3 %r3 [77])
        (const_int 160 [0xa0]))) pr36728-1.c:12 -1
     (nil))
-- snip --

Insn 27 adjusts the stack pointer, insn 57 copies it to r3 and insn 33 says
that "x" is at "r3 + 160".  The following constant propagation pass
(cprop_hardreg) results in

-- snip --
(insn 27 26 99 2 (parallel [
            (set (reg/f:DI 15 %r15)
                (minus:DI (reg/f:DI 15 %r15)
                    (reg:DI 2 %r2 [73])))
            (clobber (reg:CC 33 %cc))
        ]) pr36728-1.c:12 1409 {*subdi3}
     (expr_list:REG_DEAD (reg:DI 2 %r2 [73])
        (expr_list:REG_UNUSED (reg:CC 33 %cc)
            (nil))))
(insn 99 27 57 2 (set (reg/f:DI 1 %r1 [65])
        (plus:DI (reg/f:DI 11 %r11)
            (const_int 191 [0xbf]))) pr36728-1.c:10 1075 {*la_64}
     (nil))
(insn 57 99 33 2 (set (reg/f:DI 3 %r3 [77])
        (reg/f:DI 15 %r15)) pr36728-1.c:12 1073 {*movdi_64}
     (nil))
(debug_insn 33 57 6 2 (var_location:DI x (plus:DI (reg/f:DI 15 %r15 [77])
        (const_int 160 [0xa0]))) pr36728-1.c:12 -1
     (nil))
-- snip --

It has propagated the value of r15 into insn 33, so the var_location is now
separated from the place where it actually becomes valid (after insn 27), and
further passes result in a bogus DWARF location list for "x".

(This is assembly output with a patch I'm working on; y does not use alloca for
alignment; I think this is independent of the bug.)
-- snip --
.LVL0:
        stmg    %r11,%r15,88(%r15)
        aghi    %r15,-200
        lgr     %r11,%r15
        .loc 1 12 0
        aghi    %r2,14
.LVL1:
        nill    %r2,65528
        sgr     %r15,%r2      <== set final value of stack pointer
        .loc 1 15 0           <== location list for "x" should start here
        lhi     %r2,2
        .loc 1 10 0
        la      %r1,191(%r11)
.LVL2:                        <== where location list for "x" actually starts
        nill    %r1,65504     <==
        .loc 1 16 0           <== location list for "y" should start here
        larl    %r4,b
        .loc 1 17 0
        mvi     160(%r15),25
        .loc 1 12 0
        la      %r3,160(%r15)
.LVL3:
        .loc 1 18 0
        larl    %r5,a
        .loc 1 15 0
        st      %r2,0(%r1)
        .loc 1 16 0
        ...
-- snip --

Without checking the details for "y" yet we've noticed that there is no
location list for y in the DWARF info, so gdb happily prints random data from
the stack slot with "p y" when stopping at the first ".loc 1 16 0".

[Bug debug/68860] [6/7 regression] FAIL: gcc.dg/guality/pr36728-1.c -flto -O3 -g line 16/7 arg1 == 1

2016-04-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68860

--- Comment #13 from Dominik Vogt  ---
By the way, I think the value of y should be tested *after* the asm statement
in line 17 not before it in line 16.  At higher optimization levels the
assignment may not have happened yet when gdb reaches line 16.  (And x should
be tested in line 19 for the same reason).

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-22 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #4 from Dominik Vogt  ---
Could you provide assembly dumps of the function foo() in the testcase, both
with and without the "culprit" patch?

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-22 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #7 from Dominik Vogt  ---
The dumps show some differences I'd expect, but debugging libgomp testcases is
awkward because they are so complicated.  In the pre-patched era, Gcc's dynamic
allocation on the stack was a bit too large most of the time (roughly by one
allocated element, but not always).  This served as some kind of "safety"
padding where programs with off-by-one bugs would write the "excess" data.

In reduction-10.c there are just two dynamic allocations (for a and b in foo)
that seem to be good.  However, there are more differences in the assembler
dumps, probably generated by libgomp:

--- reduction-10.s.242589   2016-11-22 15:20:27.421251695 +0100
+++ reduction-10.s.242590   2016-11-22 15:20:35.842210558 +0100
@@ -8,7 +8,7 @@
ld  [%i0+16], %i2
add %i2, 1, %l5
sll %l5, 2, %g1
-   add %g1, 10, %g1
+   add %g1, 7, %g1
and %g1, -8, %g1
mov 0, %g2
sub %sp, %g1, %sp
@@ -42,7 +42,7 @@
stb %g0, [%i2+%g1]
add %i3, 1, %l6
sll %l6, 2, %g1
-   add %g1, 10, %g1
+   add %g1, 7, %g1
and %g1, -8, %g1
mov 0, %g2
sub %sp, %g1, %sp
@@ -57,7 +57,6 @@
add %g1, 4, %g1
add %i4, 1, %l7
sll %l7, 3, %g1
-   add %g1, 8, %g1   <--- somewhat suspicious
mov 0, %g2
sub %sp, %g1, %sp
add %sp, 96, %i3
@@ -70,7 +69,7 @@
add %g1, 8, %g1
add %i5, 1, %o5
sll %o5, 2, %g1
-   add %g1, 10, %g1
+   add %g1, 7, %g1
and %g1, -8, %g1
mov 0, %g2
sub %sp, %g1, %sp
@@ -87,7 +86,7 @@
mov 0, %g1
add %l4, %l4, %g2
mov -6, %g4
-   add %g2, 8, %g2
+   add %g2, 7, %g2
and %g2, -8, %g2
sub %sp, %g2, %sp
add %sp, 92, %i5
@@ -427,12 +426,11 @@
add %g4, 4, %o7
add %g4, %g4, %o4
sll %o7, 3, %o3
-   add %o4, 8, %g1
-   add %o3, 8, %g2   <--- somewhat suspicious
+   add %o4, 7, %g1
+   sub %sp, %o3, %sp
and %g1, -8, %g1
-   sub %sp, %g2, %sp
...

Note that some allocation sizes were reduced from x+10 or x+8 to x+7.  This is
what the patch is about.  The two "add ... 8 ..." that have vanished may or may
not have something to do with the problem.  Possible causes of the symptom are:

1) The patch does not handle some corner case correctly.
2) There is an off-by-one bug in foo() that I've missed.
3) Off-by-one in libgomp.
4) 32 bit stack layout on SPARC is slightly broken.  (32 bit AIX had such a
problem caused by bad alignment of the dynamic stack variables.)

To pin it down, it would help to have some simpler failing testcase than the
ones from libgomp, and if possible reduced to the minimum.  Is this limited to
libgomp or are there other testcases that started failing?  Also, access to
such a SPARC system would help.

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-22 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #8 from Dominik Vogt  ---
Some things to try with reduction-10.c:

1) Remove all OMP pragmas from the code.  If it still fails it's not a libgomp
bug.
2) Replace "p7" in foo with just "7".  If it still fails we know the bug is not
triggered by the dynamic allocation of a or b.

[Bug target/77822] [6 Regression] arm64 Error: immediate value out of range 0 to 63 at operand 3

2016-11-23 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77822

--- Comment #31 from Dominik Vogt  ---
No more backports, but the S390 fix for trunk is still in the queue.  After it
gets committed the bug can be resolved.

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-24 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #11 from Dominik Vogt  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #9)
> > 2) Replace "p7" in foo with just "7".  If it still fails we know the bug is 
> > not
> > triggered by the dynamic allocation of a or b.
> 
> ... but stays this way.

Good, the assembly diff has shrunk a lot:

--
@@ -8,7 +8,7 @@
ld  [%i0+4], %g4
add %g4, 1, %i3
sll %i3, 2, %g1
-   add %g1, 10, %g1
+   add %g1, 7, %g1 <--- add (8 - 1) bytes
and %g1, -8, %g1<--- round down to multiple of 8
mov 0, %g2
sub %sp, %g1, %sp
@@ -25,7 +25,6 @@
add %g1, 4, %g1
add %g3, 1, %i2
sll %i2, 3, %g1
-   add %g1, 8, %g1 < -- what was this good for?
mov 0, %g2
sub %sp, %g1, %sp
add %sp, 96, %i5
--

The marked instructions in the first chunk do look like the calculations of the
dynamic stack area's address.  The reduced source code does not have dynamic
stack allocation, so that must come from libgomp.  The next step is to figure
out how libgomp generates instructions.  Can you provide tree dumps for both
Gccs?

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-24 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #14 from Dominik Vogt  ---
Is the dynamic variable stack area properly aligned?  Since sparc.h does not
define STACK_DYNAMIC_OFFSET it should be aligned to STACK_BOUNDARY, i.e. 64
bits.

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-25 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #16 from Dominik Vogt  ---
In emit-rtl.c:init_emit(), the alignment of the virtual_stack_dynamic pointer
is hard coded to STACK_BOUNDARY:

  REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) = STACK_BOUNDARY; 

The backend must make sure that this promise is kept.  If that's what's
happening, the Sparc backend needs a fix similar to this Aix patch:

https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01036.html
(r242589)

The idea (on AIX) is to round up the allocation size of the parameters area if
the function does dynamic allocation (calls_alloca is true).  This logic had to
be replicated in some macros in aix.h.  A solution for sparc probably looks
similar.

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-25 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #18 from Dominik Vogt  ---
Another approach may be to make the middleend ask the backend for the actual
value of REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM).  Since on Sparc
the address is always 4 mod 8, we'd get an additional gap for *each* alloca()
if the size is still required to be a multiple of STACK_BOUNDARY.

To prevent this it would also be necessary to adapt the logic in
explow.c:get_dynamic_stack_size().  Since a recent patch this function also
uses REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) as the alignment of the
beginning of that block, but still rounds the size up to a multiple of
STACK_BOUNDARY (explow.c:round_push()):

-- get_dynamic_stack_size() --
  /* Round the size to a multiple of the required stack alignment.
     Since the stack is presumed to be rounded before this allocation,
     this will maintain the required alignment.

     If the stack grows downward, we could save an insn by subtracting
     SIZE from the stack pointer and then aligning the stack pointer.
     The problem with this is that the stack pointer may be unaligned
     between the execution of the subtraction and alignment insns and
     some machines do not allow this.  Even on those that do, some
     signal handlers malfunction if a signal should occur between those
     insns.  Since this is an extremely rare event, we have no reliable
     way of knowing which systems have this problem.  So we avoid even
     momentarily mis-aligning the stack.  */
  if (size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0)
    {
      size = round_push (size);
-- END --

-- round_push() --
/* Round the size of a block to be pushed up to the boundary required 
   by this machine.  SIZE is the desired size, which need not be constant.  */ 

static rtx
round_push (rtx size)
{
  rtx align_rtx, alignm1_rtx;

  if (!SUPPORTS_STACK_ALIGNMENT
      || crtl->preferred_stack_boundary == MAX_SUPPORTED_STACK_ALIGNMENT)
    {
      int align = crtl->preferred_stack_boundary / BITS_PER_UNIT;

      if (align == 1)
        return size;

      if (CONST_INT_P (size))
...

      align_rtx = GEN_INT (align);
      alignm1_rtx = GEN_INT (align - 1);

-- END --

It looks quite tricky to change this code to deal with preferred_stack_boundary
and REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) at the same time.  What
if REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) is smaller than
STACK_BOUNDARY and preferred_stack_boundary is larger than STACK_BOUNDARY?

In the end, both approaches result in the same amount of memory being
allocated.

[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL

2016-11-29 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468

--- Comment #20 from Dominik Vogt  ---
(In reply to Eric Botcazou from comment #19)
> I think that the patch is simply incorrect and should be reverted, it very
> likely breaks other ports than PowerPC and SPARC and the failure more is
> quite nasty.

It does not break anything that wasn't broken before.  The Sparc backend was
just _lucky_ that the allocation code in the middleend was _broken_.  Otherwise
Gcc for Sparc (and Aix) would have generated code that makes dynamic
allocations with alloca() overlap.

(Actually this patch already helped to identify dynamic array bounds violations
in some Gcc library and Glibc that were real bugs that were hidden by Gcc's
over-allocation but possibly not by other compilers.  The unpatched Gcc
promotes array bounds violations in user code by providing some surprising
extra space that covers programming bugs most of the time.)

> IMO it's fundamentally backwards: instead of making it so that the alignment
> of VIRTUAL_STACK_DYNAMIC_REGNUM is honored by every dynamic allocation, it
> assumes that it is already honored to optimize the dynamic allocation.

The patch fixes the bug that causes dynamic stack allocation to overestimate
the needed space on the stack most of the time.  To do this, it uses
information available from elsewhere in the middleend.

It turns out that the backend (or middleend, depends on the point of view) lies
about the alignment of VIRTUAL_STACK_DYNAMIC_REGNUM.  There may be _other_
users of that value that fail to do their job because they think the
stored alignment is correct.  Such users may do worse things than wasting some
stack space - we may just have not noticed them yet.

So, there is _another_ bug in the backends (or the middleend) that needs to be
fixed.  It's not "one fix instead of another" - there are two bugs that need
two separate fixes.

--

You say this should rather be fixed in the middleend, but actually it (i.e.
both bugs) _cannot_ be fixed in the middleend without correct alignment
information from the backend:

Consider this program:

-- snip --
__attribute__ ((noinline))
int *foo(int a1, int a2, int a3, int a4, int a5, int a6,
 int *pl, int *px, int *d, int *e)
{
  return d + a1 + a2 + a3 + a4 + a5 + a6;
}

int main(int argc, char **argv)
{
  int l;
  int x[argc];
  int *p;
  __attribute__ ((aligned(4))) int d[argc];
  __attribute__ ((aligned(8))) int e[argc];

  p = foo(argc + 1, argc + 2, argc + 3, argc + 4, argc + 5, argc + 6,
  &l, x, d, e);

  return (int)p;
}
-- snip --

Compiling it on Sparc (without the discussed patch) with "gcc -O3 -m32 -S
test.c" produces this assembly output:

-- snip --
main:
        save    %sp, -120, %sp
        sll     %i0, 2, %g1     ; i0 = 2 -> g1 = 8
        add     %g1, 10, %g2    ; g2 = 18
        add     %g1, 14, %g1    ; g1 = 22
        and     %g2, -8, %g2    ; g2 = 16
        and     %g1, -8, %g1    ; g1 = 16
        sub     %sp, %g2, %sp
        add     %sp, 108, %g3   ; g3 = fp - 28 (x)
        sub     %sp, %g2, %sp
        add     %sp, 108, %g2   ; g2 = fp - 44 (d)
        sub     %sp, %g1, %sp
        add     %sp, 112, %g1   ; g1 = fp - 56 (e)
        st      %g1, [%sp+104]
        add     %fp, -4, %g1    ; g1 = fp - 4 (&l)
        ... (set %o registers)
        st      %g2, [%sp+100]
        st      %g3, [%sp+96]
        call    foo, 0
         st     %g1, [%sp+92]
-- snip --

So, the unpatched stack layout is:

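 fp       +--------------------+ sp0
          | l                  |
 fp - 4   +--------------------+
          |////// wasted //////| <--- where does this come from?
          |////// space  //////|
 fp - 12  +--------------------+ <--- start of dynamic allocation area
          |////// wasted //////|
          |////// space  //////|
 fp - 20  +--------------------+
          |        x[1]        |
          |        x[0]        |
 fp - 28  +--------------------+ <-------
          |////// wasted //////|         \
          |////// space  //////|          |
 fp - 36  +--------------------+          |
          |        d[1]        |          |
          |        d[0]        |          |
 fp - 44  +--------------------+ <----------
          |##### padding ######|          | \
 fp - 48  +--------------------+          |  |
          |        e[1]        |          |  |
          |        e[0]        |          |  |
 fp - 56  +--------------------+ <-------------
          |/// wasted space ///|          |  | \
 fp - 60  +--------------------+          |  |  |
          | outarg 10 (e)      |          |  |  |
          | outarg 9 (d)       |          |  |  |
          | outarg 8 (x)       |          |  |  |
          | outarg 7 (&l)      |          |  |  |
 fp - 76  +--------------------+          |  |  |
                   ...                    |  |  |
 fp - 120 +----- dynamic ------+ sp1      |  |  |
          |     allocation     |          |  |  |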
  | |  | /   |  

[Bug target/78633] [7 Regression] [SH] libgcc/fp-bit.c:944:1: error: invalid rtl sharing found in the insn

2016-12-05 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78633

Dominik Vogt  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vogt at linux dot vnet.ibm.com

--- Comment #8 from Dominik Vogt  ---
There's a typo in the patch.  Should be reverted in a minute.  Sorry for the
trouble.
