[Bug tree-optimization/69336] Constant value not detected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69336 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #10 from Dominik Vogt --- The new test fails on s390x; what should I do about it? (see https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01359.html )
[Bug go/69511] New: G.gcstack_size uses uintptr instead of size_t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69511 Bug ID: 69511 Summary: G.gcstack_size uses uintptr instead of size_t Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: go Assignee: ian at airs dot com Reporter: vogt at linux dot vnet.ibm.com CC: cmang at google dot com, krebbel at gcc dot gnu.org Target Milestone: --- Target: s390 s390x Created attachment 37488 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37488&action=edit Proposed fix. The field gcstack_size in the G structure in libgo/runtime/runtime.h has "uintptr" as its type, but &G.gcstack_size is passed to a function expecting "size_t *". On S/390 this results in a warning and hence a bootstrap failure with the split stack patches we're working on:
  error: passing argument 3 of ‘__splitstack_find’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  g->gcstack = __splitstack_find(nil, nil, &g->gcstack_size,
I believe it's safe to change the type to size_t, which it should have been in the first place. But theoretically it's possible that size_t and uintptr have different bit sizes. What do you think about the attached patch?
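The mismatch described above can be reproduced in miniature. The following is only a sketch in C: `find_segment`, `struct g_sketch` and `fill_g` are hypothetical stand-ins, not the libgo or libgcc declarations, and the `static_assert` encodes the assumption that both types have the same size on the build target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption made explicit: the build stops on any target where the
   two types differ in size. */
static_assert(sizeof(size_t) == sizeof(uintptr_t),
              "size_t and uintptr_t differ in size");

/* Hypothetical callee that reports a length through a size_t pointer,
   in the same way the function named in the report takes "size_t *". */
static void *find_segment(size_t *len)
{
    static char segment[64];
    *len = sizeof segment;
    return segment;
}

/* Hypothetical reduction of the G structure to the two relevant fields. */
struct g_sketch {
    void *gcstack;
    size_t gcstack_size;   /* size_t, so &gcstack_size is a size_t * */
};

/* Passing &g->gcstack_size needs no cast and triggers no
   incompatible-pointer-types diagnostic. */
static size_t fill_g(struct g_sketch *g)
{
    g->gcstack = find_segment(&g->gcstack_size);
    return g->gcstack_size;
}
```

If the field were declared with a distinct unsigned type of the same width, the call would still work at run time on most targets, but the pointer types would not match and -Werror would turn the warning into a build failure.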
[Bug tree-optimization/69336] Constant value not detected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69336 --- Comment #12 from Dominik Vogt --- The test works now on s390x. Thanks.
[Bug c++/69462] FLT_EVAL_METHOD and DECIMAL_DIG missing in float.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69462 --- Comment #3 from Dominik Vogt --- Is this change fit to be posted on gcc-patches? (I have a patch for that anyway and can post it for you if you like.)
[Bug c++/69462] FLT_EVAL_METHOD and DECIMAL_DIG missing in float.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69462 Dominik Vogt changed: What|Removed |Added Summary|stack overflow detected |FLT_EVAL_METHOD and ||DECIMAL_DIG missing in ||float.h --- Comment #5 from Dominik Vogt --- (Sorry, accidentally typed a search string into the wrong field.)
[Bug c++/69528] New: s/s390: ext/special_functions/hyperg lots of failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69528 Bug ID: 69528 Summary: s/s390: ext/special_functions/hyperg lots of failures Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: krebbel at gcc dot gnu.org Target Milestone: --- Target: s390x The hyperg function fails to stay inside the tolerance allowed by the new test. Either the function's precision is not good enough on s390x, or the allowed tolerance is too small (or both). (: )
test(data167, toler167) 2: 4.82864e-13 2.5e-13 3: 2.72942e-13 2.5e-13
test(data171, toler171) 1: 5.15741e-12 2.5e-13 2: 5.87911e-13 2.5e-13 3: 2.05545e-12 2.5e-13 4: 2.78641e-13 2.5e-13 5: 2.78806e-13 2.5e-13
test(data172, toler172) 0: 3.10473e-11 2.5e-13 1: 1.28729e-11 2.5e-13 2: 5.93412e-12 2.5e-13 3: 1.25024e-12 2.5e-13
test(data173, toler173) 0: 1.09304e-12 2.5e-13 1: 8.62418e-13 2.5e-13
test(data197, toler197) 2: 4.82864e-13 2.5e-13 3: 2.72942e-13 2.5e-13
test(data201, toler201) 1: 1.86001e-12 2.5e-13 5: 1.79261e-12 2.5e-13
test(data202, toler202) 0: 2.33009e-12 2.5e-13 1: 3.21576e-12 2.5e-13 3: 5.41507e-13 2.5e-13 4: 4.36366e-13 2.5e-13 5: 4.40273e-13 2.5e-13
test(data203, toler203) 0: 2.15453e-12 2.5e-13 1: 1.90262e-12 2.5e-13 2: 7.14356e-13 2.5e-13 3: 2.58658e-12 2.5e-13
test(data204, toler204) 0: 6.15743e-13 2.5e-13
test(data206, toler206) 0: 1.87073e-10 2.5e-13 1: 6.94984e-12 2.5e-13 2: 9.47298e-12 2.5e-13 3: 3.09248e-12 2.5e-13 4: 5.35958e-13 2.5e-13 6: 5.9891e-13 2.5e-13
test(data207, toler207) 0: 4.38856e-10 2.5e-13 1: 7.63877e-11 2.5e-13 2: 7.72796e-10 2.5e-13 3: 1.09366e-12 2.5e-13 4: 6.68933e-13 2.5e-13 5: 3.71824e-12 2.5e-13 6: 9.15105e-13 2.5e-13
test(data208, toler208) 0: 5.19491e-09 2.5e-13 1: 2.6238e-09 2.5e-13 2: 6.29129e-10 2.5e-13 3: 7.664e-11 2.5e-13 4: 2.08562e-12 2.5e-13 5: 1.79497e-11 2.5e-13 6: 9.40163e-13 2.5e-13 7: 7.14083e-13 2.5e-13
test(data209, toler209) 0: 2.15517e-10 2.5e-13 1: 1.60923e-10 2.5e-13 2: 2.69645e-12 2.5e-13 3: 2.35945e-11 2.5e-13 4: 1.00825e-12 2.5e-13 5: 1.54649e-12 2.5e-13
test(data231, toler231) 1: 5.15741e-12 2.5e-13 2: 5.87911e-13 2.5e-13 3: 2.05545e-12 2.5e-13 4: 2.78641e-13 2.5e-13 5: 2.78806e-13 2.5e-13
test(data232, toler232) 0: 3.10473e-11 2.5e-13 1: 1.28729e-11 2.5e-13 2: 5.93412e-12 2.5e-13
[Bug c++/69528] s/s390: ext/special_functions/hyperg lots of failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69528 --- Comment #1 from Dominik Vogt ---
3: 1.25024e-12 2.5e-13
test(data233, toler233) 0: 1.09304e-12 2.5e-13 1: 8.62418e-13 2.5e-13
test(data236, toler236) 0: 1.87073e-10 2.5e-13 1: 6.94984e-12 2.5e-13 2: 9.47298e-12 2.5e-13 3: 3.09248e-12 2.5e-13 4: 5.35958e-13 2.5e-13 6: 5.9891e-13 2.5e-13
test(data237, toler237) 0: 4.38856e-10 2.5e-13 1: 7.63877e-11 2.5e-13 2: 7.72796e-10 2.5e-13 3: 1.09366e-12 2.5e-13 4: 6.68933e-13 2.5e-13 5: 3.71824e-12 2.5e-13 6: 9.15105e-13 2.5e-13
test(data238, toler238) 0: 5.19491e-09 2.5e-13 1: 2.6238e-09 2.5e-13 2: 6.29129e-10 2.5e-13 3: 7.664e-11 2.5e-13 4: 2.08562e-12 2.5e-13 5: 1.79497e-11 2.5e-13 6: 9.40163e-13 2.5e-13 7: 7.14083e-13 2.5e-13
test(data239, toler239) 0: 2.15517e-10 2.5e-13 1: 1.60923e-10 2.5e-13 2: 2.69645e-12 2.5e-13 3: 2.35945e-11 2.5e-13 4: 1.00825e-12 2.5e-13 5: 1.54649e-12 2.5e-13
test(data241, toler241) 0: 1.68813e-09 2.5e-13 1: 4.9753e-10 2.5e-13 2: 5.28903e-10 2.5e-13 3: 2.29304e-11 2.5e-13 4: 1.49182e-11 2.5e-13 5: 9.41266e-12 2.5e-13 6: 1.00424e-12 2.5e-13 7: 2.98427e-13 2.5e-13
test(data242, toler242) 0: 2.04779e-08 2.5e-13 1: 2.64594e-08 2.5e-13 2: 5.22149e-10 2.5e-13 3: 1.3217e-09 2.5e-13 4: 1.35025e-10 2.5e-13 5: 4.39245e-11 2.5e-13 6: 7.65459e-12 2.5e-13 7: 3.73768e-13 2.5e-13
test(data243, toler243) 0: 3.02697e-07 2.5e-13 1: 2.69153e-07 2.5e-13 2: 3.29237e-08 2.5e-13 3: 7.43965e-09 2.5e-13 4: 1.01678e-10 2.5e-13 5: 2.14138e-09 2.5e-13 6: 2.7467e-11 2.5e-13 7: 5.62027e-12 2.5e-13
test(data244, toler244) 0: 4.34529e-07 2.5e-13 1: 3.35718e-07 2.5e-13 2: 5.23978e-08 2.5e-13 3: 1.60894e-08 2.5e-13 4: 4.29353e-12 2.5e-13 5: 6.54579e-12 2.5e-13 6: 6.73518e-11 2.5e-13 7: 2.6793e-12 2.5e-13
test(data245, toler245) 0: 1.6676e-07 2.5e-13 1: 2.02176e-08 2.5e-13 2: 2.36511e-07 2.5e-13 3: 1.73515e-08 2.5e-13 4: 6.49563e-10 2.5e-13 5: 6.27143e-11 2.5e-13 6: 7.79656e-12 2.5e-13 7: 7.01158e-13 2.5e-13
[Bug c++/69529] New: s/390: special_functions/02_assoc_legendre failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69529 Bug ID: 69529 Summary: s/390: special_functions/02_assoc_legendre failure Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: krebbel at gcc dot gnu.org Target Milestone: --- Target: s390x The assoc_legendre function exceeds the allowed tolerance on s390x for data033[19]: { 2.5643395957697341e+17, 100, 10, 0.89991 }, The actual deviation is 2.75283e-13 while the allowed tolerance is 2.5e-13. What should I do with this?
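For reference, a tolerance check of this kind can be sketched as follows. This is a minimal illustration in C, assuming the deviation is a maximum relative error compared against a fixed bound; the helper names are made up for this note and are not taken from the libstdc++ testsuite.

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest relative deviation |actual - expected| / |expected| over n samples.
   Samples whose expected value is zero are skipped in this sketch. */
static double max_rel_deviation(const double *expected, const double *actual,
                                size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (expected[i] == 0.0)
            continue;
        double dev = (actual[i] - expected[i]) / expected[i];
        if (dev < 0.0)
            dev = -dev;
        if (dev > worst)
            worst = dev;
    }
    return worst;
}

/* A sample set passes only if its worst deviation stays below the bound. */
static bool within_tolerance(double deviation, double tolerance)
{
    return deviation < tolerance;
}
```

With the figures quoted in the report, `within_tolerance(2.75283e-13, 2.5e-13)` is false, which matches the reported failure: the deviation exceeds the bound by roughly ten percent.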
[Bug libstdc++/69528] s/390: ext/special_functions/hyperg lots of failures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69528 Dominik Vogt changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Dominik Vogt --- That was r232867. With r232917 the test failures are gone. Thanks.
[Bug c++/69529] s/390: special_functions/02_assoc_legendre failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69529 Dominik Vogt changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Dominik Vogt --- With r232917 the test failure is gone. Thanks.
[Bug libgomp/69555] New: libgomp.c++/target-6.C fails because of undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555 Bug ID: 69555 Summary: libgomp.c++/target-6.C fails because of undefined behaviour Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: jakub at gcc dot gnu.org, krebbel at gcc dot gnu.org Target Milestone: --- Target: s390x The test case libgomp.c++/target-6.C fails on s390x, and I think that's because it uses a reference type variable in a "private" construct:
-- snip --
...
int a[y - 2], b[y - 2];
int (&c)[y - 2] = a, (&d)[y - 2] = b;
    ^^^
...
#pragma omp target private (x, u, s, c, i) firstprivate (y, v, t, d) map(from:err)
                                    ^^^
{
  ...
  for (i = 0; i < y - 2; i++)
    c[i] = d[i];
  ...
}
...
-- snip --
Depending on optimisations and the rest of the code, this leads to either incorrect values in the array "a" or accessing a pointer to random memory. As far as I understand it, the "OpenMP Application Program Interface, Version 4.0 - July 2013" explicitly forbids this on page 161:
  • A variable that appears in a private clause must not have an incomplete type or a reference type.
[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555 --- Comment #2 from Dominik Vogt --- Does it work on other platforms?
[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555 --- Comment #4 from Dominik Vogt --- Sure. Can I provide any debug information or another kind of help?
[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555 --- Comment #5 from Dominik Vogt --- Hm, actually the chapter about "private" says nothing about how to actually *handle* a reference type whereas it states that for "firstprivate" and "lastprivate" the reference must bind to the same object for all threads. To me it still looks as if using references in "private" is undefined.
[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555 --- Comment #6 from Dominik Vogt --- Example:
-- snip --
#include <stdio.h>

int main ()
{
  int a;
  int &c = a;

  printf("a %p\n", &a);
  printf("g %p\n", &c);
#pragma omp target private (c)
  {
    printf("t %p\n", &c);
  }
  return 0;
}
-- snip --
prints
  a 0x3a0edb4
  g 0x3a0edb4
  t 0x3a0ea24 <--- c inside the target region points to different memory
[Bug c++/69089] C++11: alignas(0) causes an error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69089 --- Comment #5 from Dominik Vogt --- No, up to now you're the only one who commented on it. I keep pinging it once in a while.
[Bug libgomp/69625] New: deadlock in libgomp.c/doacross-1.c test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69625 Bug ID: 69625 Summary: deadlock in libgomp.c/doacross-1.c test Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgomp Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: jakub at gcc dot gnu.org Target Milestone: --- Target: s390x Created attachment 37554 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37554&action=edit .s file of test program On s390x with -march=z196 -O2/-O3 the test hangs with a deadlock (and also doacross-[2.3].c and doacross-1.C, but I haven't looked at them yet). I've stripped down the test to this:
-- snip --
#include <stdio.h>

#define N 64
int b[N / 16][8][4];

int main ()
{
  int i, j, k, l;
  (void)l;
#pragma omp parallel
  {
    printf("+++\n");
#pragma omp for schedule(static, 0) ordered (3) nowait
    for (i = 2; i < N / 16 - 1; i++)
      for (j = 0; j < 8; j += 2)
        for (k = 1; k <= 3; k++)
          {
#pragma omp atomic write
            b[i][j][k] = 11;
#pragma omp ordered depend(sink: i, j - 2, k - 1) \
                    depend(sink: i - 2, j - 2, k + 1)
#pragma omp ordered depend(sink: i - 3, j + 2, k - 2)
            if (j >= 2 && k > 1)
              {
#pragma omp atomic read
                l = b[i][j - 2][k - 1];
              }
#pragma omp atomic write
            b[i][j][k] = 22;
            if (i >= 4 && j >= 2 && k < 3)
              {
#pragma omp atomic read
                l = b[i - 2][j - 2][k + 1];
              }
#pragma omp ordered depend(source)
#pragma omp atomic write
            b[i][j][k] = 33;
          }
    printf("---\n");
  }
  printf("done\n");
  return 0;
}
-- snip --
(See attachment for full .s file.) (Running on an LPAR with 17 cores inside gdb.) The function GOMP_parallel starts threads 2 to 17, which enter and leave the parallel region (they print both "+++" and "---"), then hang in a team_barrier_wait_final() call in gomp_thread_start. Only then thread 1 runs the thread function.
  gomp_team_start (fn, data, num_threads, flags, gomp_new_team (num_threads));
  fn (data);
Thread 1 comes across
  0x8b7a <+522>: brasl %r14,0x87b0
with %r10 == 2 (which presumably contains k), then continues through
  0x8cf6 <+902>: brasl %r14,0x86f0
and finally comes back to
  0x8b7a <+522>: brasl %r14,0x87b0
with %r10 == 3. In GOMP_doacross_wait() it ends up calling doacross_spin() and never gets out of that again:
  doacross_spin (array, flattened, cur);
  0x03fff7ef5562 <+282>: lg %r1,0(%r5)
  0x03fff7ef5568 <+288>: clgr %r1,%r2
  0x03fff7ef556c <+292>: jle 0x3fff7ef5562
The value of r1 (= *r5 (= *array?)) remains 6 (since there's no other thread left that could modify it) while the value of r2 is 0xfffb4a1. To me this looks as if doacross_spin() compares an integer value with an address or rubbish. Any ideas what's going on?
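The three instructions quoted above (load, unsigned compare, branch back while lower or equal) have the shape of a spin-wait that exits only once the loaded value exceeds the comparison value. A rough C sketch of that shape, with a spin limit added so it can terminate, might look as follows. The function name and the limit are hypothetical and this is not the libgomp implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bounded spin-wait: poll *slot until it becomes greater
   than `expected`, giving up after max_spins polls.  Returns true if
   the condition was met, false if the limit was reached. */
static bool spin_until_greater(_Atomic unsigned long *slot,
                               unsigned long expected,
                               unsigned long max_spins)
{
    for (unsigned long n = 0; n < max_spins; n++) {
        /* Same test as the quoted loop: keep spinning while *slot <= expected. */
        if (atomic_load_explicit(slot, memory_order_acquire) > expected)
            return true;
    }
    return false;
}
```

With the values from the report, a slot that stays at 6 and a comparison value of 0xfffb4a1, the condition can never become true once no other thread is left to advance the slot, so an unbounded version of this loop spins forever.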
[Bug libgomp/69625] deadlock in libgomp.c/doacross-1.c test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69625 --- Comment #1 from Dominik Vogt --- It's a bug in the S/390 backend that sometimes trashes r6 in vararg functions. We're working on a fix.
[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #8 from Dominik Vogt --- gfortran.dg/coarray_allocate_3.f08 crashed with an invalid free() on s390 and s390x.
(gdb) run
Starting program: .../gcc/build/gcc/testsuite/coarray_allocate_3.exe
Program received signal SIGSEGV, Segmentation fault.
0x03fff7cb6814 in free () from /lib64/libc.so.6
(gdb) bt
#0 0x03fff7cb6814 in free () from /lib64/libc.so.6
#1 0x8cae in MAIN__ () at .../gcc/testsuite/gfortran.dg/coarray_allocate_3.f08:26
#2 main (argc=, argv=) at .../gcc/testsuite/gfortran.dg/coarray_allocate_3.f08:27
#3 0x03fff7c4e0a2 in __libc_start_main () from /lib64/libc.so.6
#4 0x8866 in _start ()
[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451 --- Comment #9 from Dominik Vogt --- I.e. free(0x1) is called:
Load foobar.1497 to r12
  0x8998 <+40>: larl %r12,0x80002408
  (gdb) p /x $r12
  0x80002408
First malloc call, store mem pointer in foobar.1497
  0x89c6 <+86>: brasl %r14,0x8788
  0x89cc <+92>: stg %r2,0(%r12)
Second malloc call, store mem pointer in some_local_object.1511
  0x8ae8 <+376>: brasl %r14,0x8788
  0x8aee <+382>: stgrl %r2,0x800023d0
Load address of some_local_object.1511 to r1
  0x8afa <+394>: larl %r1,0x800023d0
Write something to r1 + 16, r1 + 32, r1 + 40, r1 + 24
  0x8b00 <+400>: mvghi 16(%r1),297
  0x8b06 <+406>: stg %r11,32(%r1)
  0x8b0c <+412>: stg %r8,40(%r1)
  0x8b12 <+418>: mvghi 24(%r1),1
This overwrites foobar.1497 with the value 1:
  0x8b18 <+424>: mvghi 56(%r1),1
  (gdb) p /x $r1 + 56
  0x80002408 <-- address of foobar.1497
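The overlap follows directly from the addresses in the dump: the store goes to offset 56 from 0x800023d0, and 0x800023d0 + 0x38 = 0x80002408, which is exactly where foobar.1497 lives. A tiny C sketch of that arithmetic, with a helper name invented for illustration:

```c
#include <stdint.h>

/* Address touched by a store at `offset` bytes from `base`.
   The helper is hypothetical; the values used with it come from the
   gdb session quoted above. */
static uint64_t store_address(uint64_t base, uint64_t offset)
{
    return base + offset;
}
```

Calling `store_address(0x800023d0, 56)` yields 0x80002408, so the write at 56(%r1) lands on the neighbouring object and the later free() receives the value 1 instead of the malloc'ed pointer.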
[Bug libgomp/69625] S/390 deadlock in libgomp.c/doacross-1.c test (vararg function trashes r6)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69625 Dominik Vogt changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Dominik Vogt --- Fixed with above patch.
[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451 --- Comment #12 from Dominik Vogt --- The patch works on s390x.
[Bug go/69766] New: go.test/test/env.go fails on biarch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69766 Bug ID: 69766 Summary: go.test/test/env.go fails on biarch Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: go Assignee: ian at airs dot com Reporter: vogt at linux dot vnet.ibm.com CC: cmang at google dot com Target Milestone: --- Host: s390x When testing with
  make -k check-go RUNTESTFLAGS="--target_board=unix\{-m31,-m64\}"
the test go.test/test/env.go fails with -m31 because runtime.GOARCH and the GOARCH environment variable disagree:
  $GOARCH=s390x!= runtime.GOARCH=s390 ^
  FAIL: go.test/test/env.go execution, -O2 -g
The compile command was
  $ .../gcc/build/gcc/testsuite/go/../../gccgo -B.../gcc/build/gcc/testsuite/go/../../ .../gcc/gcc/testsuite/go.test/test/env.go -fno-diagnostics-show-caret -fdiagnostics-color=never -I.../gcc/build/s390x-ibm-linux-gnu/32/libgo -w -O2 -g -L.../gcc/build/s390x-ibm-linux-gnu/32/libgo -L.../gcc/build/s390x-ibm-linux-gnu/32/libgo/.libs -lm -m31 -o .../gcc/build/gcc/testsuite/go/env.x
Unfortunately, the test does not keep the failing executable, and if I run this command manually, $GOARCH is not set at all in the resulting executable.
[Bug go/69766] go.test/test/env.go fails on biarch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69766 --- Comment #1 from Dominik Vogt --- If I understand the GOARCH environment variable right, its value is just the architecture of the build system. So, this test is bound to fail for any multiarch target with the non-standard architecture, and for cross compilation?
[Bug go/69766] go.test/test/env.go fails on biarch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69766 --- Comment #2 from Dominik Vogt --- Created attachment 37663 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37663&action=edit Experimental patch Is the attached patch the right way to deal with this?
[Bug regression/69838] New: [regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 Bug ID: 69838 Summary: [regression] Lra deletes EH_REGION Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: regression Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com Target Milestone: --- Host: s390x Target: s390x Created attachment 37704 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37704&action=edit Ira dump (ok) It looks like Lra does not handle EH_REGION notes correctly. There is at least one Gnat testcase where Lra wrongly deletes all exception handling code (gcc/testsuite/gnat.dg/null_pointer_deref1.adb):
-- snip --
procedure Null_Pointer_Deref1 is
   type Int_Ptr is access all Integer;

   function Ident return Int_Ptr is
   begin
      return null;
   end;
   Data : Int_Ptr := Ident;
begin
   Data.all := 1;
exception
   when Constraint_Error | Storage_Error =>
      null;
end;
-- snip --
The exception handling code vanishes in the reload pass (see attached rtl dumps). As a consequence, the exception is not caught by the function and the program terminates with an error. With -mno-lra the test case works fine, and the code in reload1.c seems to have special treatment for EH_REGION notes that is missing in ira.c.
[Bug regression/69838] [regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 --- Comment #1 from Dominik Vogt --- Created attachment 37705 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37705&action=edit Reload dump (broken)
[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #3 from Dominik Vogt --- Building the source RPM you sent us (to build the Ada compiler) has the same problem with profiledbootstrap, building on Fedora 20 (maybe other distros too). I'll try to isolate the problem.
[Bug regression/69838] [4.9/5/6 Regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 Dominik Vogt changed: What|Removed |Added Component|rtl-optimization|regression --- Comment #3 from Dominik Vogt --- Bisecting fails to identify the exact commit. It's broken in this commit, and the commits before fail to build Ada executables because they don't find the Ada library. It probably does not matter; the problem is present since at least 21st of February, 2013.
  PR bootstrap/56258
  * doc/invoke.texi (-fdump-rtl-pro_and_epilogue): Use @item instead of @itemx.
  * gnat-style.texi (@title): Remove @hfill.
  * projects.texi: Avoid line wrapping inside of @pxref or @xref.
  * doc/cp-tools.texinfo (Virtual Machine Options): Use just one @gccoptlist instead of 3 separate ones.
  git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@196196 138bc75d-0d04-0410-96
[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709 --- Comment #5 from Dominik Vogt --- @Matthias: So far it only happens for me when building a gcc rpm from source on a (very slow) VM, but not when compiling the same sources. Is there anything special about your build machine or environment on it?
[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 --- Comment #7 from Dominik Vogt --- With the patch I get an ICE with -m31:
spawn -ignore SIGHUP .../build/gcc/xgcc -B.../build/gcc/ .../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -fgraphite-identity -ffast-math -S -m31 -o id-pr45230-1.s
.../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c: In function 'main':
/home/vogt/src/git/gcc/gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c:45:1: internal compiler error: Segmentation fault
0x806199b9 crash_signal
        ../../gcc/toplev.c:335
0x80a55d06 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1408
0x80a55d51 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a55d51 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a55d51 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a5748b translate_isl_ast_to_gimple::rename_all_uses(tree_node*, basic_block_def*, basic_block_def*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1569
0x80a57631 translate_isl_ast_to_gimple::get_rename_from_scev(tree_node*, gimple**, loop*, basic_block_def*, basic_block_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:1623
0x80a597a1 translate_isl_ast_to_gimple::rename_uses(gimple*, gimple_stmt_iterator*, basic_block_def*, loop*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:1730
0x80a5b06d translate_isl_ast_to_gimple::graphite_copy_stmts_from_block(basic_block_def*, basic_block_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:2596
0x80a5b5eb translate_isl_ast_to_gimple::copy_bb_and_scalar_dependences(basic_block_def*, edge_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:2809
0x80a5bbf5 translate_isl_ast_to_gimple::translate_isl_ast_node_user(isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:935
0x80a5bf95 translate_isl_ast_to_gimple::translate_isl_ast_for_loop(loop*, isl_ast_node*, edge_def*, tree_node*, tree_node*, tree_node*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:685
0x80a5c217 translate_isl_ast_to_gimple::translate_isl_ast_node_for(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:854
0x80a5beb1 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:1032
0x80a5c359 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a5be69 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:1043
0x80a5c359 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a5be69 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:1043
0x80a5c359 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a5be69 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:1043
[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 --- Comment #9 from Dominik Vogt --- I think I've already tested this commit without the patch and did not get that ICE, but maybe my memory fails me. I'm just running the test suite again with the commit reverted to make sure ...
[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 --- Comment #11 from Dominik Vogt --- If that is unrelated, the patch does not cause any regressions on a biarch build. Should I also test it in a 31-bit changeroot?
[Bug middle-end/69838] [4.9/5/6 Regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 --- Comment #12 from Dominik Vogt --- (The test just finished; the ICE is present without the patch too.)
[Bug middle-end/69838] [4.9/5 Regression] Lra deletes EH_REGION
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69838 Dominik Vogt changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #15 from Dominik Vogt --- Successfully tested and bootstrapped trunk on s390x (biarch).
[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709 --- Comment #7 from Dominik Vogt --- The stage1 compiler does something wrong when compiling gcc/real.c (with -fprofile-generate). The function div_significands() (inlined into do_divide()) returns a wrong result due to bad register usage in this loop:
-- snip --
  do
    {
      msb = u.sig[SIGSZ-1] & SIG_MSB;
      lshift_significand_1 (&u, &u);
    start:
      if (msb || cmp_significands (&u, b) >= 0)
        {
          sub_significands (&u, &u, b, 0);
          set_significand_bit (r, bit);
        }
    }
  while (--bit >= 0);
-- snip --
At loop entry ("start" label), r1 holds the highest 64 bits of the significand. The first pass through the loop seems to be correct; sub_significands() and set_significand_bit() do the correct operations. After that, r1 is decremented by one as if it contained the variable "bit". Later on r1 gets (eventually) overwritten with zero. After that, the loop always thinks that the remaining significand is smaller than b because it's always zero. In the end, the "result" of the division is one in the highest significand bit and all other bits zero, eventually causing the observed assertion failure. With a broken compiler (from stageprofile), the test program for triggering the ICE is simply
-- snip --
int x = __DBL_MAX__;
-- snip --
All of this only happens on a Fedora 20 chroot for me. I've tried to add "-save-temps -dA -dP -fdump-rtl-all" to the OPT_FLAGS in the rpm spec file, but then the package doesn't build at all. Any hints on how to get debug information from the rpmbuild run are welcome.
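The loop quoted above is a shift-and-subtract division that produces one result bit per iteration. As a rough analogue only, the same idea on single 64-bit integers can be sketched in C as below. This is not the multi-word code from gcc/real.c, and the function name is made up for illustration; the divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Restoring division, one quotient bit per iteration, most significant
   bit first.  Precondition (assumed, not checked): den != 0. */
static uint64_t shift_subtract_div(uint64_t num, uint64_t den,
                                   uint64_t *rem_out)
{
    uint64_t quot = 0;
    uint64_t rem = 0;

    for (int bit = 63; bit >= 0; bit--) {
        /* Remember the bit that the left shift is about to push out. */
        uint64_t msb = rem >> 63;
        rem = (rem << 1) | ((num >> bit) & 1);
        /* If a bit was shifted out, the true remainder exceeds den even
           though the truncated value may compare lower. */
        if (msb || rem >= den) {
            rem -= den;
            quot |= (uint64_t)1 << bit;
        }
    }
    *rem_out = rem;
    return quot;
}
```

The sketch also shows why a clobbered working register is fatal here: if `rem` were zeroed after the first iteration, the comparison would fail on every later pass and at most the first quotient bit would ever be set, which resembles the symptom described in the comment.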
[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709 --- Comment #8 from Dominik Vogt --- Created attachment 37790 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37790&action=edit Test case The option -fpeel-loops triggers the bug. The attached program has a different result with -fpeel-loops than without it. $ gcc -O2 -march=z10 -fpeel-loops pr69709.c && ./a.out 1 bits set in result $ gcc -O2 -march=z10 pr69709.c && ./a.out 2 bits set in result (2 is the correct result).
[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709 --- Comment #9 from Dominik Vogt --- (-fpeel-loops is activated by -fprofile-use, so this is the connection to profiledbootstrap.)
[Bug bootstrap/69709] [6 Regression] profiled bootstrap error on s390x-linux-gnu with r233194
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69709 --- Comment #10 from Dominik Vogt --- We've located the bug in the s390 backend. No further help is needed.
[Bug fortran/67451] [5/6 Regression] [F08] ICE with sourced allocation from coarray.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67451 --- Comment #15 from Dominik Vogt --- The problem is gone on today's trunk for s390 and s390x.
[Bug middle-end/69920] [6 Regression] FAIL: g++.dg/torture/pr42704.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #8 from Dominik Vogt --- Also failing on s390 and s390x; the same bug possibly causes several other test failures:
  FAIL: gcc.dg/graphite/id-pr45230-1.c (internal compiler error)
  FAIL: gcc.dg/tree-ssa/pr69666.c (internal compiler error)
Maybe these too:
  FAIL: gfortran.dg/reassoc_6.f -O scan-tree-dump-not optimized "~"
  FAIL: gcc.dg/graphite/scop-sor.c scan-tree-dump-times graphite "number of SCoPs: 1" 1
[Bug middle-end/69920] [6 Regression] FAIL: g++.dg/torture/pr42704.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920 --- Comment #9 from Dominik Vogt --- (Fails only with -m31.)
[Bug middle-end/69920] [6 Regression] FAIL: g++.dg/torture/pr42704.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69920 --- Comment #12 from Dominik Vogt --- The ICE in 42704.c is gone on s390[x] with trunk (but not the other FAILs). Is the ICE below related to this bug report or is it something totally different?
.../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c: In function 'main':
.../gcc/testsuite/gcc.dg/graphite/id-pr45230-1.c:45:1: internal compiler error: Segmentation fault
0x8061ac19 crash_signal
        ../../gcc/toplev.c:335
0x80a6305e translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1408
0x80a630a9 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a630a9 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a630a9 translate_isl_ast_to_gimple::collect_all_ssa_names(tree_node*, vec*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1418
0x80a647e3 translate_isl_ast_to_gimple::rename_all_uses(tree_node*, basic_block_def*, basic_block_def*)
        ../../gcc/graphite-isl-ast-to-gimple.c:1569
0x80a64989 translate_isl_ast_to_gimple::get_rename_from_scev(tree_node*, gimple**, loop*, basic_block_def*, basic_block_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:1623
0x80a66af9 translate_isl_ast_to_gimple::rename_uses(gimple*, gimple_stmt_iterator*, basic_block_def*, loop*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:1730
0x80a683c5 translate_isl_ast_to_gimple::graphite_copy_stmts_from_block(basic_block_def*, basic_block_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:2596
0x80a68943 translate_isl_ast_to_gimple::copy_bb_and_scalar_dependences(basic_block_def*, edge_def*, vec)
        ../../gcc/graphite-isl-ast-to-gimple.c:2809
0x80a68f4d translate_isl_ast_to_gimple::translate_isl_ast_node_user(isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:935
0x80a692ed translate_isl_ast_to_gimple::translate_isl_ast_for_loop(loop*, isl_ast_node*, edge_def*, tree_node*, tree_node*, tree_node*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:685
0x80a6956f translate_isl_ast_to_gimple::translate_isl_ast_node_for(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:854
0x80a69209 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:1032
0x80a696b1 translate_isl_ast_to_gimple::translate_isl_ast_node_block(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >&)
        ../../gcc/graphite-isl-ast-to-gimple.c:964
0x80a691c1 translate_isl_ast_to_gimple::translate_isl_ast(loop*, isl_ast_node*, edge_def*, std::map, std::allocator > >
[Bug middle-end/69983] [6 Regression] FAIL: gcc.dg/graphite/scop-sor.c scan-tree-dump-times graphite "number of SCoPs: 1" 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69983 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #3 from Dominik Vogt --- Also fails on s390x with -m64 and -m31.
[Bug tree-optimization/68659] [6 regression] FAIL: gcc.dg/graphite/id-pr45230-1.c (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68659 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #18 from Dominik Vogt --- I've no opinion on whether the patch is good or not, but it does make the test failure go away on s390x.
[Bug target/70009] test case libgomp.oacc-c-c++-common/vprop.c fails starting with its introduction in r233607
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70009 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #1 from Dominik Vogt --- Also fails on s390x with -m64 and -m31.
[Bug tree-optimization/69760] [4.9/5 Regression] Wrong 64-bit memory address caused by an unneeded overflowing 32-bit integer multiplication on x86_64 under -O2 and -O3 code optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69760 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #13 from Dominik Vogt --- Created attachment 37824 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37824&action=edit Dump file of reassoc_6.f This commit introduces a regression on s390x (-m64): FAIL: gfortran.dg/reassoc_6.f -O scan-tree-dump-not optimized "~" (Dump file of the test attached.)
[Bug ada/70017] New: Ada: c52103x test failure on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017 Bug ID: 70017 Summary: Ada: c52103x test failure on s390x Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: ada Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: ebotcazou at gcc dot gnu.org, krebbel at gcc dot gnu.org Target Milestone: --- Host: s390x Target: s390x My knowledge of Ada is practically zero, but I'm debugging a few Ada test failures on s390x (gcc-4.7 or earlier). -- snip -- ,.,. C52103X ACATS 2.5 16-02-29 16:03:21 C52103X CHECK THAT IN ARRAY ASSIGNMENTS AND IN SLICE ASSIGNMENTS, THE LENGTHS MUST MATCH; ALSO CHECK WHETHER CONSTRAINT_ERROR OR STORAGE_ERROR ARE RAISED FOR LARGE ARRAYS. - C52103X NO CONSTRAINT_ERROR FOR TYPE WITH 'LENGTH = INTEGER'LAST + 3. raised STORAGE_ERROR : System.Stack_Checking.Operations.Stack_Check: stack overflow detected FAIL: c52103x -- snip -- This happens here: -- snip -- TYPE TA42 IS ARRAY( INTEGER RANGE IDENT_INT(-2)..IDENT_INT(INTEGER'LAST) ) OF BOOLEAN ; ... OBJ_DCL: DECLARE -- THIS BLOCK DECLARES TWO BOOLEAN ARRAYS THAT -- HAVE INTEGER'LAST + 3 COMPONENTS; -- STORAGE_ERROR MAY BE RAISED. ARR41 : TA41 ; ARR42 : TA42 ; -- snip -- This is a reduced test (that fails with ulimit -s 131072): -- snip -- procedure c52103x is begin declare type T is array(integer range -2..1000) of boolean; begin declare A : T; begin null; end; end; end; -- snip -- As far as I understand this code, it assumes that only the first few memory pages of the array are allocated in the stack initially and the rest is allocated when actually accessed. 
However, on s390x first a snippet of three pages is allocated and checked, followed immediately by the rest of the array plus another check that fails because the stack is too small for that: -- snip -- _ada_c52103x: stmg %r11,%r15,88(%r15) larl %r13,.L4 aghi %r15,-168 lgr %r11,%r15 lgr %r1,%r15 aghi %r1,-12280 # <-- first three pages lgr %r2,%r1 brasl %r14,_gnat_stack_check # <-- OK lgr %r1,%r15 lgr %r12,%r1 lg %r1,.L5-.L4(%r13) agr %r1,%r15 # <-- rest of array lgr %r2,%r1 brasl %r14,_gnat_stack_check # <-- FAIL ... .section .rodata .align 8 .L4: .L6: .quad -1008 .L5: .quad -10012288 -- snip -- (Stack on s390x grows down.) I've no idea whether this is the intended behaviour (i.e. the test case has a bug) or not, and if not whether I should look for the bug in the s390x backend or somewhere else.
[Bug ada/70017] c52103x and c52104x test failure on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017 Dominik Vogt changed: What|Removed |Added Summary|Ada: c52103x test failure |c52103x and c52104x test |on s390x|failure on s390x --- Comment #1 from Dominik Vogt --- c52104x has similar code and fails too.
[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025 --- Comment #2 from Dominik Vogt --- This is related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578
[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578 --- Comment #35 from Dominik Vogt --- Looks like the extra condition in that patch is still not good enough: --- a/gcc/lra-constraints.c +++ b/gcc/lra-constraints.c @@ -945,6 +945,12 @@ match_reload (signed char out, signed char *ins, enum reg_c = (ins[1] < 0 && REG_P (in_rtx) && (int) REGNO (in_rtx) < lra_new_regno_start && find_regno_note (curr_insn, REG_DEAD, REGNO (in_rtx)) + /* We can not use the same value if the pseudo is mentioned + in the output, e.g. as an address part in memory, + becuase output reload will actually extend the pseudo + liveness. We don't care about eliminable hard regs here + as we are interesting only in pseudos. */ + && (out < 0 || regno_use_in (REGNO (in_rtx), out_rtx) == NULL_RTX) ? lra_create_new_reg (inmode, in_rtx, goal_class, "") : lra_create_new_reg_with_unique_value (outmode, out_rtx, goal_class, ""));
[Bug target/61578] [4.9 regression] Code size increase for ARM thumb compared to 4.8.x when compiling with -Os
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61578 --- Comment #36 from Dominik Vogt --- (Sorry, comment 35 belongs to the follow-up report https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025 )
[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025 --- Comment #3 from Dominik Vogt --- Looks like the extra condition in that patch is still not good enough: --- a/gcc/lra-constraints.c +++ b/gcc/lra-constraints.c @@ -945,6 +945,12 @@ match_reload (signed char out, signed char *ins, enum reg_c = (ins[1] < 0 && REG_P (in_rtx) && (int) REGNO (in_rtx) < lra_new_regno_start && find_regno_note (curr_insn, REG_DEAD, REGNO (in_rtx)) + /* We can not use the same value if the pseudo is mentioned + in the output, e.g. as an address part in memory, + becuase output reload will actually extend the pseudo + liveness. We don't care about eliminable hard regs here + as we are interesting only in pseudos. */ + && (out < 0 || regno_use_in (REGNO (in_rtx), out_rtx) == NULL_RTX) ? lra_create_new_reg (inmode, in_rtx, goal_class, "") : lra_create_new_reg_with_unique_value (outmode, out_rtx, goal_class, ""));
[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025 --- Comment #5 from Dominik Vogt --- Yup. debug_rtx(out_rtx) = (mem/f:DI (plus:DI (reg/v/f:DI 164 [orig:129 p ] [129]) (const_int 16 [0x10])) [4 p_8(D)->d3+0 S8 A64]) debug_rtx(in_rtx) = (reg/v/f:DI 151 [orig:129 p ] [129]) Because in_rtx doesn't appear in out_rtx the condition "regno_use_in (REGNO (in_rtx), out_rtx) == 0" misses its mark.
[Bug ada/70017] c52103x and c52104x test failure on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017 --- Comment #3 from Dominik Vogt --- It looks like no more than activating Stack_Check_Probes is required. Thanks!
[Bug ada/70017] c52103x and c52104x test failure on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017 --- Comment #5 from Dominik Vogt --- We have zero test failures with the patched code. Is that good enough or should I still take a closer look?
[Bug ada/70017] c52103x and c52104x test failure on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017 --- Comment #6 from Dominik Vogt --- S390 does have stack checking support, so the question is really just whether Ada has extra requirements.
[Bug ada/70017] c52103x and c52104x test failure on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017 --- Comment #7 from Dominik Vogt --- Sorry, comment 6 is wrong, I was thinking about stack *guard* support.
[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025 --- Comment #6 from Dominik Vogt --- Shouldn't this rather check whether the *value* of the register in in_rtx appears in out_rtx?
[Bug middle-end/70025] [6 Regression] Miscompilation of gc-7.4.2 on s390x starting with r227382
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70025 --- Comment #10 from Dominik Vogt --- Successfully bootstrapped and regression tested on s390x (-m31 and -m64).
[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #15 from Dominik Vogt --- The new test fails on s390x: .../build/gcc/xgcc -B.../build/gcc/ .../gcc/testsuite/gcc.dg/tree-ssa/pr69196-1.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -fdump-tree-vrp1-details -S -m31 -o pr69196-1.s PASS: gcc.dg/tree-ssa/pr69196-1.c (test for excess errors) FAIL: gcc.dg/tree-ssa/pr69196-1.c scan-tree-dump vrp1 "FSM did not thread around loop and would copy too many statements" (same with -m64 instead of -m31).
[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196 --- Comment #16 from Dominik Vogt --- (In the ChangeLog entry, the "-1" is missing from the name of the new testfile.)
[Bug middle-end/69983] [6 Regression] FAIL: gcc.dg/graphite/scop-sor.c scan-tree-dump-times graphite "number of SCoPs: 1" 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69983 --- Comment #8 from Dominik Vogt --- Successfully bootstrapped and regression tested on s390x (biarch).
[Bug tree-optimization/69760] [4.9/5 Regression] Wrong 64-bit memory address caused by an unneeded overflowing 32-bit integer multiplication on x86_64 under -O2 and -O3 code optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69760 --- Comment #14 from Dominik Vogt --- The regression is fixed with the latest patch for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69983
[Bug middle-end/69987] [6 Regression] internal compiler error: in verify_loop_structure, at cfgloop.c:1639
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69987 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #5 from Dominik Vogt --- The new test fails on s390x with -m31 (but works with -m64). (Without trying it I assume it also fails on s390). -- snip -- FAIL: gfortran.dg/pr69987.f90 -O (test for excess errors) Excess errors: f951: Warning: -fprefetch-loop-arrays not supported for this target (try -march switches) -- snip --
[Bug ada/70017] c52103x and c52104x test failure on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70017 Dominik Vogt changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #10 from Dominik Vogt --- Fixed.
[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196 --- Comment #18 from Dominik Vogt --- Which dumps do you need?
[Bug libgomp/69555] libgomp.c++/target-6.C fails because of undefined behaviour
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69555 --- Comment #11 from Dominik Vogt --- Successfully bootstrapped and regression tested on s390x biarch. Thanks.
[Bug tree-optimization/68659] [6 regression] FAIL: gcc.dg/graphite/id-pr45230-1.c (internal compiler error)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68659 --- Comment #22 from Dominik Vogt --- Successfully bootstrapped and regression tested on s390x biarch. Thanks.
[Bug middle-end/69987] [6 Regression] internal compiler error: in verify_loop_structure, at cfgloop.c:1639
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69987 --- Comment #7 from Dominik Vogt --- Fixed on s390x. Thanks.
[Bug tree-optimization/69196] [5/6 Regression] code size regression with jump threading at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69196 --- Comment #20 from Dominik Vogt --- Created attachment 37860 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37860&action=edit vrp1 dump for s390x (-m64) vrp1 dump for s390x attached (-m64, give me a shout if you need the -m31 dump).
[Bug other/70078] New: gccint: define_split "not" allowed to create pseudos
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70078 Bug ID: 70078 Summary: gccint: define_split "not" allowed to create pseudos Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com Target Milestone: --- The section "Defining How to Split Instructions" in the gccint manual claims The preparation-statements are similar to those statements that are specified for define_expand. ... Unlike those in define_expand, however, these statements must not generate any new pseudo-registers. Once reload has completed, they also must not allocate any space in the stack frame. Splitters seem to be allowed to generate new pseudos under certain circumstances (some splitters call can_create_pseudo_p()). So, is this correct instead? ... Unlike those in define_expand, however, once reload has completed these statements must neither generate any new pseudo-registers nor allocate any space in the stack frame. This can be checked by calling can_create_pseudo_p.
[Bug other/70078] gccint: define_split "not" allowed to create pseudos
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70078 --- Comment #1 from Dominik Vogt --- Hijacking this bug report for more unclear documentation in that section; proposed changes are marked with <...>. Apart from the bad grammar, the meaning of this sentence is a mystery: Splitting of jump instruction into sequence that over by another jump instruction is always valid, as compiler expect identical behavior of new jump. => Splitting of jump instruction into sequence that another jump instruction is always valid, as compiler expect . Anybody able to fill in the gaps?
[Bug other/70078] gccint: define_split "not" allowed to create pseudos
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70078 --- Comment #2 from Dominik Vogt --- (I'll make a patch with these and some more corrections once it's clear how the wording should be.)
[Bug middle-end/70236] New: Register allocation and loop unrolling lead to waste of registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70236 Bug ID: 70236 Summary: Register allocation and loop unrolling lead to waste of registers Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: vmakarov at gcc dot gnu.org Target Milestone: --- Created attachment 37966 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37966&action=edit ira dump A new s390x pattern for shift-and-xor does not yield a satisfying result with this code when compiled with "-O3 -funroll-loops": -- snip -- unsigned long hash(unsigned long l) { unsigned long v = 0; unsigned long i; for (i = 0; i < 8; i++) { v <<= 1; v ^= l; } return v; } -- snip -- => lgr %r1,%r2 lgr %r3,%r2 rxsbg %r1,%r2,0,62,1 # (shift r2 by one bit left and xor with r1) rxsbg %r3,%r1,0,62,1 lgr %r1,%r2 rxsbg %r1,%r3,0,62,1 lgr %r4,%r1 <- unnecessary lgr %r1,%r2 rxsbg %r1,%r4,0,62,1 lgr %r5,%r1 <- unnecessary lgr %r1,%r2 rxsbg %r1,%r5,0,62,1 lgr %r0,%r1 <- unnecessary lgr %r1,%r2 rxsbg %r1,%r0,0,62,1 rxsbg %r2,%r1,0,62,1 br %r14 ("%r1,%r2,0,62,1" means "r1 := r1 ^ (r2 << 1)"; the ",0,62,1" part of the instruction effectively means "shift left by one".) (gets worse with more loop passes). The code got unrolled in tree: v_16 = l_4(D) << 1; v_17 = l_4(D) ^ v_16; v_21 = v_17 << 1; v_22 = l_4(D) ^ v_21; v_26 = v_22 << 1; v_27 = l_4(D) ^ v_26; v_31 = v_27 << 1; v_32 = l_4(D) ^ v_31; v_36 = v_32 << 1; v_37 = l_4(D) ^ v_36; v_41 = v_37 << 1; v_42 = l_4(D) ^ v_41; v_3 = v_42 << 1; v_5 = v_3 ^ l_4(D); return v_5; Register allocation insists on having the value of "l" in r1. As the result of the previous pass through the loop is in r1, it's necessary to move that value out of the way first. Later on, regrename fails to clean up this situation, probably because the problem is too complex with many sequential overlapping register use chains. 
This: lgr %r1,%r2 rxsbg %r1,%r3,0,62,1 lgr %r4,%r1 lgr %r1,%r2 rxsbg %r1,%r4,0,62,1 lgr %r5,%r1 lgr %r1,%r2 rxsbg %r1,%r5,0,62,1 lgr %r0,%r1 lgr %r1,%r2 rxsbg %r1,%r0,0,62,1 could be rewritten to lgr %r1,%r2 rxsbg %r1,%r3,0,62,1 lgr %r3,%r2 rxsbg %r3,%r1,0,62,1 lgr %r1,%r2 rxsbg %r1,%r3,0,62,1 lgr %r3,%r2 rxsbg %r3,%r1,0,62,1 just using three registers. The question is whether this situation can be improved, either in the register allocator, or regrename, or in the pattern.
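The proposed three-register sequence can be checked against the original loop with a small C model. This is only a sketch: `hash_three_regs` and the variables `r1`, `r2`, `r3` are names introduced here to mirror the registers in the listing, and each `rX ^= rY << 1` stands in for one `rxsbg %rX,%rY,0,62,1` as described in the report.

```c
#include <assert.h>

/* The loop from the report, unchanged. */
unsigned long hash(unsigned long l)
{
    unsigned long v = 0;
    unsigned long i;
    for (i = 0; i < 8; i++) {
        v <<= 1;
        v ^= l;
    }
    return v;
}

/* Hypothetical model of the unrolled code using only three "registers".
   r2 holds the input l; r1 and r3 alternate as the running value.
   The first loop pass yields v == l, so seven shift-and-xor steps remain. */
unsigned long hash_three_regs(unsigned long l)
{
    unsigned long r2 = l;
    unsigned long r1, r3;

    r1 = r2; r1 ^= r2 << 1;   /* passes 1 and 2 */
    r3 = r2; r3 ^= r1 << 1;   /* pass 3 */
    r1 = r2; r1 ^= r3 << 1;   /* pass 4 */
    r3 = r2; r3 ^= r1 << 1;   /* pass 5 */
    r1 = r2; r1 ^= r3 << 1;   /* pass 6 */
    r3 = r2; r3 ^= r1 << 1;   /* pass 7 */
    r2 ^= r3 << 1;            /* pass 8, result ends up in r2 */
    return r2;
}
```

Calling both functions with the same argument should give identical results, which is consistent with the claim that the extra copies into r4, r5 and r0 are not needed.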
[Bug middle-end/70236] Register allocation and loop unrolling lead to waste of registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70236 --- Comment #1 from Dominik Vogt --- Created attachment 37967 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37967&action=edit rnreg dump
[Bug target/70404] New: pr71074.c fails on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404 Bug ID: 70404 Summary: pr71074.c fails on s390x Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: krebbel at gcc dot gnu.org Target Milestone: --- Host: s390x Target: s390x The new test case from #70174 triggers an ICE on s390x (svn rev 234414): .../build/gcc/xgcc -B...//gcc/ .../gcc/testsuite/gcc.dg/pr70174.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O2 -S -m64 -o pr70174.s .../gcc/testsuite/gcc.dg/pr70174.c: In function 'foo': .../gcc/testsuite/gcc.dg/pr70174.c:10:7: warning: assignment makes integer from pointer without a cast [-Wint-conversion] /home/vogt/src/git/gcc/gcc/testsuite/gcc.dg/pr70174.c:11:1: error: unrecognizable insn: (insn 9 8 10 2 (set (zero_extract:DI (subreg:DI (reg:QI 66) 0) (const_int 4 [0x4]) (const_int 56 [0x38])) (symbol_ref:DI ("foo") [flags 0x3] ) .../gcc/testsuite/gcc.dg/pr70174.c:10 -1 (nil)) .../gcc/testsuite/gcc.dg/pr70174.c:11:1: internal compiler error: in extract_insn, at recog.c:2287 0x805b40dd _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) .../gcc/rtl-error.c:108 0x805b411d _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) .../gcc/rtl-error.c:116 0x80582a2d extract_insn(rtx_insn*) .../gcc/recog.c:2287 0x803b6af3 instantiate_virtual_regs_in_insn .../gcc/function.c:1582 0x803b6af3 instantiate_virtual_regs .../gcc/function.c:1950 0x803b6af3 execute .../gcc/function.c:1999
[Bug rtl-optimization/70174] [6 Regression] ICE at -O1 and above on x86_64-linux-gnu in gen_lowpart_general, at rtlhooks.c:63
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70174 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #12 from Dominik Vogt --- The new test case triggers an ICE on s390x. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404
[Bug target/70404] pr70174.c fails on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404 --- Comment #1 from Dominik Vogt --- Configured with --with-arch=zEC12
[Bug target/70404] pr70174.c fails on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70404 --- Comment #3 from Dominik Vogt --- Andreas is already working on the issue, so before anybody spends any more work on this, you should probably coordinate your efforts.
[Bug middle-end/70561] New: Crash in recog_for_combine_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561 Bug ID: 70561 Summary: Crash in recog_for_combine_1 Product: gcc Version: 6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: vogt at linux dot vnet.ibm.com CC: krebbel at gcc dot gnu.org Target Milestone: --- Host: s390x Target: s390x This code in recog_for_combine_1 doesn't look right: -- if (num_clobbers_to_add) { rtx newpat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (GET_CODE (pat) == PARALLEL ? (XVECLEN (pat, 0) + num_clobbers_to_add) : num_clobbers_to_add + 1)); if (GET_CODE (pat) == PARALLEL) for (i = 0; i < XVECLEN (pat, 0); i++) XVECEXP (newpat, 0, i) = XVECEXP (pat, 0, i); else XVECEXP (newpat, 0, 0) = pat; add_clobbers (newpat, insn_code_number); for (i = XVECLEN (newpat, 0) - num_clobbers_to_add; i < XVECLEN (newpat, 0); i++) { if (REG_P (XEXP (XVECEXP (newpat, 0, i), 0)) <=== crash && ! reg_dead_at_p (XEXP (XVECEXP (newpat, 0, i), 0), insn)) return -1; ... -- For me, there is a crash in the marked line (for some pattern I'm working on) with "i == 1" because "XVECEXP (newpat, 0, 1)" is "(nil)". If "num_clobbers_to_add" is > 0, and the original "pat" is not a parallel, only the first element of newpat is initialised, but the remaining elements are still accessed. There probably should be something like this in the for loop? for (...) { if (XVECEXP (newpat, 0, i)) /* generate clobber from scratch and store it in XVECEXP (newpat, 0, i) */ -- Probably triggered by this splitter: [(parallel [(set (match_operand:GPR 0 "nonimmediate_operand" "") (and:GPR (not:GPR (match_operand:GPR 1 "nonimmediate_operand" "")) (match_operand:GPR 2 "nonimmediate_operand" ""))) (clobber (reg:CC CC_REGNUM))])] ==> [ (parallel [(set (match_dup 3) (and:GPR (match_dup 1) (match_dup 2))) (clobber (reg:CC CC_REGNUM))]) (parallel [(set (match_dup 0) (xor:GPR (match_dup 3) (match_dup 2))) (clobber (reg:CC CC_REGNUM))])]
[Bug middle-end/70561] Crash in recog_for_combine_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561 --- Comment #1 from Dominik Vogt --- P.S.: (gdb) p debug_rtx(pat) (set (reg:SI 67 [+4 ]) (and:SI (not:SI (subreg:SI (reg/v:DI 65 [ b+-4 ]) 4)) (mem:SI (plus:DI (reg:DI 2 %r2 [ a ]) (const_int 4 [0x4])) [1 *a_2(D)+4 S4 A32]))) $13 = void (gdb) p debug_rtx(newpat) (parallel [ (set (reg:SI 67 [+4 ]) (and:SI (not:SI (subreg:SI (reg/v:DI 65 [ b+-4 ]) 4)) (mem:SI (plus:DI (reg:DI 2 %r2 [ a ]) (const_int 4 [0x4])) [1 *a_2(D)+4 S4 A32]))) (nil) ]) $14 = void
[Bug middle-end/70561] Crash in recog_for_combine_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561 --- Comment #2 from Dominik Vogt --- (Ah, probably add_clobbers should have added the clobber, but it hasn't. It doesn't have any code for that pattern.)
[Bug middle-end/70561] Crash in recog_for_combine_1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70561 Dominik Vogt changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #3 from Dominik Vogt --- Solved with Uli's help by removing the "parallel" from the define_insn_and_split.
[Bug target/69148] [5 Regression] ICE (floating point exception) on s390x-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69148 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #7 from Dominik Vogt --- (Need to backport this to 5.3 for Ubuntu.)
[Bug go/70787] New: No time and child info with -pg and gccgo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70787 Bug ID: 70787 Summary: No time and child info with -pg and gccgo Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: go Assignee: ian at airs dot com Reporter: vogt at linux dot vnet.ibm.com CC: cmang at google dot com, krebbel at gcc dot gnu.org Target Milestone: --- It looks like the -pg option does something wrong for Go programs. Example: This program just wastes time in sub functions: -- main.go -- package main func foo () { var i int i = 0 for (i < 1000) { i++ } } func bar () { var i int i = 0 for (i < 1000) { i++ } } func main () { var i int i = 0 for (i < 100) { foo(); foo(); bar(); i++ } } -- snip -- $ gccgo -pg -O0 main.go $ ./a.out $ gprof ./a.out gmon.out => index % time self children called name 0.00 0.00 300/300 main.main [8] [1] 0.0 0.00 0.00 300 frame_dummy [1] ^^^ (actual run time was about 5 seconds) Even for this very simple program without Go library dependencies, no timing information seems to be dumped into the gmon.out file. Function calls have all been counted in the "frame_dummy" bucket (double checked that functions have not been inlined). My vague first guess is that maybe the timing information is written to some place in memory but is read from a different place when generating gmon.out because the profiling code is not aware of Gccgo's threading model(?).
[Bug go/70787] No time and child info with -pg and gccgo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70787 --- Comment #1 from Dominik Vogt --- (I've also tried setting GMON_OUT_PREFIX so that the gmon.out file does not get overwritten by different threads, but in either case only one dump file is created.)
[Bug go/70787] No time and child info with -pg and gccgo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70787 --- Comment #2 from Dominik Vogt --- The Go runtime seems to register a handler for SIGPROF even if it does not want to profile. So it always uninstalls the handler installed by Glibc on behalf of the -pg option. To me it looks like -pg actually enables the profiling from libgo instead. Some ways to circumvent this: 1) Don't install a SIGPROF handler in the Go runtime if another is already installed (possibly emit a warning or a fatal error if the program attempts to enable the Go profiling). => Simple to implement. 2) Install the SIGPROF handler on the fly when it's needed instead of unconditionally at Go runtime startup. Possibly emit a warning if an existing signal handler is uninstalled in the process. => Cleanest solution. 3) Store the previous signal handler and call it at the start of the Go runtime signal handler. However, this introduces several problems (the Go runtime won't notice if the original profiling code wants to uninstall the handler or install a new one or it might overwrite the Go runtime handler; also, the two profiling systems will probably not agree on a common timing interval). => May allow running Glibc and libgo profiling in parallel but probably has some unfixable issues.
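Option 3 can be sketched with plain POSIX `sigaction`. This is a minimal illustration under stated assumptions, not the libgo code: `earlier_handler` stands in for whatever handler the -pg startup code installed, and all function and variable names below are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Action that was active before the "runtime" installed its handler. */
static struct sigaction previous_action;

/* Counters so the effect of each handler can be observed. */
static volatile sig_atomic_t earlier_ticks;
static volatile sig_atomic_t runtime_ticks;

/* Stand-in for the handler installed on behalf of -pg (assumption). */
static void earlier_handler(int sig)
{
    (void)sig;
    earlier_ticks++;
}

/* Stand-in for the runtime's handler: call the saved handler first. */
static void chaining_handler(int sig, siginfo_t *info, void *context)
{
    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction != NULL)
            previous_action.sa_sigaction(sig, info, context);
    } else if (previous_action.sa_handler != SIG_DFL
               && previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(sig);
    }
    runtime_ticks++;
}

int install_earlier_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = earlier_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPROF, &sa, NULL);
}

/* Install the chaining handler and remember what it replaced. */
int install_chaining_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = chaining_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    return sigaction(SIGPROF, &sa, &previous_action);
}
```

After `install_earlier_handler()` and `install_chaining_handler()`, a `raise(SIGPROF)` should bump both counters. The sketch also shows the weakness named above: `previous_action` is a snapshot, so later changes made by the original profiling code go unnoticed.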
[Bug debug/68860] [6/7 regression] FAIL: gcc.dg/guality/pr36728-1.c -flto -O3 -g line 16/7 arg1 == 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68860 --- Comment #12 from Dominik Vogt --- We've just been looking at this today for s390x which fails these tests for various reasons too (actually we've located at least four different Gcc bugs by looking at this test case). Some of the calculations in allocate_dynamic_stack_space are weird, but that isn't the issue at hand (I'm currently working on that). We were planning to create a new bug report for this, but if it's already being discussed ... S390x fails the checks "y == 2" probably because the cprop_hardreg pass does something wrong with the var_location information. We've only debugged this for "x" yet, but it's probably the same cause for "y". After reload we have (s390x, -O3 -m64): -- snip -- (insn 27 26 99 2 (parallel [ (set (reg/f:DI 15 %r15) (minus:DI (reg/f:DI 15 %r15) (reg:DI 2 %r2 [73]))) (clobber (reg:CC 33 %cc)) ]) pr36728-1.c:12 1409 {*subdi3} (expr_list:REG_DEAD (reg:DI 2 %r2 [73]) (expr_list:REG_UNUSED (reg:CC 33 %cc) (nil (insn 99 27 57 2 (set (reg/f:DI 1 %r1 [65]) (plus:DI (reg/f:DI 11 %r11) (const_int 191 [0xbf]))) pr36728-1.c:10 1075 {*la_64} (nil)) (insn 57 99 33 2 (set (reg/f:DI 3 %r3 [77]) (reg/f:DI 15 %r15)) pr36728-1.c:12 1073 {*movdi_64} (nil)) (debug_insn 33 57 6 2 (var_location:DI x (plus:DI (reg/f:DI 3 %r3 [77]) (const_int 160 [0xa0]))) pr36728-1.c:12 -1 (nil)) -- snip -- Insn 27 adjusts the stack pointer, insn 57 copies it to r3 and insn 33 says that "x" is at "r3 + 160". 
The following constant propagation pass (cprop_hardreg) results in -- snip -- (insn 27 26 99 2 (parallel [ (set (reg/f:DI 15 %r15) (minus:DI (reg/f:DI 15 %r15) (reg:DI 2 %r2 [73]))) (clobber (reg:CC 33 %cc)) ]) pr36728-1.c:12 1409 {*subdi3} (expr_list:REG_DEAD (reg:DI 2 %r2 [73]) (expr_list:REG_UNUSED (reg:CC 33 %cc) (nil (insn 99 27 57 2 (set (reg/f:DI 1 %r1 [65]) (plus:DI (reg/f:DI 11 %r11) (const_int 191 [0xbf]))) pr36728-1.c:10 1075 {*la_64} (nil)) (insn 57 99 33 2 (set (reg/f:DI 3 %r3 [77]) (reg/f:DI 15 %r15)) pr36728-1.c:12 1073 {*movdi_64} (nil)) (debug_insn 33 57 6 2 (var_location:DI x (plus:DI (reg/f:DI 15 %r15 [77]) (const_int 160 [0xa0]))) pr36728-1.c:12 -1 (nil)) -- snip -- It has propagated the value of r15 into insn 33, so the var_location is now separated from the place where it actually becomes valid (after insn 27), and further passes result in a bogus DWARF location list for "x". (This is assembly output with a patch I'm working on; y does not use alloca for alignment; I think this is independent of the bug.) -- snip -- .LVL0: stmg %r11,%r15,88(%r15) aghi %r15,-200 lgr %r11,%r15 .loc 1 12 0 aghi %r2,14 .LVL1: nill %r2,65528 sgr %r15,%r2 <== set final value of stack pointer .loc 1 15 0 <== location list for "x" should start here lhi %r2,2 .loc 1 10 0 la %r1,191(%r11) .LVL2: <== where location list for "x" actually starts nill %r1,65504 <== .loc 1 16 0 <== location list for "y" should start here larl %r4,b .loc 1 17 0 mvi 160(%r15),25 .loc 1 12 0 la %r3,160(%r15) .LVL3: .loc 1 18 0 larl %r5,a .loc 1 15 0 st %r2,0(%r1) .loc 1 16 0 ... -- snip -- Without checking the details for "y" yet we've noticed that there is no location list for y in the DWARF info, so gdb happily prints random data from the stack slot with "p y" when stopping at the first ".loc 1 16 0".
[Bug debug/68860] [6/7 regression] FAIL: gcc.dg/guality/pr36728-1.c -flto -O3 -g line 16/7 arg1 == 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68860 --- Comment #13 from Dominik Vogt --- By the way, I think the value of y should be tested *after* the asm statement in line 17, not before it in line 16. At higher optimization levels the assignment may not have happened yet when gdb reaches line 16. (And x should be tested in line 19 for the same reason).
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #4 from Dominik Vogt --- Could you provide assembly dumps of the function foo() in the testcase, both, with and without the "culprit" patch?
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #7 from Dominik Vogt --- The dumps show some differences I'd expect, but debugging libgomp testcases is awkward because they are so complicated. In the pre-patched era, Gcc's dynamic allocation on the stack was a bit too large most of the time (roughly by one allocated element, but not always). This served as some kind of "safety" padding where programs with off-by-one bugs would write the "excess" data. In reduction-10.c there are just two dynamic allocations (for a and b in foo) that seem to be good. However, there are more differences in the assembler dumps, probably generated by libgomp: --- reduction-10.s.242589 2016-11-22 15:20:27.421251695 +0100 +++ reduction-10.s.242590 2016-11-22 15:20:35.842210558 +0100 @@ -8,7 +8,7 @@ ld [%i0+16], %i2 add %i2, 1, %l5 sll %l5, 2, %g1 - add %g1, 10, %g1 + add %g1, 7, %g1 and %g1, -8, %g1 mov 0, %g2 sub %sp, %g1, %sp @@ -42,7 +42,7 @@ stb %g0, [%i2+%g1] add %i3, 1, %l6 sll %l6, 2, %g1 - add %g1, 10, %g1 + add %g1, 7, %g1 and %g1, -8, %g1 mov 0, %g2 sub %sp, %g1, %sp @@ -57,7 +57,6 @@ add %g1, 4, %g1 add %i4, 1, %l7 sll %l7, 3, %g1 - add %g1, 8, %g1 <--- somewhat suspicious mov 0, %g2 sub %sp, %g1, %sp add %sp, 96, %i3 @@ -70,7 +69,7 @@ add %g1, 8, %g1 add %i5, 1, %o5 sll %o5, 2, %g1 - add %g1, 10, %g1 + add %g1, 7, %g1 and %g1, -8, %g1 mov 0, %g2 sub %sp, %g1, %sp @@ -87,7 +86,7 @@ mov 0, %g1 add %l4, %l4, %g2 mov -6, %g4 - add %g2, 8, %g2 + add %g2, 7, %g2 and %g2, -8, %g2 sub %sp, %g2, %sp add %sp, 92, %i5 @@ -427,12 +426,11 @@ add %g4, 4, %o7 add %g4, %g4, %o4 sll %o7, 3, %o3 - add %o4, 8, %g1 - add %o3, 8, %g2 <--- somewhat suspicious + add %o4, 7, %g1 + sub %sp, %o3, %sp and %g1, -8, %g1 - sub %sp, %g2, %sp ... Note that some allocation sizes were reduced from x+10 or x+8 to x+7. This is what the patch is about. The two "add ... 8 ..." that have vanished may or may not have something to do with the problem. 
Possible causes of the symptom are:
1) The patch does not handle some corner case correctly.
2) There is an off-by-one bug in foo() that I've missed.
3) Off-by-one in libgomp.
4) 32 bit stack layout on SPARC is slightly broken. (32 bit AIX had such a problem caused by bad alignment of the dynamic stack variables.)
To pin it down, it would help to have a simpler failing testcase than the ones from libgomp, reduced to the minimum if possible. Is this limited to libgomp or are there other testcases that started failing? Also, access to such a SPARC system would help.
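The size arithmetic visible in the dumps above can be sketched in C. This is only an illustration of the two instruction pairs ("add x, 10 / and x, -8" before the patch, "add x, 7 / and x, -8" after it); the function names round_up and old_alloc_size are made up for this sketch and are not taken from the Gcc sources.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align (a power of two).
   With align == 8 this is the patched "add %g1, 7 / and %g1, -8" pair. */
static size_t round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* The pre-patch pair "add %g1, 10 / and %g1, -8" (hypothetical helper name):
   adding 10 instead of 7 over-allocates whenever size is already a
   multiple of 8. */
static size_t old_alloc_size(size_t size)
{
    return (size + 10) & ~(size_t)7;
}
```

For an 8 byte request, round_up(8, 8) yields 8 while old_alloc_size(8) yields 16. For a 12 byte request both yield 16, which would match the observation that the old code was too large "most of the time, but not always".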
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #8 from Dominik Vogt --- Some things to try with reduction-10.c:
1) Remove all OMP pragmas from the code. If it still fails it's not a libgomp bug.
2) Replace "p7" in foo with just "7". If it still fails we know the bug is not triggered by the dynamic allocation of a or b.
[Bug target/77822] [6 Regression] arm64 Error: immediate value out of range 0 to 63 at operand 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77822 --- Comment #31 from Dominik Vogt --- No more backports, but the S390 fix for trunk is still in the queue. After it gets in, the bug can be resolved.
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #11 from Dominik Vogt --- (In reply to r...@cebitec.uni-bielefeld.de from comment #9)
> > 2) Replace "p7" in foo with just "7". If it still fails we know the bug is
> > not triggered by the dynamic allocation of a or b.
>
> ... but stays this way.
Good, the assembly diff has shrunk a lot:
--
@@ -8,7 +8,7 @@
        ld      [%i0+4], %g4
        add     %g4, 1, %i3
        sll     %i3, 2, %g1
-       add     %g1, 10, %g1
+       add     %g1, 7, %g1     <--- add (8 - 1) bytes
        and     %g1, -8, %g1    <--- round down to multiple of 8
        mov     0, %g2
        sub     %sp, %g1, %sp
@@ -25,7 +25,6 @@
        add     %g1, 4, %g1
        add     %g3, 1, %i2
        sll     %i2, 3, %g1
-       add     %g1, 8, %g1     <--- what was this good for?
        mov     0, %g2
        sub     %sp, %g1, %sp
        add     %sp, 96, %i5
--
The marked instructions in the first chunk do look like the calculations of the dynamic stack area's address. The reduced source code does not have dynamic stack allocation, so that must come from libgomp. The next step is to figure out how libgomp generates instructions. Can you provide tree dumps for both Gccs?
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #14 from Dominik Vogt --- Is the dynamic variable stack area properly aligned? Since sparc.h does not define STACK_DYNAMIC_OFFSET it should be aligned to STACK_BOUNDARY, i.e. 64 bits.
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #16 from Dominik Vogt --- In emit-rtl.c:init_emit(), the alignment of the virtual_stack_dynamic pointer is hard coded to STACK_BOUNDARY:

  REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) = STACK_BOUNDARY;

The backend must make sure that this promise is kept. If the promise is being broken here, the Sparc backend needs a fix similar to this AIX patch: https://gcc.gnu.org/ml/gcc-patches/2016-11/msg01036.html (r242589) The idea (on AIX) is to round up the allocation size of the parameters area if the function does dynamic allocation (calls_alloca is true). This logic had to be replicated in some macros in aix.h. A solution for Sparc probably looks similar.
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #18 from Dominik Vogt --- Another approach may be to make the middleend ask the backend for the actual value of REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM). Since on Sparc the address is always 4 mod 8, we'd get an additional gap for *each* alloca() if the size is still required to be a multiple of STACK_BOUNDARY. To prevent this it would also be necessary to adapt the logic in explow.c:get_dynamic_stack_size(). Since a recent patch this function also uses REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) as the alignment of the beginning of that block, but still rounds the size up to a multiple of STACK_BOUNDARY (explow.c:round_push()):

-- get_dynamic_stack_size() --
  /* Round the size to a multiple of the required stack alignment.
     Since the stack is presumed to be rounded before this allocation,
     this will maintain the required alignment.

     If the stack grows downward, we could save an insn by subtracting
     SIZE from the stack pointer and then aligning the stack pointer.
     The problem with this is that the stack pointer may be unaligned
     between the execution of the subtraction and alignment insns and
     some machines do not allow this.  Even on those that do, some
     signal handlers malfunction if a signal should occur between those
     insns.  Since this is an extremely rare event, we have no reliable
     way of knowing which systems have this problem.  So we avoid even
     momentarily mis-aligning the stack.  */
  if (size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0)
    {
      size = round_push (size);
-- END --

-- round_push() --
/* Round the size of a block to be pushed up to the boundary required
   by this machine.  SIZE is the desired size, which need not be constant.  */

static rtx
round_push (rtx size)
{
  rtx align_rtx, alignm1_rtx;

  if (!SUPPORTS_STACK_ALIGNMENT
      || crtl->preferred_stack_boundary == MAX_SUPPORTED_STACK_ALIGNMENT)
    {
      int align = crtl->preferred_stack_boundary / BITS_PER_UNIT;

      if (align == 1)
        return size;

      if (CONST_INT_P (size))
...
      align_rtx = GEN_INT (align);
      alignm1_rtx = GEN_INT (align - 1);
-- END --

It looks quite tricky to change this code to deal with preferred_stack_boundary and REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) at the same time. What if REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) is smaller than STACK_BOUNDARY and preferred_stack_boundary is larger than STACK_BOUNDARY? In the end, both approaches result in the same amount of memory being allocated.
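The "additional gap for each alloca()" mentioned above can be illustrated with a small C sketch. It assumes, as the comment does, that the start of the dynamic area is 4 mod 8 while each block must begin on an 8 byte boundary. The helper names align_up and gap_before_block and the sample addresses in the note below are hypothetical and do not appear in the Gcc sources.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Bytes skipped between the (possibly misaligned) start of the dynamic
   area and the first address that satisfies the requested alignment. */
static uintptr_t gap_before_block(uintptr_t area_start, uintptr_t align)
{
    return align_up(area_start, align) - area_start;
}
```

With an 8 byte aligned stack pointer of, say, 0x1000 and the offset 92 seen in one of the dumps, gap_before_block(0x1000 + 92, 8) is 4, while the offset 96 gives a gap of 0. If the size is additionally rounded to a multiple of 8, that 4 byte gap would be paid on every allocation.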
[Bug libgomp/78468] [7 regression] libgomp.c/reduction-10.c and many more FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468 --- Comment #20 from Dominik Vogt --- (In reply to Eric Botcazou from comment #19)
> I think that the patch is simply incorrect and should be reverted, it very
> likely breaks other ports than PowerPC and SPARC and the failure mode is
> quite nasty.
It does not break anything that wasn't broken before. The Sparc backend was just _lucky_ that the allocation code in the middleend was _broken_. Otherwise Gcc for Sparc (and AIX) would have generated code that makes dynamic allocations with alloca() overlap. (Actually this patch already helped to identify dynamic array bounds violations in some Gcc library and Glibc that were real bugs that were hidden by Gcc's over-allocation but possibly not by other compilers. The unpatched Gcc promotes array bounds violations in user code by providing some surprising extra space that covers programming bugs most of the time.)
> IMO it's fundamentally backwards: instead of making it so that the alignment
> of VIRTUAL_STACK_DYNAMIC_REGNUM is honored by every dynamic allocation, it
> assumes that it is already honored to optimize the dynamic allocation.
The patch fixes the bug that causes dynamic stack allocation to overestimate the needed space on the stack most of the time. To do this, it uses information available from elsewhere in the middleend. It turns out that the backend (or middleend, depending on the point of view) lies about the alignment of VIRTUAL_STACK_DYNAMIC_REGNUM. There may be _other_ users of that value that fail to do their job because they think the stored alignment is correct. Such users may do worse things than wasting some stack space - we may just have not noticed them yet. So, there is _another_ bug in the backends (or the middleend) that needs to be fixed. It's not "one fix instead of another" - there are two bugs that need two separate fixes.
--
You say this should rather be fixed in the middleend, but actually it (i.e. 
both bugs) _cannot_ be fixed in the middleend without correct alignment information from the backend: Consider this program:

-- snip --
__attribute__ ((noinline)) int *foo(int a1, int a2, int a3, int a4, int a5, int a6, int *pl, int *px, int *d, int *e)
{
  return d + a1 + a2 + a3 + a4 + a5 + a6;
}

int main(int argc, char **argv)
{
  int l;
  int x[argc];
  int *p;
  __attribute__ ((aligned(4))) int d[argc];
  __attribute__ ((aligned(8))) int e[argc];

  p = foo(argc + 1, argc + 2, argc + 3, argc + 4, argc + 5, argc + 6, &l, x, d, e);
  return (int)p;
}
-- snip --

Compiling it on Sparc (without the discussed patch) with "gcc -O3 -m32 -S test.c" produces this assembly output:

-- snip --
main:
        save    %sp, -120, %sp
        sll     %i0, 2, %g1     ; i0 = 2 -> g1 = 8
        add     %g1, 10, %g2    ; g2 = 18
        add     %g1, 14, %g1    ; g1 = 22
        and     %g2, -8, %g2    ; g2 = 16
        and     %g1, -8, %g1    ; g1 = 16
        sub     %sp, %g2, %sp
        add     %sp, 108, %g3   ; g3 = fp - 28 (x)
        sub     %sp, %g2, %sp
        add     %sp, 108, %g2   ; g2 = fp - 44 (d)
        sub     %sp, %g1, %sp
        add     %sp, 112, %g1   ; g1 = fp - 56 (e)
        st      %g1, [%sp+104]
        add     %fp, -4, %g1    ; g1 = fp - 4 (&l)
        ... (set %o registers)
        st      %g2, [%sp+100]
        st      %g3, [%sp+96]
        call    foo, 0
        st      %g1, [%sp+92]
-- snip --

So, the unpatched stack layout is:

  fp        +----------------------+ sp0
            |          l           |
  fp - 4    +----------------------+
            |     // wasted //     | <--- where does this come from?
            |     // space  //     |
  fp - 12   +----------------------+ <--- start of dynamic allocation area
            |     // wasted //     |
            |     // space  //     |
  fp - 20   +----------------------+
            |         x[1]         |
            |         x[0]         |
  fp - 28   +----------------------+ <--- dynamic allocation
            |     // wasted //     |
            |     // space  //     |
  fp - 36   +----------------------+
            |         d[1]         |
            |         d[0]         |
  fp - 44   +----------------------+ <--- dynamic allocation
            |     # padding ##     |
  fp - 48   +----------------------+
            |         e[1]         |
            |         e[0]         |
  fp - 56   +----------------------+ <--- dynamic allocation
            | /// wasted space /// |
  fp - 60   +----------------------+
            |    outarg 10 (e)     |
            |    outarg 9 (d)      |
            |    outarg 8 (x)      |
            |    outarg 7 (&l)     |
  fp - 76   +----------------------+
            |         ...          |
  fp - 120  +----------------------+ sp1
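The annotated offsets in the assembly above can be re-derived with a few lines of C. This is only a sketch that replays the "sub %sp" / "add %sp" pairs for argc = 2, with all values expressed relative to %fp; the struct and function names (alloc_result, alloca_step, unpatched_size) are invented for the illustration.

```c
#include <assert.h>

/* Result of one dynamic allocation: the new stack pointer and the address
   handed to the program, both as offsets from %fp. */
struct alloc_result {
    int sp;
    int addr;
};

/* Unpatched size computation from the dump: (bytes + addend) & -8,
   where addend is 10 for x and d and 14 for e. */
static int unpatched_size(int bytes, int addend)
{
    assert(bytes >= 0 && addend >= 0);
    return (bytes + addend) & -8;
}

/* One "sub %sp, size, %sp" followed by "add %sp, bias, %reg". */
static struct alloc_result alloca_step(int sp, int size, int bias)
{
    struct alloc_result r;
    r.sp = sp - size;
    r.addr = r.sp + bias;
    return r;
}
```

Starting from sp = -120 (after "save %sp, -120, %sp"), three steps with sizes unpatched_size(8, 10), unpatched_size(8, 10) and unpatched_size(8, 14) and biases 108, 108 and 112 give the addresses -28, -44 and -56, i.e. fp - 28 (x), fp - 44 (d) and fp - 56 (e) as in the layout above. Each step moves the stack pointer by 16 bytes although every array only needs 8.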
[Bug target/78633] [7 Regression] [SH] libgcc/fp-bit.c:944:1: error: invalid rtl sharing found in the insn
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78633 Dominik Vogt changed: What|Removed |Added CC||vogt at linux dot vnet.ibm.com --- Comment #8 from Dominik Vogt --- There's a typo in the patch. Should be reverted in a minute. Sorry for the trouble.