[Bug d/91628] libdruntime uses glibc internal symbol on s390
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91628 --- Comment #15 from rdapp at linux dot ibm.com --- Any feedback on the two options I proposed? Is the .S file variant (I posted last) ok?
[Bug go/89123] Too many go test failures on s390x-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89123 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #4 from rdapp at linux dot ibm.com --- I'm going to have a look.
[Bug go/89123] Too many go test failures on s390x-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89123 --- Comment #5 from rdapp at linux dot ibm.com --- I performed a bisect using const-1.go as check and got the following likely culprit: b0751b120f1b83d8e48a7c2cac831aabbb0bc048 is the first bad commit commit b0751b120f1b83d8e48a7c2cac831aabbb0bc048 Author: ian Date: Mon Sep 24 21:46:21 2018 + libgo: update to Go 1.11 Reviewed-on: https://go-review.googlesource.com/136435 (rev. 264546) Dominik Vogt is no longer with IBM, so I'm going to look into it. I have no experience with go yet, though. Might this simply be a case of an oversight regarding big endian? Do we have another big-endian backend where go works?
[Bug go/89123] Too many go test failures on s390x-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89123 --- Comment #7 from rdapp at linux dot ibm.com --- I did a full debug build of libgo and noticed that this changes the behavior of the executable. When it would segfault with default -O2 before, it now seems to rapidly allocate gigabytes of memory. This happens in doInit() in cpu_s390x.go:121 where we detect the CPU facilities. Apparently the stfle call used to be in cpu_s390x.s which does not exist anymore. Hence, the { panic("not implemented for gccgo") } gets triggers and we end up in panic.go:133 if len(pp.deferpool) == 0 && sched.deferpool != nil { pp is 0x0 here and "__go_runtime_error" tries to handle this by a runtime_panicstring which itself tries to defer again and so on. Is cpu_s390x.s missing on purpose i.e. should it have been replaced by something else?
[Bug go/89123] Too many go test failures on s390x-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89123 --- Comment #9 from rdapp at linux dot ibm.com --- Thanks for the pointer, I implemented the functions and now the startup seems to be fully functional again. I'm still checking whether the remaining 50ish libgo test suite failures I see are due to my changes or something else.
[Bug go/89123] Too many go test failures on s390x-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89123 --- Comment #10 from rdapp at linux dot ibm.com --- Created attachment 45621 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45621&action=edit Tentative patch for libgo on s390x I didn't manage to make much progress with analyzing the remaining FAILs but I guess this can wait until after this bug. Is there an easy/preferred way to build and debug a single test case without having to manually add a plethora of dependency arguments? Attached is a tentative patch that works for me on s390x and reduces the number of FAILs significantly. Does it look reasonable?
[Bug c++/85600] [9 Regression] CPU2006 471.omnetpp fails starting with r259771
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85600 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #4 from rdapp at linux dot ibm.com --- This also happens on s390x, even with -O0 and no new -march setting. The following line in EtherMac.cc delete outputbuffer.pop() actually emits two calls to cQueue::pop (). On the second call, the queue is empty and the program terminates. Already in the .original pass, a SAVE_EXPR that was there before is not present anymore: if (SAVE_EXPR
[Bug c++/85600] [9 Regression] CPU2006 471.omnetpp fails starting with r259771
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85600 --- Comment #6 from rdapp at linux dot ibm.com --- This hunk causes the double pop(): @@ -4650,8 +4648,6 @@ build_delete (tree otype, tree addr, special_function_kind auto_delete, } } } - if (TREE_SIDE_EFFECTS (addr)) - addr = save_expr (addr); /* Throw away const and volatile on target type of addr. */ addr = convert_force (build_pointer_type (type), addr, 0, complain); addr has the side-effects bit set at that point.
[Bug testsuite/91549] [10 regression] gcc.dg/wrapped-binop-simplify.c fails starting with r274925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91549 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #1 from rdapp at linux dot ibm.com --- Mhm, probably ivopts does not generate the pattern we try to simplify in the first place on powerpc. Maybe we should rather opt in this test case to s390 and i386 for now and not run it on other targets?
[Bug testsuite/91549] [10 regression] gcc.dg/wrapped-binop-simplify.c fails starting with r274925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91549 --- Comment #3 from rdapp at linux dot ibm.com --- This fails a lot more than it should. I'm looking into it.
[Bug testsuite/91549] [10 regression] gcc.dg/wrapped-binop-simplify.c fails starting with r274925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91549 --- Comment #4 from rdapp at linux dot ibm.com --- Would this be ok? diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c index 44d85c04bfb..0d83384cd0a 100644 --- a/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c @@ -1,5 +1,5 @@ -/* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp2-details" } */ +/* { dg-do compile { target x86_64-*-* s390x-*-* } } */ +/* { dg-options "-O2 -fdump-tree-vrp2-details -m64" } */ /* { dg-final { scan-tree-dump-times "gimple_simplified to" 4 "vrp2" } } */ void v1 (unsigned long *in, unsigned long *out, unsigned int n)
[Bug testsuite/91549] [10 regression] gcc.dg/wrapped-binop-simplify.c fails starting with r274925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91549 --- Comment #6 from rdapp at linux dot ibm.com --- (In reply to Uroš Bizjak from comment #5) > This approach is not OK, you should use lp64 effective target instead of > -m64 option. Please see many examples in testsuite/gcc.dg. > > Also, x86 is a multilib target, described as { i?86-*-* x86_64-*-* }. Would this one do? diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c b/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c index 44d85c04bfb..a5d953b46c7 100644 --- a/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c +++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify.c @@ -1,4 +1,4 @@ -/* { dg-do compile } */ +/* { dg-do compile { target { { i?86-*-* x86_64-*-* s390*-*-* } && lp64 } } } */ /* { dg-options "-O2 -fdump-tree-vrp2-details" } */ /* { dg-final { scan-tree-dump-times "gimple_simplified to" 4 "vrp2" } } */
[Bug testsuite/91549] [10 regression] gcc.dg/wrapped-binop-simplify.c fails starting with r274925
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91549 --- Comment #8 from rdapp at linux dot ibm.com --- > Yes, I think so. Is this an OK to commit? :) I checked it on s390 and x86_64 with -m64 and -m31/-m32.
[Bug d/91628] libdruntime uses glibc internal symbol on s390
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91628 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #7 from rdapp at linux dot ibm.com --- Created attachment 46817 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46817&action=edit Proposed patch using __tls_get_offset I drafted a patch that uses __tls_get_offset instead of the internal symbol following Florian's idea. Test suite looks similar to before. Is it OK? Apart from that patch, however, I noticed that we FAIL in libphobos.allocations/tls_gc_integration.d libphobos.thread/tlsgc_sections.d but I'm not sure yet how serious that is.
[Bug d/91628] libdruntime uses glibc internal symbol on s390
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91628 --- Comment #9 from rdapp at linux dot ibm.com --- (In reply to Florian Weimer from comment #8) > Calling functions from inline assembly is always a bit iffy. For example, > your code lacks clobbers for the vector registers (if present) and the > condition code register. I don't know if s390/s390x has a red zone, or > specific call frame setup requirements (the psABI is ambiguous regarding the > latter for functions whose arguments all fit into registers, as it is the > case here). > > I would suggest not to use inline assembly for this purpose. Right about the missing registers. I was wary about building with an older GCC but the support only started with GCC9 so this should not be a problem. Apparently we can also add vector clobbers with -mno-vx. I opted for inline assembly to make sure r12 is not changed directly before the function call. Do you have an idea to guarantee this in another way?
[Bug d/91628] libdruntime uses glibc internal symbol on s390
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91628 --- Comment #13 from rdapp at linux dot ibm.com --- Created attachment 46859 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46859&action=edit __tls_get_offset in separate .S files As there were no further remarks as to which version is preferred (separate files or as before) I also prepared a patch using separate .S files. I included common/threadasm.s for "csym" but I guess threadasm should be named differently in the future.
[Bug go/89277] [9 Regression] libgo memory hogs in libgo testsuite (at least on s390x-linux-gnu)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89277 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #1 from rdapp at linux dot ibm.com --- Guessing this is the same problem as in PR89123 (triggering a panic in a panic, never stopping to allocate memory). I already proposed a patch that might also help here.
[Bug go/89123] Too many go test failures on s390x-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89123 --- Comment #11 from rdapp at linux dot ibm.com --- Ping.
[Bug ipa/85103] [8/9 Regression] Performance regressions on SPEC with r257582
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85103 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #13 from rdapp at linux dot ibm.com --- Also confirmed on s390x (checked on z13 and z14). We see a 10% slowdown for 458.sjeng which first occurred with this patch. Any plans on how to continue with this?
[Bug tree-optimization/100756] New: vect: Superfluous epilog created on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100756 Bug ID: 100756 Summary: vect: Superfluous epilog created on s390x Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: rdapp at linux dot ibm.com Target Milestone: --- Since g:d846f225c25c5885250c303c8d118caa08c447ab we create an epilog loop on s390 for the following test case: /* { dg-do compile } */ /* { dg-options "-O3 -mzarch -march=z13" } */ /* { dg-require-effective-target s390_vx } */ int foo (int * restrict a, int n) { int i, result = 0; for (i = 0; i < n * 4; i++) result += a[i]; return result; } vec.c:10:17: note: epilog loop required The following check in tree-vect-loop.c:vect_need_peeling_or_partial_vectors_p() is now true: || ((tree_ctz (LOOP_VINFO_NITERS (loop_vinfo)) < (unsigned) exact_log2 (const_vf)) We now have LOOP_VINFO_NITERS (loop_vinfo) = _15 > 0 ? (unsigned int) _15 : 1 as compared to (unsigned int) _15 before. tree_ctz() returns 0 for the conditional and 2 before which did not trigger the epilog requirement. may_be_zero is _15 > 0 so it looks to me like we rather want to check the not-may_be_zero part of niter for alignment. Not sure if this is the right/safe thing to do, though.
[Bug tree-optimization/100756] vect: Superfluous epilog created on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100756 --- Comment #2 from rdapp at linux dot ibm.com --- I did not get back to this until now. The patch works, of course and a testsuite run looks good so far. I assume we're too late in the cycle to still get this in, right?
[Bug middle-end/104026] [12 Regression] ICE in wide_int_to_tree_1, at tree.c:1755 via tree-vect-loop-manip.c:673
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104026 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #4 from rdapp at linux dot ibm.com --- This code path should only be active when the backend has len_load/len_store patterns. signed char partial_load_store_bias; is new in the "middle" of loop_vect_info. Could this become overwritten/corrupted by something?
[Bug middle-end/104026] [12 Regression] ICE in wide_int_to_tree_1, at tree.c:1755 via tree-vect-loop-manip.c:673
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104026 --- Comment #8 from rdapp at linux dot ibm.com --- I think you're right. In one of the last iterations of the patch I moved + LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) = partial_load_bias; after the unsupported check. It is now only set to something meaningful if it is not unsupported. The following should help: diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index 8748b1a5593..3c87920090b 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -1170,6 +1170,9 @@ vect_verify_loop_lens (loop_vec_info loop_vinfo) machine_mode len_store_mode = get_len_load_store_mode (loop_vinfo->vector_mode, false).require (); + LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo) = +VECT_PARTIAL_BIAS_UNSUPPORTED; + signed char partial_load_bias = internal_len_load_store_bias (IFN_LEN_LOAD, len_load_mode);
[Bug middle-end/104026] [12 Regression] ICE in wide_int_to_tree_1, at tree.c:1755 via tree-vect-loop-manip.c:673
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104026 --- Comment #10 from rdapp at linux dot ibm.com --- Created attachment 52192 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52192&action=edit Proposed patch Could you try the proposed patch? Bootstraps cleanly for me and no regressions on Power or x86.
[Bug rtl-optimization/104153] [12 Regression] ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104153 --- Comment #1 from rdapp at linux dot ibm.com --- I was able to reproduce the ICE on a cross compiler. Curiously, we do not even succeed with if-conversion here but nevertheless emit an insn (jump_insn 78 9 79 6 (set (pc) (if_then_else (ne:BI (reg:BI 34 ?sr_f) (reg:SI 160)) (label_ref:SI 218) (pc))) "../../../../../../..//newlib-cygwin/newlib/libm/math/s_floor.c":90:14 77 {*cbranch} (expr_list:REG_DEAD (reg:BI 34 ?sr_f) (int_list:REG_BR_PROB 536870916 (nil))) -> 218) while reg:SI 160 is nowhere to be found. Most likely the way I emit the sequences has side effects that I failed to properly revert. Everything should be inside begin_sequence () end_sequence () blocks but that is not sufficient. I guess if the optabs interfaces are exposed and used like in the patch we also need delete_insns_since () or similar? The problem is most likely latent on other targets as well.
[Bug rtl-optimization/104153] [12 Regression] ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104153 --- Comment #3 from rdapp at linux dot ibm.com --- Both of your guesses are correct :) or1k_expand_compare () indeed modifies the condition/comparison in-place. As I use cc_cmp directly from the condition of the jump it is changed but never reverted. The following helps for me: diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index fe250d508e1..20debc0822a 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -3390,7 +3390,7 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, /* Decompose the condition attached to the jump. */ rtx cond = noce_get_condition (jump, &cond_earliest, false); - rtx cc_cmp = cond_exec_get_condition (jump); + rtx cc_cmp = copy_rtx (cond_exec_get_condition (jump)); rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true); rtx_insn *insn;
[Bug rtl-optimization/104198] [12 regression] ifcvf change breaks 64-bit SPARC bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104198 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #6 from rdapp at linux dot ibm.com --- I can take a look - would one of the compile farm SPARC machines be ok to reproduce this? I'm currently trying on gcc64, but that's an OpenBSD.
[Bug rtl-optimization/104198] [12 regression] ifcvf change breaks 64-bit SPARC bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104198 --- Comment #8 from rdapp at linux dot ibm.com --- One of the s390-"specific" execution testcases fails on SPARC using the wrong order of operands. I'm pretty sure this is the problem. Going to investigate further. Seeing this, it might make sense to make the testcases target independent if the testsuite was indeed clean after the ifcvt patches.
[Bug rtl-optimization/104198] [12 regression] ifcvf change breaks 64-bit SPARC bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104198 --- Comment #9 from rdapp at linux dot ibm.com --- I believe I know what's happening and it's indeed something that could also happen on other targets. I did not anticipate backends re-creating the initial condition for the case when we pass it a cc comparison. The SPARC backend does this, using a (now) already overwritten value for comparison. Other targets could also do this but maybe do not currently but it would always lead to wrong code. Not yet sure about a proper solution. Since we go over the insns twice anyway we could check the created sequences for overlap with the condition again and mark them. In the second iteration we would make sure to not overwrite the condition unless there is no insn later that uses it. Hope to come up with something testable by today.
[Bug rtl-optimization/104198] [12 regression] ifcvf change breaks 64-bit SPARC bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104198 --- Comment #10 from rdapp at linux dot ibm.com --- Created attachment 52297 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52297&action=edit Tentative patch I now have something that successfully bootstraps on s390x, PowerPC and SPARC but I'm not really happy with it. The idea is basically to check if the newly introduced sequence (getting passed a cc comparison) emits an instruction other than the movcc that reads a register used in the comparison. I'm not sure what backends are supposed to emit for such a sequence but most likely everything is admissible. reg_overlap_mentioned_p does not handle every rtx_code so special handling is necessary which needs to be exhaustive in order for reg_overlap_mentioned_p not to ICE. I feel like this is not the best way to achieve what we want and would appreciate some insight. I attached the current full diff to master (including the fix for the problem triggered by the or1k backend. It's pretty raw/unpolished/ugly but the idea should come across. A full testsuite run on gcc202 just finished, 83 FAILs, but I haven't yet done a comparison run from before the ifcvt changes.
[Bug rtl-optimization/104198] [12 regression] ifcvt change breaks 64-bit SPARC bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104198 --- Comment #13 from rdapp at linux dot ibm.com --- I was away for some days, going to look into this again today.
[Bug rtl-optimization/104153] [12 Regression] ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104153 --- Comment #5 from rdapp at linux dot ibm.com --- I would speculate that some of the FAILs are due to the same problem seen in the other PR (104198), i.e. that for the second seq I wrongly assumed that the backend does not recreate the original comparison as well as not read from possibly overwritten regs. Regarding the test failures - in case you don't have anything yet, I could build a vanilla and a patched cross compiler and look and the differences for some of them.
[Bug rtl-optimization/104198] [12 regression] ifcvt change breaks 64-bit SPARC bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104198 --- Comment #14 from rdapp at linux dot ibm.com --- Ok, this is triggered by the copy_rtx I introduced for the or1k failure: + rtx rev_cc_cmp = copy_rtx (cond_exec_get_condition (jump, /* get_reversed */ true)); because copy_rtx is called with NULL_RTX here. I'm now testing with --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -3390,8 +3390,12 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, /* Decompose the condition attached to the jump. */ rtx cond = noce_get_condition (jump, &cond_earliest, false); - rtx cc_cmp = copy_rtx (cond_exec_get_condition (jump)); - rtx rev_cc_cmp = copy_rtx (cond_exec_get_condition (jump, /* get_reversed */ true)); + rtx cc_cmp = cond_exec_get_condition (jump); + if (cc_cmp) +cc_cmp = copy_rtx (cc_cmp); + rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true); + if (rev_cc_cmp) +rev_cc_cmp = copy_rtx (rev_cc_cmp); but the reg_overlap_mentioned_p part still needs to be improved (independently).
[Bug rtl-optimization/104153] [12 Regression] ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104153 --- Comment #6 from rdapp at linux dot ibm.com --- One thing wrong with the previous snippet is that cc_cmp and rev_cc_cmp can be NULL. This can also be the reason for your failures as seen on SPARC.
[Bug rtl-optimization/104198] [12 regression] ifcvt change breaks 64-bit SPARC bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104198 --- Comment #15 from rdapp at linux dot ibm.com --- Testsuite is same as before the original patch now. Going to post a patch to the mailing list later.
[Bug target/104335] [12 regression] build failure if go is included in languages after r12-6747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #1 from rdapp at linux dot ibm.com --- The problem is that we pass a CC comparison to rs6000_emit_cmove (). In the sequence before it will return false via if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode)) because compare_mode == SFmode and result_mode == DImode. Now it continues with compare_mode == CCFPmode and result_mode still DImode. Finally, in rs6000_generate_compare the validation fails. It looks like this function does not expect CC comparison. It looks like you always try to generate one in the .md file already, regardless of whether the incoming comparison already is a CC comparison or not. I'm not really sure how to proceed. I guess CCFPmode is not a real FLOAT_MODE but maybe we should not continue at this point anyway? Or would we need some other mechanism now to make backends aware that this situation can occur now? I haven't checked all of them but I was expecting the backend to just return NULL_RTX if it cannot handle a specific combination of ops.
[Bug target/104335] [12 regression] build failure if go is included in languages after r12-6747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335 --- Comment #4 from rdapp at linux dot ibm.com --- Hi Segher, originally ifcvt would only pass e.g. (unle (reg:SF 129 [ _29 ]) (reg/v:SF 118 [ highScore ])) as condition to rs6000_emit_cmove via emit_conditional_move (). (This is the example from the ICE). dest = (reg/f:DI 122 [ bestFuzz$__object ]) The check if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode)) fails and we return false i.e. the expander fails and returns NULL_RTX. This is fine as we will just not do anything with it in ifcvt. Now with the patch rs6000_emit_cmove is passed (unle (reg:CCFP 153) (const_int 0 [0])). This CC comparison has already been computed before so the backend actually needs to do less work and could just use it without preparing a comparison. This helps costing on the ifcvt side. dest is the same as before but now we have FLOAT_MODE_P (compare_mode) == false and if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode)) not causing a return false. We then call rs6000_emit_int_cmove which in turn calls rs6000_generate_compare. There we try to emit_insn (gen_rtx_SET (compare_result, gen_rtx_COMPARE (comp_mode, op0, op1))); but ICE since the modes don't match or make sense. >From an ifcvt point of view we would just expect a NULL_RTX if something cannot be handled. We did not pass CC compares to backends before, so I don't know how reasonable this assumption is :)
[Bug rtl-optimization/104154] [12 Regression] Another ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154 --- Comment #1 from rdapp at linux dot ibm.com --- Strange, I didn't receive a mail/notification for this PR all, otherwise I would have looked into it earlier. This has been happening a few times lately, grml. Looking into it now.
[Bug rtl-optimization/104154] [12 Regression] Another ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154 --- Comment #2 from rdapp at linux dot ibm.com --- Yes, your guess was right again. We ICE here: gcc_assert (cmode == SImode || cmode == SFmode || cmode == DFmode); but cmode == E_CCmode with the patch. This already helps and the resulting sequences are less expensive: diff --git a/gcc/config/arc/arc.cc b/gcc/config/arc/arc.cc index 8cc173519ab..d802c45f9af 100644 --- a/gcc/config/arc/arc.cc +++ b/gcc/config/arc/arc.cc @@ -2254,6 +2254,8 @@ gen_compare_reg (rtx comparison, machine_mode omode) cmode = GET_MODE (x); + if (cmode == CCmode) +return comparison; if (cmode == VOIDmode) cmode = GET_MODE (y); gcc_assert (cmode == SImode || cmode == SFmode || cmode == DFmode); As we are trying to ifcvt very long sequences (5+ insns) the overall outcome does not change, though.
[Bug rtl-optimization/104154] [12 Regression] Another ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154 --- Comment #3 from rdapp at linux dot ibm.com --- Power(10) also saw a similar problem where the backend was not prepared to handle what we are passing now. I'm starting to become a bit concerned now that more backends might (latently) not be able to deal with this.
[Bug target/104335] [12 regression] build failure if go is included in languages after r12-6747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335 --- Comment #6 from rdapp at linux dot ibm.com --- Created attachment 52406 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52406&action=edit Tentative patch
[Bug target/104335] [12 regression] build failure if go is included in languages after r12-6747
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104335 --- Comment #7 from rdapp at linux dot ibm.com --- Sorry for not getting back to this earlier, talked to Segher off-list about this quickly. >> The check >> >> if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode)) > > With compare_mode SFmode, and result_mode, what, SImode? Or VOIDmode? > In either case this would be true. > >> fails and we return false > > Which does not match my argument assumptions then, so they must be wrong. We had compare_mode == SFmode and result_mode == DImode before, making the whole if condition false, causing return false; > But this is done during expand only, costing is pretty irrelevant there. > expand should emit correct code, and later passes optimise that. Many of our > oldest problems are expand trying to "optimise" things :-( > > Or is this function called later as well? Yes this does not happen in expand here but during if-convert. >> dest is the same as before > > Which is? dest = (reg/f:DI 122 [ bestFuzz$__object ]) >> but now we have >> >> FLOAT_MODE_P (compare_mode) == false > > That is wrong then. > >> and >> >> if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode)) >> >> not causing a return false. > > So you meant that FLOAT_MODE (compare_mode) is true? The whole condition is now if (FLOAT_MODE (CCFPmode) && !FLOAT_MODE (DImode)) causing no return false. > Yes, rs6000_generate_compare does not expect the funny RTL for a condition > code as input: it wants a real comparison, instead, so not a representation > of the result of such a comparison. > Yes you are right that this is not something that is usually expected as a comparison. Maybe in the future we can find a better way of informing backends about both choices. > Hard to say, I don't find documentation for this. But, such a change is not > in scope for stage 4, no matter how you look at it :-( I agree that I should have at least updated the optab doc for this. Going to do so still. I attached a hack that just returns false whenever a CCmode compare is passed to the Power backend functions. This should get you back the behavior from before the patch. I managed to bootstrap on Power 10 with ./configure --enable-languages=all --with-cpu=power10 fwiw. The testsuite ran but I saw plenty of go FAILS. Making use of the passed "CC compare"/result is probably a little more involved but essentially it should be possible to get rid of all the preparation steps for CC generation in that case.
[Bug rtl-optimization/104154] [12 Regression] Another ICE due to recent ifcvt changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154 --- Comment #7 from rdapp at linux dot ibm.com --- This diff --git a/gcc/config/arc/arc.cc b/gcc/config/arc/arc.cc index 8cc173519ab..e9ea90631a2 100644 --- a/gcc/config/arc/arc.cc +++ b/gcc/config/arc/arc.cc @@ -2254,6 +2254,8 @@ gen_compare_reg (rtx comparison, machine_mode omode) cmode = GET_MODE (x); + if (GET_MODE_CLASS (cmode) == MODE_CC) +return comparison; if (cmode == VOIDmode) cmode = GET_MODE (y); gcc_assert (cmode == SImode || cmode == SFmode || cmode == DFmode); should be better and fixes the test case for me. Btw Segher is very much not in favor of this approach/patch and argues that reusing the comparison is the wrong thing to do as a "cc comparison" is not a comparison but rather the result of a comparison (PR104335).
[Bug ada/102450] GCC error: in set_min_and_max_values_for_integral_type, at stor-layout.c:2851
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102450 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #4 from rdapp at linux dot ibm.com --- Also happens on s390 with -m31 in stage 3: /home/rdapp/projects/gcc/build/./gcc/xgcc -B/home/rdapp/projects/gcc/build/./gcc/ -B/home/rdapp/gcc-fail/s390x-ibm-linux-gnu/bin/ -B/home/rdapp/gcc-fail/s390x-ibm-linux-gnu/lib/ -isystem /home/rdapp/gcc-fail/s390x-ibm-linux-gnu/include -isystem /home/rdapp/gcc-fail/s390x-ibm-linux-gnu/sys-include -fchecking=1 -c -g -O2 -m31 -fPIC -W -Wall -gnatpg -nostdinc -m31 g-sehash.adb -o g-sehash.o during GIMPLE pass: lower +===GNAT BUG DETECTED==+ | 12.0.0 20210923 (experimental) (s390x-ibm-linux-gnu) GCC error: | | in set_min_and_max_values_for_integral_type, at stor-layout.c:2851 | | Error detected around g-sehash.adb:98:4 | | Compiling g-sehash.adb | | Please submit a bug report; see https://gcc.gnu.org/bugs/ . | | Use a subject line meaningful to you and us to track the bug.| | Include the entire contents of this bug box in the report. | | Include the exact command that you entered. | | Also include sources listed below. | +==+ Please include these source files with error report Note that list may not be accurate in some cases, so please double check that the problem can still be reproduced with the set of files listed. Consider also -gnatd.n switch (see debug.adb). system.ads g-sehash.adb g-sehash.ads g-sechas.ads gnat.ads ada.ads a-stream.ads interfac.ads g-bytswa.ads a-sttebu.ads a-string.ads a-stuten.ads unchconv.ads s-exctab.ads s-stalib.ads a-unccon.ads a-tags.ads s-stoele.ads s-soflin.ads a-except.ads s-parame.ads s-traent.ads s-secsta.ads s-stache.ads s-putima.ads s-unstyp.ads g-sechas.adb s-stoele.adb raised TYPES.UNRECOVERABLE_ERROR : comperr.adb:414 precision=512 though, backtrace:#0 set_min_and_max_values_for_integral_type (type=type@entry=0x3fffb7c3000, precision=512, sgn=sgn@entry=UNSIGNED) at ../../gcc/stor-layout.c:2851 #1 0x01f141a8 in fixup_unsigned_type (type=type@entry=0x3fffb7c3000) at ../../gcc/stor-layout.c:2885 #2 0x01f14248 in make_unsigned_type (precision=) at ../../gcc/stor-layout.c:2708 #3 0x013c0698 in gnat_type_for_size (precision=, unsignedp=) at ../../gcc/ada/gcc-interface/utils.c:3670 #4 0x01b6584e in gimple_fold_builtin_memory_op (gsi=gsi@entry=0x3ffded8, dest=0x3fffb76f760, src=0x3fffb76e510, code=code@entry=BUILT_IN_MEMMOVE) at ../../gcc/gimple-fold.c:1004 #5 0x01b67a9e in gimple_fold_builtin (gsi=0x3ffded8) at ../../gcc/gimple.h:3297 #6 gimple_fold_call (gsi=gsi@entry=0x3ffded8, inplace=inplace@entry=false) at ../../gcc/gimple-fold.c:5588 #7 0x01b69732 in fold_stmt_1 (gsi=gsi@entry=0x3ffded8, inplace=inplace@entry=false, valueize=valueize@entry=0x1b500b0 ) at ../../gcc/gimple-fold.c:6290 #8 0x01b6b260 in fold_stmt (gsi=gsi@entry=0x3ffded8) at ../../gcc/gimple-fold.c
[Bug ada/102450] GCC error: in set_min_and_max_values_for_integral_type, at stor-layout.c:2851
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102450 --- Comment #5 from rdapp at linux dot ibm.com --- git bisect bad5f6a6c91d7c592cb49f7c519f289777eac09bb74 is the first bad commit commit 5f6a6c91d7c592cb49f7c519f289777eac09bb74 Author: Richard Earnshaw Date: Fri Sep 3 17:06:15 2021 +0100 gimple: allow more folding of memcpy [PR102125] [..]
[Bug tree-optimization/107617] SCC-VN with len_store and big endian
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107617 rdapp at linux dot ibm.com changed: What|Removed |Added CC||rdapp at linux dot ibm.com --- Comment #6 from rdapp at linux dot ibm.com --- Thank you and sorry for the delay, I was out on vacation. Will test it soon.
[Bug tree-optimization/107617] SCC-VN with len_store and big endian
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107617 --- Comment #7 from rdapp at linux dot ibm.com --- The patch fixes the problem for me. I did a full bootstrap with enabled len_load support. No new regressions with -march=arch14. Thanks again!