[Bug analyzer/118500] no diagnostics with strsep(3) and [[gnu::malloc(free)]] attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118500

Xi Ruoyao changed:

What|Removed |Added
CC||xry111 at gcc dot gnu.org

--- Comment #9 from Xi Ruoyao ---
(In reply to Alejandro Colomar from comment #8)
> (In reply to Alejandro Colomar from comment #7)
> > (In reply to Alejandro Colomar from comment #6)
> > > (In reply to David Malcolm from comment #5)
> > > > Thanks for filing this report.
> > >
> > > You're welcome! :-)
> > >
> > > > There are (at least) three -fanalyzer issues here:
> > > >
> > > > (a) false positive about leak of 'my_strdup("f,oo")':
> > > > https://godbolt.org/z/rKxhfxWGf
> > > > This is probably due to -fanalyzer getting confused by having both the
> > > > attribute and a function body. I think there's already a report about
> > > > this in BZ somewhere.
> > > >
> > > > (b) -fanalyzer doesn't "know" about the behavior of strsep beyond
> > > > "knowing" that it doesn't malloc or free anything internally. Hence it
> > > > doesn't know that it will advance s to a point within the buffer that's
> > > > not the start (and hence the later "free" is a bug).
> > >
> > > Agree.
> >
> > On the other hand, I think -fanalyzer shouldn't get fancy into learning about
> > strsep(3), because then it won't be able to analyze all the other functions
> > that modify a pointer. I think it should instead assume that a T** modifies
> > the pointer value (as opposed to a const T**).
>
> Oops, I meant a T*const*

Or T** having an access(none) or access(read_only) attribute.
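For readers unfamiliar with point (b), the following is a minimal sketch of why freeing the pointer that strsep(3) has updated is a bug. It is not the reporter's testcase; the helper name `cursor_offset_after_strsep` is made up for illustration, and the code assumes a libc that provides strsep() (glibc or a BSD libc).

```c
#define _DEFAULT_SOURCE /* exposes strsep() and strdup() on glibc */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicates 'input' on the heap, splits off the first
   comma-separated token with strsep(), and returns how far strsep() moved
   the cursor away from the start of the allocation.  Returns -1 if the
   allocation fails.  */
static ptrdiff_t
cursor_offset_after_strsep (const char *input)
{
  char *buf = strdup (input);
  if (buf == NULL)
    return -1;

  char *cursor = buf;           /* strsep() takes a char ** and updates it */
  (void) strsep (&cursor, ",");

  /* cursor is NULL when no delimiter was found; otherwise it points just
     past the delimiter, i.e. into the middle of the allocation.  */
  ptrdiff_t off = cursor ? cursor - buf : 0;

  free (buf);                   /* correct: free the original pointer */
  /* Calling free (cursor) instead would be invalid whenever off != 0.  */
  return off;
}
```

With the input "f,oo" the cursor ends up two bytes past the start of the buffer, so only the saved original pointer is safe to pass to free().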
[Bug c/118575] -ftrivial-auto-var-init=zero should perhaps imply -fzero-init-padding-bits=all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118575 --- Comment #1 from Sam James --- (The natural counterpoint being "well, they're not trivial", but my response would be "okay, but it's about the almost-guaranteed intent of the user.") Or perhaps at least we should mention it in the docs for -ftrivial-auto-var-init.
[Bug c/118575] New: -ftrivial-auto-var-init=zero should perhaps imply -fzero-init-padding-bits=all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118575 Bug ID: 118575 Summary: -ftrivial-auto-var-init=zero should perhaps imply -fzero-init-padding-bits=all Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: sjames at gcc dot gnu.org CC: jakub at gcc dot gnu.org, kees at outflux dot net, qing.zhao at oracle dot com Target Milestone: --- -ftrivial-auto-var-init=zero is intended for kernel-like usecases where they want to minimise the risk of uninitialised memory contents being leaked. Given the point of the option, it may make sense to have -ftrivial-auto-var-init=zero imply -fzero-init-padding-bits=all (unless overridden by the user), which I think is better aligned to what people will expect? I really doubt anyone using -ftrivial-auto-var-init=zero is worried about optimising the amount of initialisation work even if DSE may clean up a lot of it.
[Bug target/109832] aarch64: Inefficient code for logical or of two booleans
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109832 Andrew Pinski changed: What|Removed |Added Resolution|DUPLICATE |--- Last reconfirmed||2025-01-21 Ever confirmed|0 |1 Status|RESOLVED|NEW --- Comment #2 from Andrew Pinski --- Actually this is not a dup. I have a patch to fix PR 101806 but not this one.
[Bug sanitizer/118578] New: simple code crash on some AlmaLinux 8.4 but works on other machines of the same OS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118578

Bug ID: 118578
Summary: simple code crash on some AlmaLinux 8.4 but works on other machines of the same OS
Product: gcc
Version: 12.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: sanitizer
Assignee: unassigned at gcc dot gnu.org
Reporter: Yang.Li at synopsys dot com
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org, jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
Target Milestone: ---

OS: AlmaLinux release 8.4
Kernel: Linux 4.18.0-425.3.1.el8.x86_64 x86_64

How to reproduce:
1. On an AlmaLinux 8.4 machine:
   $ echo 'void main(void){}' | g++ -fsanitize=thread -xc -
2. On some machines:
   $ ./a.out
   FATAL: ThreadSanitizer: unexpected memory mapping 0x8008a000-0x8008e000
3. On another AlmaLinux 8.4 machine:
   $ ./a.out

A workaround is to select machines with cpu_code "E5-2660v3", but this considerably reduces the number of machines available to us for running regressions.

Is this a known issue, or is there a solution on the GCC / sanitizer side?

Regards,
Leon
[Bug c++/118577] Deleted dtor trying to override non-deleted one with user-defined 'operator delete' is not forbidden
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118577 --- Comment #1 from Andrew Pinski --- Adding: ``` void f() { C a; } ``` GCC now rejects it with a similar message as clang: ``` : In destructor 'virtual C::~C()': :8:8: error: 'static void A::operator delete(void*, int)' is private within this context 8 | struct C : /*mut1*/private B {}; |^ :5:8: note: declared private here 5 | struct B : /*mut1*/private A { |^ :8:8: error: no suitable 'operator delete' for 'C' 8 | struct C : /*mut1*/private B {}; |^ : In function 'void f()': :13:11: note: synthesized method 'virtual C::~C()' first required here 13 | C a; | ^ ``` I am not 100% sure if the synthesized method deconstructor should happen always or not.
[Bug c/118575] -ftrivial-auto-var-init=zero should perhaps imply -fzero-init-padding-bits=all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118575 --- Comment #2 from Sam James --- Actually... for -ftrivial-auto-var-init, we document [0] the behaviour as "With this option, GCC will also initialize any padding of automatic variables that have structure or union types to zeroes.". So, maybe there's nothing to do except perhaps cross-link docs as a nicety. Kees, do you have a testcase where -ftrivial-auto-var-init=zero was previously doing something right, and now isn't (wrt "avoid regressing their uninitialized variable mitigations" on mastodon)? [0] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-ftrivial-auto-var-init
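To make "padding of automatic variables that have structure or union types" concrete, here is a small sketch. It does not exercise either compiler option; it only shows where padding bytes live in a typical struct and uses an explicit memset() as the portable way to guarantee they are zero. The struct and helper names are invented for illustration, and the amount of padding is implementation-defined.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example type: on most ABIs there are padding bytes between
   'tag' and 'value' so that 'value' is suitably aligned.  */
struct sample
{
  char tag;
  int value;
};

/* Number of padding bytes between 'tag' and 'value' on this target.  */
static size_t
interior_padding (void)
{
  return offsetof (struct sample, value) - sizeof (char);
}

/* Returns 1 if every byte of the object, padding included, is zero.  */
static int
all_bytes_zero (const struct sample *s)
{
  const unsigned char *p = (const unsigned char *) s;
  for (size_t i = 0; i < sizeof *s; i++)
    if (p[i] != 0)
      return 0;
  return 1;
}

/* Zeroes an automatic variable explicitly, members and padding alike,
   and reports whether all of its bytes are zero afterwards.  */
static int
zeroed_by_memset (void)
{
  struct sample s;
  memset (&s, 0, sizeof s);
  return all_bytes_zero (&s);
}
```

Member-wise initialization such as `s.tag = 0; s.value = 0;` leaves the padding bytes unspecified, which is the gap the options discussed in this bug are concerned with.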
[Bug testsuite/118547] gcc.c-torture/compile/pr106433.c and others fail on aarch64 with an older binutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118547 --- Comment #2 from Andrew Pinski --- (In reply to Richard Sandiford from comment #1) > I suppose this is just a case of adding aarch64_variant_pcs requirements to > the tests. (The use of .variant_pcs is deliberately not gated on assembler > support, since dropping .variant_pcs would give incorrect object files.) Agreed which is why I put this in the testsuite component rather than target component to signify this is just a testing issue rather than a bug in the backend.
[Bug c++/118577] New: Deleted dtor trying to override non-deleted one with user-defined 'operator delete' is not forbidden
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118577 Bug ID: 118577 Summary: Deleted dtor trying to override non-deleted one with user-defined 'operator delete' is not forbidden Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: accepts-invalid Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: rush102333 at gmail dot com Target Milestone: --- Consider the following code: struct A { void operator delete(void *, int); }; struct B : /*mut1*/private A { virtual ~B(); }; struct C : /*mut1*/private B {}; First, gcc and clang agree that this should be rejected if 'A::operator delete' has the same parameter list as the default one like 'operator delete(void *)': :7:8: error: deleted function 'virtual C::~C()' overriding non-deleted function 7 | struct C : private B {}; |^ :5:11: note: overridden function is 'virtual B::~B()' 5 | virtual ~B(); | ^ :7:8: note: 'virtual C::~C()' is implicitly deleted because the default definition would be ill-formed: 7 | struct C : private B {}; |^ :7:8: error: 'static void A::operator delete(void*)' is private within this context :4:8: note: declared private here 4 | struct B : private A { |^ But gcc will not reject it anymore if the argument list of 'A::operator delete' changes(as shown in the example above): https://godbolt.org/z/vahT4rEzE It is even more confusing that EDG and MSVC seem to always accept that.
[Bug c++/118576] gcc does not realize implicitly deleted default constructor in virtual inheritance with using-decl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118576 --- Comment #2 from Andrew Pinski --- I am not 100% sure but this seems related to https://cplusplus.github.io/CWG/issues/2504.html .
[Bug c++/118049] conflicting global module declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118049 --- Comment #7 from GCC Commits --- The releases/gcc-14 branch has been updated by Nathaniel Shead : https://gcc.gnu.org/g:a82352a2a074230d841a3944e30bd497726e0bfa commit r14-11233-ga82352a2a074230d841a3944e30bd497726e0bfa Author: Nathaniel Shead Date: Fri Jan 17 21:29:08 2025 +1100 c++/modules: Propagate FNDECL_USED_AUTO when propagating deduced return types [PR118049] In the linked testcase, we're erroring because the declared return types of the functions do not appear to match. This is because when merging the deduced return types for 'foo' in 'auto-5_b.C', we overwrote the return type for the declaration with the deduced return type from 'auto-5_a.C' but neglected to track that we were originally declared with 'auto'. As a drive-by improvement to QOI, also add checks for if the deduced return types do not match; this is currently useful because we do not check the equivalence of the bodies of functions yet. PR c++/118049 gcc/cp/ChangeLog: * module.cc (trees_in::is_matching_decl): Propagate FNDECL_USED_AUTO as well. gcc/testsuite/ChangeLog: * g++.dg/modules/auto-5_a.C: New test. * g++.dg/modules/auto-5_b.C: New test. * g++.dg/modules/auto-5_c.C: New test. * g++.dg/modules/auto-6_a.H: New test. * g++.dg/modules/auto-6_b.C: New test. Signed-off-by: Nathaniel Shead (cherry picked from commit f054c36c4fcb693e04411dc691ef4172479143d6)
[Bug c++/118049] conflicting global module declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118049 Nathaniel Shead changed: What|Removed |Added Target Milestone|15.0|14.3 --- Comment #8 from Nathaniel Shead --- Also backported to GCC 14.3.
[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #21 from Richard Biener --- The big regression should now be fixed.
[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 117875, which changed state. Bug 117875 Summary: [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug middle-end/80342] useless outermost conversions not fully elided by genmatch generated code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80342 --- Comment #3 from Richard Biener --- *** Bug 118565 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/118565] match.pd patterns with non-leaf useless conversions result in SSA copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118565 Richard Biener changed: What|Removed |Added Resolution|--- |DUPLICATE Status|ASSIGNED|RESOLVED --- Comment #3 from Richard Biener --- Indeed - I didn't find this somehow. *** This bug has been marked as a duplicate of bug 80342 ***
[Bug middle-end/109832] aarch64: Inefficient code for logical or of two booleans
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109832

Andrew Pinski changed:

What|Removed |Added
Component|target |middle-end
Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org
Status|NEW |ASSIGNED

--- Comment #3 from Andrew Pinski ---
I think I know what is happening here. Another testcase:
```
bool t();
bool f()
{
  bool x = t();
  bool y = t();
  return x | y;
}
```

But if we do this:
```
bool t();
bool f()
{
  bool x = t() ^ t();
  bool y = t() ^ t();
  return x | y;
}
```
it works.

So this comes down to ccmp expansion really. In the case where it is bad the SSA_NAME comes from a non-gimple assign statement but in the good case it comes from an assign statement.

I will solve this for GCC 16 (or assign it to someone to solve).
[Bug middle-end/109832] [12/13/14/15 Regression] aarch64: Inefficient code for logical or of two booleans due to ccmp expansion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109832 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |12.5 Known to work||7.5.0 Known to fail||8.1.0 Summary|aarch64: Inefficient code |[12/13/14/15 Regression] |for logical or of two |aarch64: Inefficient code |booleans due to ccmp|for logical or of two |expansion |booleans due to ccmp ||expansion --- Comment #4 from Andrew Pinski --- This is actually a regression. So maybe I will work on it before GCC 16 is released ...
[Bug fortran/118571] UTF-8 output and the A edit descriptor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118571

--- Comment #6 from Jerry DeLisle ---
(In reply to kargls from comment #3)
> diff --git a/libgfortran/io/write.c b/libgfortran/io/write.c
> index 54312bf67e9..084ac314f5c 100644
> --- a/libgfortran/io/write.c
> +++ b/libgfortran/io/write.c
> @@ -178,7 +178,7 @@ write_utf8_char4 (st_parameter_dt *dtp, gfc_char4_t
> *source,
>      }
>
>    /* Now process the remaining characters, one at a time.  */
> -  for (j = k; j < src_len; j++)
> +  for (j = k; j < (w_len < src_len ? w_len : src_len); j++)
>      {
>        c = source[j];
>        if (c < 0x80)

The patch breaks utf8_1.f03. I have an inkling why and will get back to this in the morning.
[Bug c++/118576] New: gcc does not realize implicitly deleted default constructor in virtual inheritance with using-decl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118576 Bug ID: 118576 Summary: gcc does not realize implicitly deleted default constructor in virtual inheritance with using-decl Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: accepts-invalid Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: rush102333 at gmail dot com Target Milestone: --- Consider the following code: ~ struct A { A(int); }; struct B : virtual A { using A::A; }; struct C1 : virtual B { using B::B; }; struct D1 : virtual C1 { using C1::C1; }; D1 d1(0); ~ Clang rejects this by complaining that the default constructor of 'B' is necessary here, which does not exist: ~ :14:4: error: constructor inherited by 'D1' from base class 'A' is implicitly deleted 14 | D1 d1(0); |^ :7:13: note: constructor inherited by 'D1' is implicitly deleted because base class 'B' has a deleted corresponding constructor 7 | struct C1 : virtual B { using B::B; }; | ^ :5:12: note: default constructor of 'B' is implicitly deleted because base class 'A' has no default constructor 5 | struct B : virtual A { using A::A; }; |^ 1 error generated. ~ EDG and ICC error for similar reasons: ~ "", line 14: error: no default constructor exists for class "A" D1 d1(0); ^ detected during implicit generation of "D1::A(int)" at line 14 1 error detected in the compilation of "". Compiler returned: 2 ~ But gcc seems not: https://godbolt.org/z/5srhhqWeE
[Bug c++/118576] gcc does not realize implicitly deleted default constructor in virtual inheritance with using-decl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118576 --- Comment #1 from Andrew Pinski --- MSVC also accepts it.
[Bug c++/55120] Inaccessible virtual base constructor does not prevent generation of default constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55120 Andrew Pinski changed: What|Removed |Added Status|NEW |SUSPENDED Alias||cwg2246 --- Comment #16 from Andrew Pinski --- So this turns out to be DR 2246 : https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2246 Status: drafting So suspending.
[Bug target/109093] csmith: a February runtime bug ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109093 Sam James changed: What|Removed |Added Attachment #59531|0 |1 is obsolete|| --- Comment #30 from Sam James --- Created attachment 60221 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60221&action=edit pr109093-comment24-reduction.c This testcase still fails on trunk (only) and -fstack-reuse=none doesn't help.
[Bug target/118501] [14/15 regression] aarch64: ICE in simplify_context::simplify_subreg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118501 Richard Sandiford changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #6 from Richard Sandiford --- Testing a patch.
[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875 --- Comment #19 from Richard Biener --- Remains fast_algorithms.c:133:19: optimized: Loop 3 distributed: split to 3 loops and 0 library calls. fast_algorithms.c:133:19: optimized: Loop 5 distributed: split to 2 loops and 0 library calls. fast_algorithms.c:133:19: optimized: loop vectorized using 32 byte vectors fast_algorithms.c:133:19: optimized: loop versioned for vectorization because of possible aliasing fast_algorithms.c:133:19: optimized: loop vectorized using 32 byte vectors fast_algorithms.c:133:19: optimized: loop versioned for vectorization because of possible aliasing -fast_algorithms.c:133:19: optimized: loop vectorized using 32 byte vectors -fast_algorithms.c:133:19: optimized: loop versioned for vectorization because of possible aliasing specifically fast_algorithms.c:133:19: note: using as main loop exit: 58 -> 63 [AUX: (nil)] fast_algorithms.c:133:19: note: LOOP VECTORIZED is no longer vectorized. We have fast_algorithms.c:133:19: note: Build SLP for _ifc__350 = _1750; fast_algorithms.c:133:19: missed: Build SLP failed: operation unsupported _ifc__350 = _1750; which came up before. This is emitted from if-conversion: # DEBUG BEGIN_STMT - if (_1359 < -987654321) -goto ; [50.00%] - else -goto ; [50.00%] - - [local count: 489894279]: - goto ; [100.00%] - - [local count: 550443010]: + _227 = _1359 < -987654321; + # DEBUG BEGIN_STMT + _ifc__347 = _227 ? -987654321 : _1355; + _1750 = MAX_EXPR <_1359, -987654321>; + _ifc__350 = _1750; + *_1347 = _ifc__350; # DEBUG BEGIN_STMT k_1361 = k_1291 + 1; # DEBUG k => k_1361 @@ -2619,14 +2683,77 @@ else goto ; [11.00%] - [local count: 374301246]: - *_1347 = _1359; - goto ; [100.00%] + [local count: 489894279]: specifically from VN run on the body: Value numbering stmt = _ifc__350 = _1287 ? 
_ifc__348 : _ifc__349; Applying pattern match.pd:6569, gimple-match-3.cc:47593 Setting value number of _ifc__350 to _ifc__350 (changed) Applying pattern match.pd:5271, gimple-match-4.cc:7879 Applying pattern match.pd:6365, gimple-match-5.cc:6177 Applying pattern match.pd:6569, gimple-match-3.cc:47593 gimple_simplified to _1750 = MAX_EXPR <_1359, -987654321>; _ifc__350 = _1750; Making available beyond BB58 _ifc__350 for value _ifc__350 where (cond (cmp:c (nop_convert1?@c0 @0) (nop_convert2?@c1 @1)) (convert3? @0) (convert4? @1)) is simplified as (convert (max @c0 @c1) and the conversion turns out unnecessary. There's a longer standing issue with match producing this, we generate { res_op->set_op (NOP_EXPR, type, 1); { tree _o1[2], _r1; _o1[0] = captures[0]; _o1[1] = captures[2]; gimple_match_op tem_op (res_op->cond.any_else (), MAX_EXPR, TREE_TYPE (_o1[0]), _o1[0], _o1[1]); tem_op.resimplify (lseq, valueize); _r1 = maybe_push_res_to_seq (&tem_op, lseq); if (!_r1) goto next_after_fail1387; res_op->ops[0] = _r1; } res_op->resimplify (lseq, valueize); so we push the inner expression without considering the outer conversion. The other thing is that VN elimination folds stmts we substitute into but even in non-iterating mode we do not value-number all resulting stmts, we merely assign value-numbers to resulting defs. So those copies prevail. I've thought we should fix it elsewhere than allowing SSA copies in SLP build, but this shows it will be the easiest fix given non-SLP handles this as vector operation just fine. So I'll go for this, filing separate issues for the two above.
[Bug tree-optimization/118565] New: match.pd patterns with non-leaf useless conversions result in SSA copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118565 Bug ID: 118565 Summary: match.pd patterns with non-leaf useless conversions result in SSA copies Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: rguenth at gcc dot gnu.org Target Milestone: --- In PR117875 we can see that patterns like /* A >= B ? A : B -> max (A, B) and friends. The code is still in fold_cond_expr_with_comparison for GENERIC folding with some extra constraints. */ (for cmp (eq ne le lt unle unlt ge gt unge ungt uneq ltgt) (simplify (cond (cmp:c (nop_convert1?@c0 @0) (nop_convert2?@c1 @1)) (convert3? @0) (convert4? @1)) ... (if (!HONOR_NANS (type)) (if (VECTOR_TYPE_P (type)) (view_convert (max @c0 @c1)) (convert (max @c0 @c1) can end up doing the following via VN done after if-conversion before vectorization: Value numbering stmt = _ifc__350 = _1287 ? _ifc__348 : _ifc__349; Applying pattern match.pd:6569, gimple-match-3.cc:47593 Setting value number of _ifc__350 to _ifc__350 (changed) Applying pattern match.pd:5271, gimple-match-4.cc:7879 Applying pattern match.pd:6365, gimple-match-5.cc:6177 Applying pattern match.pd:6569, gimple-match-3.cc:47593 gimple_simplified to _1750 = MAX_EXPR <_1359, -987654321>; _ifc__350 = _1750; Making available beyond BB58 _ifc__350 for value _ifc__350 where the SSA copy is redundant caused by pushing (max ..) to 'seq' before simplifying the (convert ..). Also VN elimination does fold substituted into stmts with allowing multiple replacement stmts but it does not (when not iterating) value-number those which would have eliminated the copy. It's best to not create such copies, thus for a fix in genmatch.
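As a rough scalar illustration of the pattern and of the leftover copy, the sketch below mirrors the GIMPLE from the dump in plain C. It is not GCC code: `max_int` is a hypothetical stand-in for MAX_EXPR, and the function names are invented. The point is only that both forms compute the same value and that the second assignment in the simplified form carries no information.

```c
/* The bound used in the dump quoted above.  */
#define FLOOR_VALUE (-987654321)

/* Source form: a comparison selecting between the bound and the value,
   i.e. the shape the match.pd pattern recognizes.  */
static int
clamp_with_cond (int x)
{
  return x < FLOOR_VALUE ? FLOOR_VALUE : x;
}

/* Hypothetical stand-in for MAX_EXPR.  */
static int
max_int (int a, int b)
{
  return a > b ? a : b;
}

/* Simplified form as it appears after value numbering: the max is pushed
   as its own statement and the result is then copied.  */
static int
clamp_with_max_and_copy (int x)
{
  int tmp = max_int (x, FLOOR_VALUE); /* _1750 = MAX_EXPR <_1359, -987654321>; */
  int res = tmp;                      /* _ifc__350 = _1750;  (redundant copy) */
  return res;
}
```

In C the copy is harmless, but in the report above it is the plain SSA copy statement that SLP build fails on.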
[Bug tree-optimization/118565] match.pd patterns with non-leaf useless conversions result in SSA copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118565 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2025-01-20 Keywords||missed-optimization Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Richard Biener --- Mine.
[Bug go/118286] go crypto/tls test fails because of expired certificate? (TestVerifyConnection, bad certificate)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118286 Sam James changed: What|Removed |Added Last reconfirmed||2025-01-20 Status|RESOLVED|REOPENED Ever confirmed|0 |1 Assignee|ian at airs dot com|sjames at gcc dot gnu.org Resolution|FIXED |--- --- Comment #5 from Sam James --- Thanks. I'll get to it, but I can't promise to do so immediately.
[Bug target/115921] Missed optimization: and->ashift might be cheaper than ashift->and on typical RISC targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115921 --- Comment #10 from GCC Commits --- The master branch has been updated by Xi Ruoyao : https://gcc.gnu.org/g:10e98638998745ebc3888a20e661a8364e88ea3a commit r15-7062-g10e98638998745ebc3888a20e661a8364e88ea3a Author: Xi Ruoyao Date: Tue Jan 14 17:26:04 2025 +0800 LoongArch: Improve reassociation for bitwise operation and left shift [PR 115921] For things like (x | 0x101) << 11 It's obvious to write: ori $r4,$r4,257 slli.d $r4,$r4,11 But we are actually generating something insane: lu12i.w $r12,524288>>12 # 0x8 ori $r12,$r12,2048 slli.d $r4,$r4,11 or $r4,$r4,$r12 jr $r1 It's because the target-independent canonicalization was written before we have all the RISC targets where loading an immediate may need multiple instructions. So for these targets we need to handle this in the target code. We do the reassociation on our own (i.e. reverting the target-independent reassociation) if "(reg [&|^] mask) << shamt" does not need to load mask into an register, and either: - (mask << shamt) needs to be loaded into an register, or - shamt is a const_immalsl_operand, so the outer shift may be further combined with an add. gcc/ChangeLog: PR target/115921 * config/loongarch/loongarch-protos.h (loongarch_reassoc_shift_bitwise): New function prototype. * config/loongarch/loongarch.cc (loongarch_reassoc_shift_bitwise): Implement. * config/loongarch/loongarch.md (*alslsi3_extend_subreg): New define_insn_and_split. (_shift_reverse): New define_insn_and_split. (_alsl_reversesi_extended): New define_insn_and_split. (zero_extend_ashift): Remove as it's just a special case of and_shift_reversedi, and it does not make too much sense to write "alsl.d rd,rs,r0,shamt" instead of "slli.d rd,rs,shamt". (bstrpick_alsl_paired): Remove as it is already done by splitting and_shift_reversedi into and + ashift first, then late combining the ashift and a further add. 
gcc/testsuite/ChangeLog: PR target/115921 * gcc.target/loongarch/bstrpick_alsl_paired.c (scan-rtl-dump): Scan for and_shift_reversedi instead of the removed bstrpick_alsl_paired. * gcc.target/loongarch/bitwise-shift-reassoc.c: New test.
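The two instruction sequences in the commit message correspond to two algebraically equal ways of writing the expression. The sketch below, with invented function names, checks that equivalence in plain C; it says nothing about which form a given compiler version emits.

```c
#include <stdint.h>

/* "Bitwise first, shift last": the mask 0x101 (257) fits in an ori
   immediate, matching the two-instruction sequence from the commit
   message.  */
static uint64_t
or_then_shift (uint64_t x)
{
  return (x | UINT64_C (0x101)) << 11;
}

/* "Shift first, bitwise last": the target-independent canonical form.
   The shifted mask is 0x101 << 11 = 0x80800, which is the constant the
   lu12i.w/ori pair builds (524288 | 2048) before the final or.  */
static uint64_t
shift_then_or (uint64_t x)
{
  return (x << 11) | (UINT64_C (0x101) << 11);
}
```

Because shifting distributes over bitwise or, both functions return the same value for every input; the difference is only in how cheaply the constant can be materialized.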
[Bug libstdc++/118563] [15 regression] libstdc++ incompatible ABI change on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118563 Richard Biener changed: What|Removed |Added Target Milestone|--- |15.0 Priority|P3 |P1
[Bug libstdc++/118563] New: [15 regression] libstdc++ incompatible ABI change on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118563 Bug ID: 118563 Summary: [15 regression] libstdc++ incompatible ABI change on riscv64 Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: ABI Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: sch...@linux-m68k.org Target Milestone: --- 3 incompatible symbols 0 _ZTIPKDF16b typeinfo for std::bfloat16_t const* version status: incompatible CXXABI_1.3.14 type: object type size: 32 status: added 1 _ZTIPDF16b typeinfo for std::bfloat16_t* version status: incompatible CXXABI_1.3.14 type: object type size: 32 status: added 2 _ZTIDF16b typeinfo for std::bfloat16_t version status: incompatible CXXABI_1.3.14 type: object type size: 16 status: added
[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875 --- Comment #18 from Richard Biener --- (In reply to Richard Biener from comment #17) > -sre_math.c:174:17: optimized: loop vectorized using 16 byte vectors > > -sre_math.c:192:17: optimized: loop vectorized using 16 byte vectors Those two are identical, float ** FMX2Alloc(int rows, int cols) { float **mx; int r; mx= (float **) __builtin_malloc (sizeof(float *) * rows); mx[0] = (float *) __builtin_malloc (sizeof(float) * rows * cols); for (r = 1; r < rows; r++) mx[r] = mx[0] + r*cols; return mx; } where the "failure" is a missed epilogue vectorization due to cost (reproducible with Zen2 and Zen4 tuning, not with generic), where SLP costs t.c:9:17: note: Cost model analysis: Vector inside of loop cost: 136 Vector prologue cost: 86 Vector epilogue cost: 128 Scalar iteration cost: 56 Scalar outside cost: 32 Vector outside cost: 214 prologue iterations: 0 epilogue iterations: 2 Calculated minimum iters for profitability: 6 and classical loop vect t.c:9:17: note: Cost model analysis: Vector inside of loop cost: 136 Vector prologue cost: 68 Vector epilogue cost: 128 Scalar iteration cost: 56 Scalar outside cost: 32 Vector outside cost: 196 prologue iterations: 0 epilogue iterations: 2 Calculated minimum iters for profitability: 5 where the difference is in cols_21(D) * r_42 1 times vector_stmt costs 12 in body node 0x25bf6f00 1 times scalar_to_vec costs 10 in prologue _8 w* 4 1 times vector_stmt costs 40 in prologue 1 times vector_load costs 12 in prologue vs. cols_21(D) * r_42 1 times scalar_to_vec costs 4 in prologue cols_21(D) * r_42 1 times vector_stmt costs 12 in body _8 w* 4 1 times vector_stmt costs 40 in prologue we seem to forget to cost the constant 4 load cost in non-SLP and we run into target specific costing of scalar_to_vec applying a GPR->XMM move penalty which we only do for SLP. So, SLP looks fine here. This looks like a not important vectorization. 
I verified that with Zen2 and epilogue vectorization disabled the regression triggered by --param vect-force-slp=1 remains.
[Bug target/118348] [15 Regression] [SVE] HACCKernels seems to miscompile with VLS SVE after 0c5c0c959c2e592b84739f19ca771fa69eb8dfee
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118348 Richard Sandiford changed: What|Removed |Added Status|NEW |ASSIGNED CC||rsandifo at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org --- Comment #6 from Richard Sandiford --- Mine (after discussing with Tamar).
[Bug target/117270] [15 Regression] 9% exec time slowdown of 538.imagick_r on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117270 Richard Sandiford changed: What|Removed |Added Assignee|tnfchris at gcc dot gnu.org|rsandifo at gcc dot gnu.org --- Comment #3 from Richard Sandiford --- Taking after discussing with Tamar.
[Bug tree-optimization/118552] [15 regression] ICE on valid code at -O3 with "-fno-tree-ch -fno-tree-ccp -fno-tree-fre" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118552 --- Comment #3 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:1265afa91d51606605f85e732344e86e4e4dae9b commit r15-7059-g1265afa91d51606605f85e732344e86e4e4dae9b Author: Richard Biener Date: Mon Jan 20 11:50:53 2025 +0100 tree-optimization/118552 - failed LC SSA update after unrolling When unrolling changes nesting relationship of loops we fail to mark blocks as in need to change for LC SSA update. Specifically the LC SSA PHI on a former inner loop exit might be misplaced if that loop becomes a sibling of its outer loop. PR tree-optimization/118552 * cfgloopmanip.cc (fix_loop_placement): Properly mark exit source blocks as to be scanned for LC SSA update when the loops nesting relationship changed. (fix_loop_placements): Adjust. (fix_bb_placements): Likewise. * gcc.dg/torture/pr118552.c: New testcase.
[Bug middle-end/118549] -funreachable-traps doesn't transform user provided unreachable into trap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118549 Sam James changed: What|Removed |Added Keywords||documentation --- Comment #8 from Sam James --- That's fine, we should just make it clear in the docs then.
[Bug tree-optimization/118552] [15 regression] ICE on valid code at -O3 with "-fno-tree-ch -fno-tree-ccp -fno-tree-fre" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118552 Richard Biener changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org --- Comment #2 from Richard Biener --- So the issue is that we only update LC-SSA for uses in specific blocks determined by unloop_loops and siblings. This does not consider the case where eliding a very innermost loop changes the nesting relation of its outer and outer outer loop, making them siblings, and thus turning a formerly valid LC-SSA PHI of the outer outer loop into an LC-SSA PHI of the outer loop where the def is in the former outer outer loop, so the LC-SSA PHI is now misplaced (on the "wrong" exit). In particular we are removing two exits of the outer loop in unloop () (those from the original innermost loop) and the second places the outer loop now on an exit edge of the outer outer loop.
[Bug tree-optimization/118544] -fopt-info misreports unroll factor when using #pragma GCC unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118544 --- Comment #7 from Tibor Győri --- (In reply to Richard Biener from comment #5) > I suppose cunroll should report the loop was fully peeled. > > Note the unroll amount might be confusing when for example loop header copying > causes the number of latch executions to decrease by one before we get to > unroll. Yes, I think adding reports to some currently silent passes would help a lot.
[Bug c/118564] New: DSP instruction support missing for ARM Cortex M4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118564 Bug ID: 118564 Summary: DSP instruction support missing for ARM Cortex M4 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: anstein99 at googlemail dot com Target Milestone: --- ARM Cortex M4 does support DSP instructions. An example chip with a Cortex M4 and DSP instructions is STM32F411CEU6. The Cortex M4 DSP instructions are listed at: https://developer.arm.com/documentation/100166/0001/Programmers-Model/Instruction-set-summary/Table-of-processor-DSP-instructions As of arm-none-eabi-gcc version 14.2.0 gcc does not support "+dsp" for "-mcpu=cortex-m4" which it should as these instructions are available.
[Bug fortran/118321] [OpenMP] declare_variant's 'adjust_args' yields wrong code if the result is passed by argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118321 --- Comment #7 from Tobias Burnus --- (In reply to Thomas Schwinge from comment #6) > Curious, for C we don't need any such changes, to handle nested functions, > for example? (I don't remember how these are implemented exactly.) For internal functions – C and Fortran – there is a frame pointer argument, which does not show up at gimplification. Likewise, some targets have odd ABI rules where arguments are processed differently after a certain number or for certain data types. However, those play no role until after gimplification and only get added later on. For internal functions, they get added via the "nested" pass, which runs after "gimple"; still, at 'tree' level, no argument is used for those - and in a tree dump, the frame pointer shows up as "[static-chain: &FRAME.0]".
[Bug tree-optimization/118273] [15 Regression] ICE when vectorizing uniform vector function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118273 --- Comment #2 from Tamar Christina --- It seems that the nmasks is wrong here: unsigned nmasks = exact_div (ncopies * bestn->simdclone->simdlen, TYPE_VECTOR_SUBPARTS (vectype)).to_constant (); ncopies is correct and vectype is correct, but the value of bestn->simdclone->simdlen is suspect.
[Bug fortran/118321] [OpenMP] declare_variant's 'adjust_args' yields wrong code if the result is passed by argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118321 Thomas Schwinge changed: What|Removed |Added CC||tschwinge at gcc dot gnu.org --- Comment #6 from Thomas Schwinge --- Curious, for C we don't need any such changes, to handle nested functions, for example? (I don't remember how these are implemented exactly.)
[Bug tree-optimization/114948] [15 Regression] ICE on valid code at -O3 with "-fno-tree-ccp -fno-tree-ch" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114948 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #9 from Richard Biener --- I fixed PR118552, but I'm not sure if this bug is fixed as well, as I was never able to reproduce it. Note there's the related PR116796 fix, but that was done after this issue went latent again. So I'm going to close as fixed, hoping for a new testcase if an unfixed part remains.
[Bug tree-optimization/118552] [15 regression] ICE on valid code at -O3 with "-fno-tree-ch -fno-tree-ccp -fno-tree-fre" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118552 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #4 from Richard Biener --- Fixed on trunk (it's of course latent).
[Bug c++/118566] New: 'requires' avoids out-class implemention to find inside-class declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118566 Bug ID: 118566 Summary: 'requires' avoids out-class implemention to find inside-class declaration Product: gcc Version: 14.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: shyeyian at petalmail dot com Target Milestone: --- Compilation fails on g++14.2. It is ok on clang++. #include struct test { // Declaration test(std::from_range_t, std::ranges::input_range auto&& r) requires std::convertible_to, int>; }; // Implemention test::test(std::from_range_t, std::ranges::input_range auto&& r) requires std::convertible_to, int> { } // Compilation failed. // Oops! The outside-class implemention cannot find the inside-class declaration! /* GCC Version: 14.2.0(MacOS 15.1.1, Macbook with Apple M2, gcc is installed from brew), 14.2.0(Windows11, Matebook with Intel(x86_64), gcc is installed from MSYS) Compile command: g++ -std=c++23 main.cpp -o main.o Compile output: main.cpp:11:1: error: no declaration matches 'test::test(std::from_range_t, auto:49&&) requires convertible_to)()))>::type, std::indirectly_readable_traits)()))>::type> >::__iter_traits)()))>::type, std::indirectly_readable_traits)()))>::type> >::value_type, int>' 11 | test::test(std::from_range_t, std::ranges::input_range auto&& r) | ^~~~ main.cpp:3:8: note: candidates are: 'constexpr test::test(test&&)' 3 | struct test |^~~~ main.cpp:3:8: note: 'constexpr test::test(const test&)' main.cpp:6:5: note: 'template requires input_range test::test(std::from_range_t, auto:48&&) requires convertible_to)()))>::type, std::indirectly_readable_traits)()))>::type> >::__iter_traits)()))>::type, std::indirectly_readable_traits)()))>::type> >::value_type, int>' 6 | test(std::from_range_t, std::ranges::input_range auto&& r) | ^~~~ main.cpp:3:8: note: 'struct test' defined here 3 | struct test |^~~~ Expected behavior: command="clang++ -std=c++23 main.cpp -o main.o", compilation is ok. */
[Bug c++/118255] [12 Regression] Unnecessary error on variable shadowing for friend declaration inside template class with non-type parameter since r9-1493-g8945521a50a7dd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118255 --- Comment #8 from GCC Commits --- The releases/gcc-12 branch has been updated by Simon Martin : https://gcc.gnu.org/g:7bb462dd2a6a6551d142e7ad983fa2afd1df9253 commit r12-10919-g7bb462dd2a6a6551d142e7ad983fa2afd1df9253 Author: Simon Martin Date: Sun Jan 5 10:36:47 2025 +0100 c++: Friend classes don't shadow enclosing template class paramater [PR118255] We currently reject the following code === code here === template struct S { friend class non_template; }; class non_template {}; S<0> s; === code here === While EDG agrees with the current behaviour, clang and MSVC don't (see https://godbolt.org/z/69TGaabhd), and I believe that this code is valid, since the friend clause does not actually declare a type, so it cannot shadow anything. The fact that we didn't error out if the non_template class was declared before S backs this up as well. This patch fixes this by skipping the call to check_template_shadow for hidden bindings. PR c++/118255 gcc/cp/ChangeLog: * name-lookup.cc (pushdecl): Don't call check_template_shadow for hidden bindings. gcc/testsuite/ChangeLog: * g++.dg/lookup/pr99116-1.C: Adjust test expectation. * g++.dg/template/friend84.C: New test. (cherry picked from commit b5a069203fc074ab75d994c4a7e0f2db6a0a00fd)
[Bug testsuite/118547] gcc.c-torture/compile/pr106433.c and others fail on aarch64 with an older binutils
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118547 Richard Sandiford changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #1 from Richard Sandiford --- I suppose this is just a case of adding aarch64_variant_pcs requirements to the tests. (The use of .variant_pcs is deliberately not gated on assembler support, since dropping .variant_pcs would give incorrect object files.)
[Bug c++/118509] [14 regression] Front-end produced uninitialized memory reference when compiling Nektar since r15-4595-gb25d3201b6338d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118509 Jakub Jelinek changed: What|Removed |Added Summary|[14/15 regression] |[14 regression] Front-end |Front-end produced |produced uninitialized |uninitialized memory|memory reference when |reference when compiling|compiling Nektar since |Nektar since|r15-4595-gb25d3201b6338d |r15-4595-gb25d3201b6338d| --- Comment #14 from Jakub Jelinek --- Fixed on the trunk so far. For 14.2.1, perhaps we're going to temporarily revert the r14-10836 PR117259 and r14-10666 PR116449 changes until this change is sufficiently verified on the trunk.
[Bug tree-optimization/118224] [15 Regression] Incorrect optimization with calloc when the size exceeds SIZE_MAX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118224 --- Comment #8 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:d882e48d48bf300941c3610c5af157c64ccf0a84 commit r15-7056-gd882e48d48bf300941c3610c5af157c64ccf0a84 Author: Jakub Jelinek Date: Mon Jan 20 10:24:18 2025 +0100 tree-ssa-dce: Fix calloc handling [PR118224] As reported by Dimitar, this should have been a multiplication, but wasn't caught because in the test (~(__SIZE_TYPE__) 0) / 2 is the largest accepted size and so adding 3 to it also resulted in "overflow". The following patch adds one subtest to really verify it is a multiplication and fixes the operation. 2025-01-20 Jakub Jelinek PR tree-optimization/118224 * tree-ssa-dce.cc (is_removable_allocation_p): Multiply a1 by a2 instead of adding it. * gcc.dg/pr118224.c: New test.
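The difference between adding and multiplying the two calloc arguments can be shown with a minimal C sketch. This is not the code in tree-ssa-dce.cc: the helper names `size_ok_by_product` and `size_ok_by_sum` are hypothetical, and using `SIZE_MAX / 2` as the largest accepted size is an assumption taken from the wording of the commit message. `__builtin_mul_overflow` and `__builtin_add_overflow` are GCC/Clang builtins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit for illustration: the largest accepted allocation size. */
#define MAX_ALLOC_SIZE (SIZE_MAX / 2)

/* Intended check: calloc (a1, a2) requests a1 * a2 bytes. */
static bool size_ok_by_product(size_t a1, size_t a2)
{
    size_t total;
    if (__builtin_mul_overflow(a1, a2, &total))
        return false;               /* the product wrapped around */
    return total <= MAX_ALLOC_SIZE;
}

/* Buggy variant: adds the operands instead of multiplying them. */
static bool size_ok_by_sum(size_t a1, size_t a2)
{
    size_t total;
    if (__builtin_add_overflow(a1, a2, &total))
        return false;               /* the sum wrapped around */
    return total <= MAX_ALLOC_SIZE;
}
```

With `a1 = SIZE_MAX / 2` and `a2 = 3`, both variants reject the request, so such a test cannot tell them apart. With `a1 = SIZE_MAX / 4` and `a2 = 3`, only the product-based check rejects it, which is the kind of subtest needed to verify that a multiplication is really performed.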
[Bug c++/118509] [14/15 regression] Front-end produced uninitialized memory reference when compiling Nektar since r15-4595-gb25d3201b6338d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118509 --- Comment #13 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:d9d0eeea93d39d304c7420e87f4b903d89f2e9fa commit r15-7057-gd9d0eeea93d39d304c7420e87f4b903d89f2e9fa Author: Jakub Jelinek Date: Mon Jan 20 10:26:49 2025 +0100 tree, c++: Consider TARGET_EXPR invariant like SAVE_EXPR [PR118509] My October PR117259 fix to get_member_function_from_ptrfunc to use a TARGET_EXPR rather than SAVE_EXPR unfortunately caused some regressions as well as the following testcase shows. What happens is that get_member_function_from_ptrfunc -> build_base_path calls save_expr, so since the PR117259 change in many cases it will call save_expr on a TARGET_EXPR. And, for some strange reason a TARGET_EXPR is not considered an invariant, so we get a SAVE_EXPR wrapped around the TARGET_EXPR. That SAVE_EXPR > gets initially added only to the second operand of ?:, so at that point it would still work fine during expansion. But unfortunately an expression with that subexpression is handed to the caller also through *instance_ptrptr = instance_ptr; and gets evaluated once again when computing the first argument to the method. So, essentially, we end up with (TARGET_EXPR , (... ? ... SAVE_EXPR ... : ...)) (... SAVE_EXPR ..., ...); and while D.2907 is initialized during gimplification in the code dominating everything that uses it, the extra temporary created for the SAVE_EXPR is initialized only conditionally (if the ?: condition is true) but then used unconditionally, so we get pmf-4.C: In function ‘void foo(C, B*)’: pmf-4.C:12:11: warning: ‘’ may be used uninitialized [-Wmaybe-uninitialized] 12 | (y->*x) (); | ^~ pmf-4.C:12:11: note: ‘’ was declared here 12 | (y->*x) (); | ^~ diagnostic and wrong-code issue too. The following patch fixes it by considering a TARGET_EXPR invariant for SAVE_EXPR purposes the same as SAVE_EXPR is. Really creating another temporary for it is just a waste of the IL. 
Unfortunately I had to tweak the omp matching code to be able to accept TARGET_EXPR the same as SAVE_EXPR. 2025-01-20 Jakub Jelinek PR c++/118509 gcc/ * tree.cc (tree_invariant_p_1): Return true for TARGET_EXPR too. gcc/c-family/ * c-omp.cc (c_finish_omp_for): Handle TARGET_EXPR in first operand of COMPOUND_EXPR incr the same as SAVE_EXPR. gcc/testsuite/ * g++.dg/expr/pmf-4.C: New test.
[Bug tree-optimization/118544] -fopt-info misreports unroll factor when using #pragma GCC unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118544 Richard Sandiford changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #6 from Richard Sandiford --- Obviously very minor, but "unrolled by a factor of N" would also avoid the singular/plural issue with "unrolled N times".
[Bug ipa/118535] [15 regression] wrong code at -O{2,3} on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118535 Richard Biener changed: What|Removed |Added Keywords||needs-bisection Priority|P3 |P1
[Bug tree-optimization/118544] -fopt-info misreports unroll factor when using #pragma GCC unroll
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118544 --- Comment #5 from Richard Biener --- "loop unrolled 2 times" would be wrong, "loop unrolled using an unroll factor of two" might be OK. I suppose cunroll should report the loop was fully peeled. Note the unroll amount might be confusing when, for example, loop header copying causes the number of latch executions to decrease by one before we get to unroll. So - the current message is correct. Maybe there can be an improvement in reporting, but a change by itself might be confusing to users.
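For readers who have not used the pragma under discussion, here is a minimal C sketch of a loop annotated with `#pragma GCC unroll`. The function name `sum_array` is hypothetical and the loop is not the testcase from the report. Compiling it with optimization and `-fopt-info` is one way to see what the unrolling passes report; the exact wording of that report is what this bug is about, so none is shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function; the pragma requests an unroll factor of 2
   for the loop that immediately follows it. */
static long sum_array(const int *values, size_t count)
{
    assert(values != NULL || count == 0);

    long total = 0;
#pragma GCC unroll 2
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

The pragma changes only how the loop is compiled, not its result: `sum_array` returns the same sum with or without it.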
[Bug middle-end/118549] -funreachable-traps doesn't transform user provided unreachable into trap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118549 Richard Biener changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #6 from Richard Biener --- Not sure it works as designed (only affecting compiler-generated unreachable())?
[Bug tree-optimization/114948] [15 Regression] ICE on valid code at -O3 with "-fno-tree-ccp -fno-tree-ch" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114948 Richard Biener changed: What|Removed |Added Target Milestone|14.3|15.0
[Bug c++/65608] [meta-bug] friend issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65608 Bug 65608 depends on bug 118255, which changed state. Bug 118255 Summary: [12 Regression] Unnecessary error on variable shadowing for friend declaration inside template class with non-type parameter since r9-1493-g8945521a50a7dd https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118255 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug c++/118255] [12 Regression] Unnecessary error on variable shadowing for friend declaration inside template class with non-type parameter since r9-1493-g8945521a50a7dd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118255 Simon Martin changed: What|Removed |Added Known to work||12.4.1, 13.3.1, 14.2.1 Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #9 from Simon Martin --- Fixed on all active branches.
[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875 --- Comment #20 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:7b64f757a8df8efd989000baa667279f8957442e commit r15-7063-g7b64f757a8df8efd989000baa667279f8957442e Author: Richard Biener Date: Mon Jan 20 14:25:31 2025 +0100 tree-optimization/117875 - missed SLP vectorization There's a discrepancy in SLP vs non-SLP vectorization that SLP build does not handle plain SSA copies (which should have been elimiated earlier). But this now bites back since non-SLP happily handles them, causing a regression with --param vect-force-slp=1 which is now default, resulting in a big performance regression in 456.hmmer. So the following restores parity between SLP and non-SLP here, defering the missed copy elimination to later (PR118565). PR tree-optimization/117875 * tree-vect-slp.cc (vect_build_slp_tree_1): Handle SSA copies.
[Bug rtl-optimization/118067] [15 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1860 (unable to find a register to spill) {*lshrhi3_1} with -O -fno-split-wide-types -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067 --- Comment #16 from GCC Commits --- The releases/gcc-13 branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:1fe03d184723ee942c74b5e6f8cde45d2fcdcd60 commit r13-9335-g1fe03d184723ee942c74b5e6f8cde45d2fcdcd60 Author: Uros Bizjak Date: Mon Jan 20 16:12:26 2025 +0100 i386: Disable SImode/DImode moves from/to mask regs without avx512bw [PR118067] SImode and DImode moves from/to mask registers are valid only with AVX512BW, so mark relevant alternatives in *movsi_internal and *movdi_internal as such. PR target/118067 gcc/ChangeLog: * config/i386/i386.md (*movdi_internal): Disable alternatives from/to mask registers without AVX512BW. (*movsi_internal): Ditto.
[Bug ipa/118535] [15 regression] wrong code at -O{2,3} on x86_64-linux-gnu since r15-6294
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118535 Jakub Jelinek changed: What|Removed |Added Keywords|needs-bisection | CC||jakub at gcc dot gnu.org Summary|[15 regression] wrong code |[15 regression] wrong code |at -O{2,3} on |at -O{2,3} on |x86_64-linux-gnu|x86_64-linux-gnu since ||r15-6294 --- Comment #3 from Jakub Jelinek --- Started with r15-6294-g96fb71883d438bdb241fdf9c7d12f945c5ba0c7f
[Bug tree-optimization/118570] -O2 much faster than -O3 for Romberg's method
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118570 Sam James changed: What|Removed |Added See Also||https://github.com/llvm/llv ||m-project/issues/123649 --- Comment #1 from Sam James --- I've filed https://github.com/llvm/llvm-project/issues/123649 for Clang too which seems to always have the "-O3" performance (slower).
[Bug target/116447] g++.dg/cpp23/ext-floating13.C fails on Cortex-M55 due to undefined reference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116447 --- Comment #8 from Christophe Lyon --- Indeed, the problem is that +mve does not enable the floating-point extension, as opposed to +mve.fp, which means that arm_fp16_inst is false (in arm.cc / arm_option_reconfigure_globals) so arm_fp16_format == 0 and the __fp16 type is not defined in arm_init_fp16_builtins. Related to PR 117814 and the incorrect warning emitted for pr112337.c
[Bug target/118560] [15 regression] ICE when building powerpc-unknown-linux-gnu cross-compiler since r15-7008
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118560 Sam James changed: What|Removed |Added Status|UNCONFIRMED |NEW Keywords||compile-time-hog Ever confirmed|0 |1 Last reconfirmed||2025-01-20
[Bug tree-optimization/118570] -O2 much faster than -O3 for Romberg's method
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118570 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Comment #2 from Alexander Monakov --- -O2 trivializes the benchmark by discovering that 'romberg' is const without inlining it, then moving it out of the loop; -O3 inlines it, and I think uses of stack arrays make it difficult for passes like loop invariant motion to optimize it. I wouldn't expect this to be a big issue for real-world code: either the function wouldn't be called with the same arguments in the loop, or if it was a "proper" benchmark there would be compiler barriers in place to prevent the compiler from discovering the redundancy.
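The "compiler barriers" mentioned above can be sketched in C as follows. This is an illustration under assumptions, not the benchmark from the report: `sum_of_squares` stands in for 'romberg', `run_benchmark` is a hypothetical harness, and the empty `__asm__` statement is a GCC/Clang extension.

```c
/* Stand-in for the function being benchmarked (hypothetical). */
static double sum_of_squares(int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += (double)i * i;
    return total;
}

/* Calls the function `iterations` times with the same argument. */
static double run_benchmark(int iterations, int n)
{
    double total = 0.0;
    for (int it = 0; it < iterations; it++) {
        int arg = n;
        /* Empty asm that claims to read and modify `arg`: the compiler must
           treat the value as unknown on every iteration, so the call cannot
           be treated as loop-invariant and hoisted out of the loop. */
        __asm__ volatile("" : "+r"(arg));
        total += sum_of_squares(arg);
    }
    return total;
}
```

The barrier emits no instructions of its own; it only hides from the optimizer the fact that every iteration passes the same value.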
[Bug go/117702] [15 Regression] Decide about libgo soname bump for GCC 15
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117702 Ian Lance Taylor changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #2 from Ian Lance Taylor --- Yes, I don't think we have to do anything this round. Thanks.
[Bug tree-optimization/117424] [12/13/14/15 regression] Miscompile with different optimization flags since r12-4871-g502ffb1f389011
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117424 --- Comment #10 from Andrew Macleod --- It also works if I use --disable-tree-cunroll which makes me suspicious of cunroll. It also works if I use --fdisable-tree-evrp --fdisable-tree-vrp1 So I thought perhaps VRP is doing something to mess up cunroll. However, when I look at the difference coming out of the pass before cunroll with the 2 sets of options, the only code differences are negligible. And yet cunroll makes some very different decisions. In theory that leaves global values which are exported by EVRP/VRP. Given that turning off VRP1 by itself still causes the failure, that would imply either a global exported by EVRP, and/or the utilization of SCEV by a VRP pass impacting a global perhaps. I then tried it with simply: --param=vrp-block-limit=1 which invokes fast VRP, and does not then utilize SCEV/loop analysis. This also passes on its own, while setting a number of the same globals (albeit with some different values due to the lack of loop info). I don't see a VRP pass doing anything obviously wrong. Someone who understands the cunroll pass can maybe discover why it's making such different decisions (which apparently cause the fault) when disabling evrp and vrp1 (or using fast vrp).
[Bug fortran/118571] UTF-8 output and the A edit descriptor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118571 --- Comment #1 from kargls at comcast dot net --- Trimming the testcase to clean up the -fdump-tree-original output.

   program test
      use iso_fortran_env
      implicit none
      integer, parameter :: ucs4 = selected_char_kind('ISO_10646')
      character(kind=ucs4, len=1), parameter :: alpha = char(int(z'03B1'), ucs4)
      character(kind=ucs4, len=1), parameter :: beta = char(int(z'03B2'), ucs4)
      integer fd
      character(kind=ucs4,len=:), allocatable :: str
      fd = output_unit
      open (fd, encoding='UTF-8')
      str = alpha // beta // alpha // beta
      write(fd,'(A1)') str
   end program

% gfcx -o z -fdump-tree-original jj.f90 && ./z
αβαβ

{
  struct __st_parameter_dt dt_parm.3;
  dt_parm.3.common.filename = &"jj.f90"[1]{lb: 1 sz: 1};
  dt_parm.3.common.line = 12;
  dt_parm.3.format = &"(A1)"[1]{lb: 1 sz: 1};
  dt_parm.3.format_len = 4;
  dt_parm.3.common.flags = 4096;
  dt_parm.3.common.unit = fd;
  _gfortran_st_write (&dt_parm.3);
  _gfortran_transfer_character_wide_write (&dt_parm.3, str, .str, 4);
  _gfortran_st_write_done (&dt_parm.3);
}
[Bug analyzer/118498] not diagnostic a leak with analyzer and malloc attribute with free filled in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118498 David Malcolm changed: What|Removed |Added Resolution|--- |WORKSFORME Status|UNCONFIRMED |RESOLVED --- Comment #6 from David Malcolm --- -fanalyzer deliberately doesn't warn about memory leaks within "main": https://godbolt.org/z/94ch4T1Ke given that once you exit main, leaks don't matter. Renaming the function to "not_main" shows it complain correctly about the leak: https://godbolt.org/z/d4GM3EdKr
[Bug analyzer/118500] no diagnostics with strsep(3) and [[gnu::malloc(free)]] attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118500 David Malcolm changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2025-01-20 Status|UNCONFIRMED |NEW --- Comment #5 from David Malcolm --- Thanks for filing this report. There are (at least) three -fanalyzer issues here: (a) false positive about leak of 'my_strdup("f,oo")': https://godbolt.org/z/rKxhfxWGf This is probably due to -fanalyzer getting confused by having both the attribute and a function body. I think there's already a report about this in BZ somewhere. (b) -fanalyzer doesn't "know" about the behavior of strsep beyond "knowing" that it doesn't malloc or free anything internally. Hence it doesn't know that it will advance s to a point within the buffer that's not the start (and hence the later "free" is a bug). (c) With the "s++;" case in comment #4, -fanalyzer doesn't warn about free called on a pointer *within* the buffer; it seems like it should. https://godbolt.org/z/dMaGnTEYs
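Point (b) rests on the fact that strsep(3) updates the pointer it is given, so that pointer no longer refers to the start of the allocation. A minimal C sketch of the safe pattern follows; the helper name `count_fields` is hypothetical, and plain `strdup` is used in place of the reporter's `my_strdup`. The `_DEFAULT_SOURCE` define is there to request the `strsep` declaration from glibc.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the comma-separated fields in `text`.
   Returns -1 if the copy cannot be allocated. */
static int count_fields(const char *text)
{
    assert(text != NULL);

    char *buffer = strdup(text);    /* start of the allocation */
    if (buffer == NULL)
        return -1;

    char *cursor = buffer;          /* strsep advances this copy */
    int fields = 0;
    while (strsep(&cursor, ",") != NULL)
        fields++;

    /* `cursor` is NULL here, and between calls it pointed into the middle
       of the buffer; only `buffer` may be passed to free. */
    free(buffer);
    return fields;
}
```

Passing the advanced pointer to `free` instead of `buffer` is the bug that the report would like -fanalyzer to diagnose.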
[Bug d/116632] d_diagnostic_report_diagnostic and non-textual diagnostic output formats (e.g. SARIF)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116632 Iain Buclaw changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #2 from Iain Buclaw --- As per above, I think this is fine, and nothing to fix here.
[Bug other/116613] RFE: support outputting diagnostics in *multiple* formats
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116613 Bug 116613 depends on bug 116632, which changed state. Bug 116632 Summary: d_diagnostic_report_diagnostic and non-textual diagnostic output formats (e.g. SARIF) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116632 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/118572] New: wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c&) == 0x80))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572 Bug ID: 118572 Summary: wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c&) == 0x80)) Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: peter0x44 at disroot dot org Target Milestone: --- bool test(char c) { return (((0x80 & (c&0xff)) != 0) && ((0xc0 & (c&0xff)) == 0x80)); } This is getting optimized to: test(char): xor eax, eax ret And a warning is emitted about the comparison being tautological: :4:38: warning: comparison is always 0 [-Wtautological-compare] 4 | return (((0x80 & (c&0xff)) != 0) && ((0xc0 & (c&0xff)) == 0x80)); |~~^~~~ But I don't think it is (aside from the first half being able to be optimized out) https://gcc.godbolt.org/z/T8TM6n833 Found in upstream code: https://github.com/raysan5/raylib/blob/master/src/rtext.c#L2121-L2138
[Bug tree-optimization/118572] [15 regression] wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c&) == 0x80))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572 Sam James changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=118456 Summary|wrong code for expression |[15 regression] wrong code |((0x80 & c) != 0) && ((0xc0 |for expression ((0x80 & c) |& c) == 0x80)) |!= 0) && ((0xc0 & c&) == ||0x80)) Target Milestone|--- |15.0 CC||aoliva at gcc dot gnu.org, ||jakub at gcc dot gnu.org, ||sjames at gcc dot gnu.org --- Comment #1 from Sam James --- See PR118456 but especially Jakub's comment at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118456#c3.
[Bug tree-optimization/118572] [15 regression] wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c) == 0x80))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org --- Comment #2 from Andrew Pinski --- So we have a&0x80 != 0 (or rather a&0x80 == 0x80) and a&0xc0 == 0x80. So it should optimize to a&0xc0 == 0x80.
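The simplification described in this comment can be checked exhaustively over all byte values. The following C sketch does that; the function names are hypothetical, and the value is taken as an `unsigned` in the range 0..255 so that the signedness of `char` plays no part.

```c
#include <assert.h>
#include <stdbool.h>

/* The expression from the report, applied to a byte value in 0..255. */
static bool both_checks(unsigned value)
{
    assert(value <= 0xff);
    return ((0x80 & value) != 0) && ((0xc0 & value) == 0x80);
}

/* The simplified form: only the second comparison. */
static bool single_check(unsigned value)
{
    assert(value <= 0xff);
    return (0xc0 & value) == 0x80;
}

/* Counts the byte values for which the two forms disagree. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (unsigned value = 0; value <= 0xff; value++)
        if (both_checks(value) != single_check(value))
            mismatches++;
    return mismatches;
}
```

The two forms agree for every byte, and both are true for 0x80, so folding the whole expression to a constant 0 is not a valid simplification.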
[Bug tree-optimization/118572] [15 regression] wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c) == 0x80))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572 --- Comment #3 from Sam James ---
```
__attribute__((noipa))
int test(char c) {
    return (((0x80 & (c&0xff)) != 0) && ((0xc0 & (c&0xff)) == 0x80));
}

int main() {
    if (test(0x80) == 0)
        __builtin_abort();
}
```
[Bug c++/118568] New: Diagnosing more dangling references
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118568 Bug ID: 118568 Summary: Diagnosing more dangling references Product: gcc Version: 14.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: barry.revzin at gmail dot com Target Milestone: --- Consider the following simple program which tries to figure out what situations get dangling reference warnings: #include #include template struct X { T t; }; template auto foo(T const&, T const&) -> R; int main() { [[maybe_unused]] auto const& a = foo(1, 2); // warns [[maybe_unused]] auto const& b = foo>(1, 2); // warns [[maybe_unused]] auto c = foo>(1, 2); // does not warn [[maybe_unused]] auto const& d = foo>(1, 2); // does not warn [[maybe_unused]] auto e = foo>(1, 2); // warns [[maybe_unused]] auto f = foo>(1, 2); // does not warn } Ideally, I would get a warning on all of these. I'm guessing there are some heuristics hard-coded to try to reduce the false positive rate, since otherwise c, e, and f are basically the same thing but only e warns. Likewise, b and c are the same, since b is just a reference bound to a temporary, but only b warns. clang has an attribute for this specific scenario, [[clang::lifetimebound]]. Annotating the two parameters to foo() means I can get warnings on all six. Which is nice, but also a fairly narrow situation though. There was an LLVM thread once upon a time (https://discourse.llvm.org/t/rfc-lifetime-annotations-for-c/61377/1) to make it more generally useful but I'm not sure what came of it.
[Bug tree-optimization/118565] match.pd patterns with non-leaf useless conversions result in SSA copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118565 Andrew Pinski changed: What|Removed |Added Depends on||80342 --- Comment #2 from Andrew Pinski --- Isn't this a dup of bug 80342? Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80342 [Bug 80342] useless outermost conversions not fully elided by genmatch generated code
[Bug c++/118566] 'requires' avoids out-class implemention to find inside-class declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118566 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2025-01-20 Keywords||rejects-valid Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski --- Reduced: ``` template concept t = true; template using ty = int; struct test { // Declaration test(auto&& r) requires t, int>; }; // Implemention test::test(auto&& r) requires t, int> { } ```
[Bug c++/118566] 'requires' avoids out-class implemention to find inside-class declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118566 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #2 from Marek Polacek --- Doesn't look like a regression.
[Bug target/118560] [15 regression] ICE when building powerpc-unknown-linux-gnu cross-compiler since r15-7008
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118560 Jakub Jelinek changed: What|Removed |Added Summary|[15 regression] ICE when|[15 regression] ICE when |building|building |powerpc-unknown-linux-gnu |powerpc-unknown-linux-gnu |cross-compiler |cross-compiler since ||r15-7008 CC||jakub at gcc dot gnu.org Priority|P3 |P1 --- Comment #3 from Jakub Jelinek --- Started with r15-7008-g9f009e8865cda01310c52f7ec8bdaa3c557a2745 Vlad, could you please have a look? Reproducible also on powerpc64-linux with -m32 -O1 on #c1. It doesn't ICE, but takes 0m55.731s to compile, while r15-7007 needed 0m0.028s.
[Bug tree-optimization/118569] New: ICE on valid code at -O3 with "-fno-tree-ch -fno-tree-ccp -fno-tree-fre" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118569 Bug ID: 118569 Summary: ICE on valid code at -O3 with "-fno-tree-ch -fno-tree-ccp -fno-tree-fre" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zhendong.su at inf dot ethz.ch Target Milestone: --- Here is another variant that still triggers:

[598] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/15.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap --enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk --enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 15.0.1 20250120 (experimental) (GCC)
[599] %
[599] % gcctk -O3 -fno-tree-ch -fno-tree-ccp -fno-tree-fre small.c
during GIMPLE pass: cunroll
small.c: In function ‘main’:
small.c:3:5: internal compiler error: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
    3 | int main() {
      |     ^~~~
0x26d8716 internal_error(char const*, ...)
        ../../gcc-trunk/gcc/diagnostic-global-context.cc:517
0xac3780 fancy_abort(char const*, int, char const*)
        ../../gcc-trunk/gcc/diagnostic.cc:1722
0x900647 check_loop_closed_ssa_def
        ../../gcc-trunk/gcc/tree-ssa-loop-manip.cc:647
0x13f17b4 check_loop_closed_ssa_bb
        ../../gcc-trunk/gcc/tree-ssa-loop-manip.cc:661
0x13f33ae verify_loop_closed_ssa(bool, loop*)
        ../../gcc-trunk/gcc/tree-ssa-loop-manip.cc:697
0x13f33ae verify_loop_closed_ssa(bool, loop*)
        ../../gcc-trunk/gcc/tree-ssa-loop-manip.cc:681
0x13d9e19 tree_unroll_loops_completely
        ../../gcc-trunk/gcc/tree-ssa-loop-ivcanon.cc:1623
0x13d9eb5 execute
        ../../gcc-trunk/gcc/tree-ssa-loop-ivcanon.cc:1727
0x13d9eb5 execute
        ../../gcc-trunk/gcc/tree-ssa-loop-ivcanon.cc:1717
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
[600] %
[600] % cat small.c
volatile int a;
int b, c, d, e, f, g;
int main() {
  int i = 2, j = 1;
k:
  if (!e)
    ;
  else {
    short l = 1;
    if (0)
    m:
      d = g;
    f = 0;
    for (; f < 2; f++) {
      if (f)
        for (; j < 2; j++)
          if (i)
            goto m;
      a;
      if (l)
        continue;
      i = 0;
      while (c)
        l++;
    }
    g = 0;
  }
  if (b) {
    i = 1;
    goto k;
  }
  return 0;
}
[Bug tree-optimization/118569] [15 regression] ICE on valid code at -O3 with "-fno-tree-ch -fno-tree-ccp -fno-tree-fre" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118569

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |ice-on-valid-code
            Summary|ICE on valid code at -O3    |[15 regression] ICE on
                   |with "-fno-tree-ch          |valid code at -O3 with
                   |-fno-tree-ccp               |"-fno-tree-ch -fno-tree-ccp
                   |-fno-tree-fre" on           |-fno-tree-fre" on
                   |x86_64-linux-gnu: in        |x86_64-linux-gnu: in
                   |check_loop_closed_ssa_def,  |check_loop_closed_ssa_def,
                   |at                          |at
                   |tree-ssa-loop-manip.cc:647  |tree-ssa-loop-manip.cc:647
   Target Milestone|---                         |15.0
            Version|unknown                     |15.0
                 CC|                            |rguenth at gcc dot gnu.org
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=114948,
                   |                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=116796
[Bug d/114434] gdc.test/runnable/test23514.d FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114434 --- Comment #8 from GCC Commits --- The master branch has been updated by Iain Buclaw : https://gcc.gnu.org/g:9ab38952a2033d6d4a8e31c3c4d2ab1a25a406c6 commit r15-7071-g9ab38952a2033d6d4a8e31c3c4d2ab1a25a406c6 Author: Iain Buclaw Date: Mon Jan 20 20:01:03 2025 +0100 d: Fix failing test with 32-bit compiler [PR114434] Since the introduction of gdc.test/runnable/test23514.d, it's exposed an incorrect compilation when adding a 64-bit constant to a link-time address. The current cast to size_t causes a loss of precision, which can result in incorrect compilation. PR d/114434 gcc/d/ChangeLog: * expr.cc (ExprVisitor::visit (PtrExp *)): Get the offset as a dinteger_t rather than a size_t. (ExprVisitor::visit (SymOffExp *)): Likewise.
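For readers unfamiliar with the failure mode, here is a minimal sketch in C of the precision loss the commit message describes. It is not taken from the D front end: `host_size_t` and both function names are hypothetical, and `uint32_t` merely stands in for `size_t` on a 32-bit host.

```c
#include <stdint.h>

/* Hypothetical stand-in for size_t on a 32-bit host compiler. */
typedef uint32_t host_size_t;

/* Narrowing the 64-bit offset first silently drops its upper 32 bits. */
static uint64_t offset_via_narrow_cast(uint64_t offset)
{
    return (uint64_t)(host_size_t)offset;
}

/* Carrying the offset in a 64-bit type throughout preserves every bit. */
static uint64_t offset_via_wide_type(uint64_t offset)
{
    return offset;
}
```

With an offset such as 0x100000004, the narrowed variant yields 4 while the wide variant keeps the full value.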
[Bug tree-optimization/118572] [15 regression] wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c) == 0x80))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572 --- Comment #4 from Peter Damianov --- It gets optimized correctly with `-funsigned-char`, so I think this is something signedness related.
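A small exhaustive check of the expression from the summary may help when comparing signed and unsigned builds. This is only a sketch, not the reporter's testcase: all function names are made up for illustration, and the reference form `(byte & 0xc0) == 0x80` is the simplification one would expect a correct compiler to be entitled to make, since the first test is implied by the second.

```c
/* The expression from the bug summary, evaluated on a signed char. */
static int is_cont_signed(signed char c)
{
    return ((0x80 & c) != 0) && ((0xc0 & c) == 0x80);
}

/* The same expression on an unsigned char, which is what plain char
   becomes under -funsigned-char. */
static int is_cont_unsigned(unsigned char c)
{
    return ((0x80 & c) != 0) && ((0xc0 & c) == 0x80);
}

/* Reference form: the first comparison is implied by the second. */
static int is_cont_reference(unsigned int byte)
{
    return (byte & 0xc0u) == 0x80u;
}

/* Count the byte values for which either variant disagrees with the
   reference. */
static int count_mismatches(void)
{
    int bad = 0;
    for (unsigned int b = 0; b < 256; b++) {
        int expect = is_cont_reference(b);
        if (is_cont_signed((signed char)b) != expect)
            bad++;
        if (is_cont_unsigned((unsigned char)b) != expect)
            bad++;
    }
    return bad;
}
```

On a correctly compiled build `count_mismatches()` should return 0 at every optimization level; a nonzero result only for the signed variant would be consistent with the signedness theory above.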
[Bug libstdc++/118563] [15 regression] libstdc++ incompatible ABI change on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118563

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |redi at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek ---
If those were added on riscv in a different version from other arches, then it
needs to conditionalize it in libstdc++-v3/config/abi/pre/gnu.ver accordingly.

CXXABI_1.3.14 {

    # typeinfo for _Float{16,32,64,128,32x,64x,128x} and
    # __bf16
    _ZTIDF[0-9]*[_bx];
    _ZTIPDF[0-9]*[_bx];
    _ZTIPKDF[0-9]*[_bx];
    _ZTIu6__bf16;
    _ZTIPu6__bf16;
    _ZTIPKu6__bf16;

} CXXABI_1.3.13;

is what it currently has, _ZTIDF[0-9]b were meant for targets with __bf16
support with the standard mangling, _ZTI*u6__bf16 with the arm mangling.
So minimal change could be to

    _ZTIDF[0-9]*[_x];
    _ZTIPDF[0-9]*[_x];
    _ZTIPKDF[0-9]*[_x];
#ifndef __riscv
    _ZTIDF[0-9]*b;
    _ZTIPDF[0-9]*b;
    _ZTIPKDF[0-9]*b;
#endif
    _ZTIu6__bf16;
    _ZTIPu6__bf16;
    _ZTIPKu6__bf16;

and add CXXABI_1.3.16 with something in it just for riscv.
Will defer details to Jon or riscv maintainers.
[Bug c++/118214] [15 regression] OpenTTD test failure with C++ large initializer since r15-6339-g40f243e9179667
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118214

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #10 from Jakub Jelinek ---
Fixed.
[Bug tree-optimization/118569] [15 regression] ICE on valid code at -O3 with "-fno-tree-ch -fno-tree-ccp -fno-tree-fre" on x86_64-linux-gnu: in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:647
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118569

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
   Last reconfirmed|                            |2025-01-20
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
            Summary|[15 regression] ICE on      |[15 regression] ICE on
                   |valid code at -O3 with      |valid code at -O3 with
                   |"-fno-tree-ch -fno-tree-ccp |"-fno-tree-ch -fno-tree-ccp
                   |-fno-tree-fre" on           |-fno-tree-fre" on
                   |x86_64-linux-gnu: in        |x86_64-linux-gnu: in
                   |check_loop_closed_ssa_def,  |check_loop_closed_ssa_def,
                   |at                          |at
                   |tree-ssa-loop-manip.cc:647  |tree-ssa-loop-manip.cc:647
                   |                            |since r15-80

--- Comment #1 from Jakub Jelinek ---
Started with r15-80-g0ade358cd72ffa591dd2f1404765b379bbf709d4
[Bug c++/118528] [15 Regression] Template argument deduction failure with RAW_DATA_CST
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118528 --- Comment #2 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:0b58219fe112c01ff335edf699c4fc69e718c75b commit r15-7069-g0b58219fe112c01ff335edf699c4fc69e718c75b Author: Jakub Jelinek Date: Mon Jan 20 18:00:43 2025 +0100 c++: Handle RAW_DATA_CST in make_tree_vector_from_ctor [PR118528] This is the first bug discovered today with the https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673945.html hack but then turned into proper testcases where embed-21.C FAILed since introduction of optimized #embed support and the other when optimizing large C++ initializers using RAW_DATA_CST. The problem is that the C++ FE calls make_tree_vector_from_ctor and uses that as arguments vector for deduction guide handling. The call.cc code isn't prepared to handle RAW_DATA_CST just about everywhere, so I think it is safer to make sure RAW_DATA_CST only appears in CONSTRUCTOR_ELTS and nowhere else. Thus, the following patch expands the RAW_DATA_CSTs from initializers into multiple INTEGER_CSTs in the returned vector. 2025-01-20 Jakub Jelinek PR c++/118528 * c-common.cc (make_tree_vector_from_ctor): Expand RAW_DATA_CST elements from the CONSTRUCTOR to individual INTEGER_CSTs. * g++.dg/cpp/embed-21.C: New test. * g++.dg/cpp2a/class-deduction-aggr16.C: New test.
[Bug tree-optimization/118570] New: -O2 much faster than -O3 for Romberg's method
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118570

            Bug ID: 118570
           Summary: -O2 much faster than -O3 for Romberg's method
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 60215
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60215&action=edit
rom3.c

This came up in #gcc on libera where trn reported an oddity, where -O2 is
significantly faster than -O3:

```
$ gcc -O2 rom3.c -o rom3 -lm && time ./rom3
rom3.c: In function ‘romberg’:
rom3.c:33:1: warning: old-style function definition [-Wold-style-definition]
   33 | romberg(a, b, acc)
      | ^~~
9558673.398323269560933

real    0m0.008s
user    0m0.002s
sys     0m0.006s

$ gcc -O3 rom3.c -o rom3 -lm && time ./rom3
rom3.c: In function ‘romberg’:
rom3.c:33:1: warning: old-style function definition [-Wold-style-definition]
   33 | romberg(a, b, acc)
      | ^~~
9558673.398323269560933

real    0m1.674s
user    0m1.658s
sys     0m0.004s
```

This is with recent trunk but I see it on older releases too.

-O3 -fno-ipa-cp gives the same performance as -O2.
[Bug testsuite/113425] gcc.dg/fold-copysign-1.c fails on arm since g:7cbe41d35e6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113425

--- Comment #6 from Tamar Christina ---
(In reply to Torbjorn SVENSSON from comment #5)
> @Tamar: You can see the same fails with 14.2.Rel1 that is available for
> download from the Arm webpage.
>
> I see the following in my gcc.log for Cortex-A7 with -mfloat-abi=hard
> -mfpu=nenon:
>
> gcc.dg/fold-copysign-1.c: pattern found 0 times
> FAIL: gcc.dg/fold-copysign-1.c scan-tree-dump-times cddce1
> "__builtin_copysign" 1
> gcc.dg/fold-copysign-1.c: pattern found 2 times
> FAIL: gcc.dg/fold-copysign-1.c scan-tree-dump-times cddce1 "= ABS_EXPR" 1
>
> And for Cortex-M3/4/7/33/55/85 and Cortex-A7 with -mfloat-abi=soft:
>
> gcc.dg/fold-copysign-1.c: pattern found 0 times
> FAIL: gcc.dg/fold-copysign-1.c scan-tree-dump-times cddce1 "= -" 1
> gcc.dg/fold-copysign-1.c: pattern found 1 times
> FAIL: gcc.dg/fold-copysign-1.c scan-tree-dump-times cddce1 "= ABS_EXPR" 2
>

It looks like the optimization applied to both of them. So I guess
check_effective_target_ifn_copysign needs something other than arm_neon.

>
>
> Pardon me for perhaps a stupid side question, but what happens with the
> stack in the "bar" function of this testcase?
>
> bar:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         push    {fp}            @ 27    [c=8 l=4]  *push_multi
>         orr     r1, r1, #-2147483648    @ 12    [c=4 l=4]  *iorsi3_insn/0
>         ldr     fp, [sp], #4    @ 30    [c=12 l=4]  *thumb2_movsi_insn/5
>         bx      lr              @ 31    [c=8 l=4]  *thumb2_return
>
> Lets say that there is an interrupt firing when PC points at the "orr"
> instruction. Will "fp" have the right value after the handler returns or can
> it get corrupted by the interrupt handler in that case? (I'm assuming the
> same SP is used for both the execution of "bar" and the handler and that the
> handler code does not have any stack corruption on its own.)

fp needs to have the same value as otherwise any fp relative addressing would be
broken.

bar doesn't need a framepointer but it's saving it because the original
copysign was a function call and fp never got cleaned up.
[Bug target/118531] [14/15 Regression] aarch64/ins_bitfield_1.c generates INS instructions even for +nosimd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118531 --- Comment #2 from GCC Commits --- The trunk branch has been updated by Richard Sandiford : https://gcc.gnu.org/g:1b8820421488d220a95f651b51175d618063c48c commit r15-7072-g1b8820421488d220a95f651b51175d618063c48c Author: Richard Sandiford Date: Mon Jan 20 19:52:30 2025 + aarch64: Add missing simd requirements for INS [PR118531] In g:b096a6ebe9d9f9fed4c105f6555f724eb32af95c I'd forgotten to gate some uses of INS on TARGET_SIMD. gcc/ PR target/118531 * config/aarch64/aarch64.md (*insv_reg_) (*aarch64_bfi_) (*aarch64_bfidi_subreg_): Add missing simd requirements. gcc/testsuite/ * gcc.target/aarch64/ins_bitfield_1a.c: New test. * gcc.target/aarch64/ins_bitfield_3a.c: Likewise. * gcc.target/aarch64/ins_bitfield_5a.c: Likewise.
[Bug target/118384] unexpected call to __muldi3 generated for riscv target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118384

--- Comment #8 from GCC Commits ---
The trunk branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:8edf8b552313951cb4f2f97821ee4b3820c9506b

commit r15-7074-g8edf8b552313951cb4f2f97821ee4b3820c9506b
Author: Richard Sandiford
Date: Mon Jan 20 19:52:31 2025 +

    vect: Preserve OMP info for conditional stores [PR118384]

    OMP reductions are lowered into the form:

        idx = .OMP_SIMD_LANE (simuid, 0);
        ...
        oldval = D.anon[idx];
        newval = oldval op ...;
        D.anon[idx] = newval;

    So if the scalar loop has a {0, +, 1} iv i, idx = i % vf.
    Despite this wraparound, the vectoriser pretends that the D.anon
    accesses are linear.  It records the .OMP_SIMD_LANE's second
    argument (val) in the data_reference aux field (-1 - val) and then
    copies this to the stmt_vec_info simd_lane_access_p field (val + 1).

    vectorizable_load and vectorizable_store use simd_lane_access_p
    to detect accesses of this form and suppress the vector pointer
    increments that would be used for genuine linear accesses.

    The difference in this PR is that the reduction is conditional,
    and so the store back to D.anon is recognised as a conditional
    store pattern.  simd_lane_access_p was not being copied across
    from the original stmt_vec_info to the pattern stmt_vec_info,
    meaning that it was vectorised as a normal linear store.

    gcc/
            PR tree-optimization/118384
            * tree-vectorizer.cc (vec_info::move_dr): Copy
            STMT_VINFO_SIMD_LANE_ACCESS_P.

    gcc/testsuite/
            PR tree-optimization/118384
            * gcc.target/aarch64/pr118384_1.c: New test.
            * gcc.target/aarch64/pr118384_2.c: Likewise.
[Bug target/118501] [14/15 regression] aarch64: ICE in simplify_context::simplify_subreg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118501 --- Comment #7 from GCC Commits --- The trunk branch has been updated by Richard Sandiford : https://gcc.gnu.org/g:6612b8e55471fabd2071a9637a06d3ffce2b05a6 commit r15-7073-g6612b8e55471fabd2071a9637a06d3ffce2b05a6 Author: Richard Sandiford Date: Mon Jan 20 19:52:31 2025 + aarch64: Fix invalid subregs in xorsign [PR118501] In the testcase, we try to use xorsign on: (subreg:DF (reg:TI R) 8) i.e. the highpart of the TI. xorsign wants to take a V2DF paradoxical subreg of this, which is rightly rejected as a direct operation. In cases like this, we need to force the highpart into a fresh register first. gcc/ PR target/118501 * config/aarch64/aarch64.md (@xorsign3): Use force_lowpart_subreg. gcc/testsuite/ PR target/118501 * gcc.c-torture/compile/pr118501.c: New test.
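As background for the commit above, this is a scalar sketch in C of what an xorsign operation computes, assuming its usual meaning of x * copysign(1.0, y): the sign bit of y is XORed into x. The function name is hypothetical and the code says nothing about the subreg handling the patch fixes; it also assumes a 64-bit IEEE double.

```c
#include <stdint.h>
#include <string.h>

/* Flip the sign of x whenever y is negative by XORing y's sign bit
   into x.  Illustrative only; assumes a 64-bit IEEE double. */
static double xorsign_bits(double x, double y)
{
    uint64_t xb, yb;
    memcpy(&xb, &x, sizeof xb);
    memcpy(&yb, &y, sizeof yb);
    xb ^= yb & UINT64_C(0x8000000000000000);   /* keep only the sign bit of y */
    memcpy(&x, &xb, sizeof x);
    return x;
}
```

For example, `xorsign_bits(3.0, -2.0)` gives -3.0 and `xorsign_bits(-3.0, -2.0)` gives 3.0, while a positive y leaves x unchanged.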
[Bug d/114434] gdc.test/runnable/test23514.d FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114434 --- Comment #9 from GCC Commits --- The releases/gcc-14 branch has been updated by Iain Buclaw : https://gcc.gnu.org/g:ffa44df6768368dc516c9626ec388a3561c7644f commit r14-11230-gffa44df6768368dc516c9626ec388a3561c7644f Author: Iain Buclaw Date: Mon Jan 20 20:01:03 2025 +0100 d: Fix failing test with 32-bit compiler [PR114434] Since the introduction of gdc.test/runnable/test23514.d, it's exposed an incorrect compilation when adding a 64-bit constant to a link-time address. The current cast to size_t causes a loss of precision, which can result in incorrect compilation. PR d/114434 gcc/d/ChangeLog: * expr.cc (ExprVisitor::visit (PtrExp *)): Get the offset as a dinteger_t rather than a size_t. (ExprVisitor::visit (SymOffExp *)): Likewise. (cherry picked from commit 9ab38952a2033d6d4a8e31c3c4d2ab1a25a406c6)
[Bug target/118384] unexpected call to __muldi3 generated for riscv target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118384

Richard Sandiford changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #9 from Richard Sandiford ---
Gah, sorry, the above should have been for PR118348.
[Bug target/118348] [15 Regression] [SVE] HACCKernels seems to miscompile with VLS SVE after 0c5c0c959c2e592b84739f19ca771fa69eb8dfee
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118348

--- Comment #7 from GCC Commits ---
The trunk branch has been updated by Richard Sandiford:

https://gcc.gnu.org/g:749dcd9ba8466fec5b51dd564cd63424c44f808b

commit r15-7076-g749dcd9ba8466fec5b51dd564cd63424c44f808b
Author: Richard Sandiford
Date: Mon Jan 20 20:05:05 2025 +

    vect: Preserve OMP info for conditional stores [PR118348]

    OMP reductions are lowered into the form:

        idx = .OMP_SIMD_LANE (simuid, 0);
        ...
        oldval = D.anon[idx];
        newval = oldval op ...;
        D.anon[idx] = newval;

    So if the scalar loop has a {0, +, 1} iv i, idx = i % vf.
    Despite this wraparound, the vectoriser pretends that the D.anon
    accesses are linear.  It records the .OMP_SIMD_LANE's second
    argument (val) in the data_reference aux field (-1 - val) and then
    copies this to the stmt_vec_info simd_lane_access_p field (val + 1).

    vectorizable_load and vectorizable_store use simd_lane_access_p
    to detect accesses of this form and suppress the vector pointer
    increments that would be used for genuine linear accesses.

    The difference in this PR is that the reduction is conditional,
    and so the store back to D.anon is recognised as a conditional
    store pattern.  simd_lane_access_p was not being copied across
    from the original stmt_vec_info to the pattern stmt_vec_info,
    meaning that it was vectorised as a normal linear store.

    gcc/
            PR tree-optimization/118348
            * tree-vectorizer.cc (vec_info::move_dr): Copy
            STMT_VINFO_SIMD_LANE_ACCESS_P.

    gcc/testsuite/
            PR tree-optimization/118348
            * gcc.target/aarch64/pr118348_1.c: New test.
            * gcc.target/aarch64/pr118348_2.c: Likewise.
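The lowered form described in the commit message can be modelled in plain C. The following is only a sketch under stated assumptions: `VF`, `lanes` and both function names are hypothetical, `lanes` plays the role of D.anon, and `i % VF` stands in for what .OMP_SIMD_LANE yields for a {0, +, 1} iv. It shows why the per-lane accesses wrap around rather than advance linearly, and where the conditional store sits.

```c
#include <stddef.h>

enum { VF = 4 };   /* hypothetical vectorisation factor */

/* Conditional sum reduction in the lowered shape: one private
   accumulator per SIMD lane, combined after the loop. */
static int conditional_sum_lowered(const int *data, const int *mask, size_t n)
{
    int lanes[VF] = {0};                /* models D.anon */
    for (size_t i = 0; i < n; i++) {
        size_t idx = i % VF;            /* lane index wraps around */
        int oldval = lanes[idx];
        int newval = oldval + data[i];
        if (mask[i])                    /* the conditional store back */
            lanes[idx] = newval;
    }
    int total = 0;
    for (size_t l = 0; l < VF; l++)
        total += lanes[l];
    return total;
}

/* Straightforward scalar version for comparison. */
static int conditional_sum_scalar(const int *data, const int *mask, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        if (mask[i])
            total += data[i];
    return total;
}
```

Both functions should agree for any input; treating the `lanes[idx]` store as a genuine linear access, which is what the missing simd_lane_access_p copy caused, would instead step past the VF accumulators.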
[Bug fortran/118571] UTF-8 output and the A edit descriptor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118571

--- Comment #2 from kargls at comcast dot net ---
Tracing into libgfortran, the bug appears to be in
write.c(write_utf8_char4). In particular, the entire string is
written due to line 181. The 'src_len' is likely wrong if one
has an A edit descriptor with a width. Perhaps, it should be
'w_len < src_len ? w_len : src_len'

  /* Now process the remaining characters, one at a time. */
  for (j = k; j < src_len; j++)
    {
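The suggested bound can be tried out in isolation. The sketch below is not libgfortran code: `chars_to_emit`, `encode_utf8` and `write_clamped` are hypothetical helpers, and the blank padding that applies when the width exceeds the string length is deliberately left out. It only shows the loop limit being clamped to the smaller of the field width and the source length, counted in characters rather than bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* The bound proposed in the comment above. */
static size_t chars_to_emit(size_t src_len, size_t w_len)
{
    return w_len < src_len ? w_len : src_len;
}

/* Encode one code point as UTF-8; returns the number of bytes written,
   or 0 if the code point is out of range. */
static size_t encode_utf8(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Write at most w_len characters of a 4-byte-character string as UTF-8.
   The caller must supply room for 4 bytes per emitted character. */
static size_t write_clamped(const uint32_t *src, size_t src_len,
                            size_t w_len, unsigned char *out)
{
    size_t limit = chars_to_emit(src_len, w_len);
    size_t n = 0;
    for (size_t j = 0; j < limit; j++)
        n += encode_utf8(src[j], out + n);
    return n;
}
```

With a four-character source and a width of 3, only the first three characters are encoded, however many bytes each of them needs.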
[Bug rtl-optimization/118067] [15 Regression] ICE: in lra_split_hard_reg_for, at lra-assigns.cc:1860 (unable to find a register to spill) {*lshrhi3_1} with -O -fno-split-wide-types -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118067 --- Comment #17 from GCC Commits --- The releases/gcc-12 branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:9a1efd1ee2509abb93878bd911d8c07143b10e33 commit r12-10920-g9a1efd1ee2509abb93878bd911d8c07143b10e33 Author: Uros Bizjak Date: Mon Jan 20 16:19:43 2025 +0100 i386: Disable SImode/DImode moves from/to mask regs without avx512bw [PR118067] SImode and DImode moves from/to mask registers are valid only with AVX512BW, so mark relevant alternatives in *movsi_internal and *movdi_internal as such. PR target/118067 gcc/ChangeLog: * config/i386/i386.md (*movdi_internal): Disable alternatives from/to mask registers without AVX512BW. (*movsi_internal): Ditto.
[Bug tree-optimization/118572] [15 regression] wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c) == 0x80))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2025-01-20
             Status|UNCONFIRMED                 |NEW

--- Comment #7 from Andrew Pinski ---
Confirmed.
[Bug tree-optimization/118572] [15 regression] wrong code for expression ((0x80 & c) != 0) && ((0xc0 & c) == 0x80))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118572 --- Comment #6 from Sam James --- Thanks, I'd meant to do that and forgot.