[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 Tamar Christina changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #29 from Tamar Christina --- (In reply to rguent...@suse.de from comment #28) > On Mon, 26 Feb 2024, tnfchris at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 > > > > --- Comment #27 from Tamar Christina --- > > Created attachment 57538 [details] > > --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit > > proposed1.patch > > > > proposed patch, this gets the gathers and scatters back. doing regression > > run. > > I don't think this will fly. Well, I don't really know what to do here, I guess. Per the discussion on IRC, we only used to try gather/scatters when SCEV fails. Now that it succeeds we no longer try using the pattern and instead try to handle it during vectorizable_load/vectorizable_store by recognizing the gather/scatters inline through VMAT_GATHER_SCATTER. This works fine for normal gathers and scatters but doesn't work for widening gathers and narrowing scatters, which only the pattern seems to handle. I don't know how to get this detected through get_load_store_type since, well, that's very late: among other things, we've already determined the VF and the unpacks have already been marked relevant. So vectorizable_load/vectorizable_store would have to actively change the IL. So I don't know how widening and narrowing operations are supposed to work here. Given that, I will leave it up to the maintainers, I guess.
[Bug target/94789] Failure to take advantage of shift operand semantics to turn subtraction into negate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94789 Andrew Pinski changed: What|Removed |Added Target|x86_64-*-* i?86-*-* aarch64 |x86_64-*-* i?86-*-* --- Comment #5 from Andrew Pinski --- (In reply to Wilco from comment #4) > AArch64 already generates: > > neg w1, w1 > lsl w0, w0, w1 > ret That is because aarch64 has a pattern to optimize this explicitly: (insn 14 9 15 2 (set (reg/i:SI 0 x0) (ashift:SI (reg:SI 108) (minus:QI (const_int 32 [0x20]) (subreg:QI (reg:SI 109) 0 "/app/example.cpp":5:1 744 {*aarch64_ashl_reg_minussi3} (expr_list:REG_DEAD (reg:SI 108) (expr_list:REG_DEAD (reg:SI 109) (nil Which was added in r8-3672-g59abe903987d61 . Maybe the x86_64 backend could do a similar thing?
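The transformation discussed above rests on the shift count being taken modulo the register width, so that (32 - n) and (-n) select the same shift. A minimal C sketch of that equivalence follows, assuming 32-bit operands; the function names are hypothetical, and the explicit `& 31u` masks stand in for what the hardware shift instructions do implicitly (without them, a count of 32 would be undefined in C).

```c
#include <stdint.h>

/* Shift left by (32 - n), with the count masked to 5 bits, mimicking how
   a variable-shift instruction only looks at the low bits of the count. */
uint32_t shl_by_32_minus_n(uint32_t x, uint32_t n)
{
    return x << ((32u - n) & 31u);
}

/* The same shift written with a negated count. (32 - n) and (0 - n) are
   congruent modulo 32, so the subtraction can be replaced by a negate. */
uint32_t shl_by_neg_n(uint32_t x, uint32_t n)
{
    return x << ((0u - n) & 31u);
}
```

Both functions should return the same value for every `x` and `n`, which is why a backend pattern can emit a `neg` followed by the shift instead of materializing the constant 32.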
[Bug target/95341] Poor vector_size decomposition when SVE is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95341 Andrew Pinski changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |14.0 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=112787 --- Comment #5 from Andrew Pinski --- Fixed by r14-6752-ga3ff76278efe00 for GCC 14.
[Bug middle-end/88670] [meta-bug] generic vector extension issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670 Bug 88670 depends on bug 95341, which changed state. Bug 95341 Summary: Poor vector_size decomposition when SVE is enabled https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95341 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/113441] [14 Regression] Fail to fold the last element with multiple loop since g:2efe3a7de0107618397264017fb045f237764cc7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441 --- Comment #30 from Richard Biener --- The x86 and "emulation" paths handle narrowing/widening during code generation (but yes, the IFN path doesn't). A fix would be to do similar as for the gs_info.decl case in vectorizable_load/store and handle select cases of widening/narrowing (2x) and adjust vect_check_gather_scatter accordingly. That might be against the spirit of how the IFN support was laid out (possibly to be "cleaner"), but I don't see a good way to avoid the very premature (during pattern selection) load/store vectorization choosing for the cases there are multiple possibilities as seen here.
[Bug middle-end/114081] [14 regression] ICE in verify_dominators when building php-8.3.3 (error: dominator of 16 should be 111, not 3) since r14-6822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114081 --- Comment #7 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:8a5d9409584aeb777b06f9c19c7d1a3552d496ad commit r14-9191-g8a5d9409584aeb777b06f9c19c7d1a3552d496ad Author: Richard Biener Date: Mon Feb 26 15:17:43 2024 +0100 tree-optimization/114081 - dominator update for prologue peeling The following implements manual update for multi-exit loop prologue peeling during vectorization. PR tree-optimization/114081 * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg): Perform manual dominator update for prologue peeling. (vect_do_peeling): Properly update dominators after adding the prologue-around guard. * gcc.dg/vect/vect-early-break_121-pr114081.c: New testcase.
[Bug middle-end/114081] [14 regression] ICE in verify_dominators when building php-8.3.3 (error: dominator of 16 should be 111, not 3) since r14-6822
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114081 Richard Biener changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #8 from Richard Biener --- The testcase is now fixed for me.
[Bug target/96463] [SVE] Optimise svld1rq from vectors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96463 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |13.0 Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #2 from Andrew Pinski --- Fixed.
[Bug c++/114114] [11/12/13/14 Regression] Internal compiler error on function-local conditional noexcept
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114114 --- Comment #3 from Yves Bailly --- Due credits to Stefano Bellotti for writing the code that triggers the ICE - I only did the paperwork.
[Bug tree-optimization/114120] add reduction with promotion and then truncation poorly vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114120 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-27 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #1 from Richard Biener --- I think I've seen a duplicate for this. We lack a pass replacing an IV (a PHI) based on how that is used outside of the loop. Basically we fail to treat PHIs transparently when folding conversions. This _might_ be sth for IVCANON since I think it doesn't really fit any other pass. It also came up in the context of int f (unsigned *src) { int sum = 0; for (int y = 0; y < 8; y++) { sum += src[y]; } return sum; } which we handle fine in vectorization but still the reduction could be done in 'unsigned' all the way through (and that conversion handling in the vectorizer reduction code is somewhat ugly).
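To make the last point concrete, here is a small sketch that puts the loop from the comment next to a hand-written variant that keeps the reduction in 'unsigned' and converts once at the end. The names `sum_in_int` and `sum_in_unsigned` are hypothetical, and the second function only illustrates the idea; it is not what any GCC pass currently produces.

```c
/* The loop from the comment: unsigned elements accumulated into an int.
   Each iteration converts sum to unsigned, adds, and converts back. */
int sum_in_int(const unsigned *src)
{
    int sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y];
    return sum;
}

/* Hand-written variant: the reduction stays in unsigned all the way
   through and is converted to int only once, after the loop. */
int sum_in_unsigned(const unsigned *src)
{
    unsigned sum = 0;
    for (int y = 0; y < 8; y++)
        sum += src[y];
    return (int) sum;
}
```

For inputs whose total fits in an int the two functions agree; for larger totals both rely on the same implementation-defined unsigned-to-int conversion.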
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #7 from Richard Biener --- I will have a look.
[Bug target/98532] Use load/store pairs for 2-element vector in memory permutes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98532 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|--- |12.0 Status|NEW |RESOLVED --- Comment #4 from Andrew Pinski --- Fixed by enabling SLP at -O2. Though this could be improved without the SLP. _1 = BIT_FIELD_REF <*a_4(D), 64, 64>; _2 = BIT_FIELD_REF <*a_4(D), 64, 0>; tmp_5 = {_1, _2}; Could be turned into VEC_PERM<*a_4(D), {1, 0}> earlier on. But I doubt that it will matter so much really.
[Bug target/114122] RISC-V: poor code generation in calling convention with vlen > 4096
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114122 Richard Biener changed: What|Removed |Added Keywords||missed-optimization --- Comment #1 from Richard Biener --- Note it's always difficult if you rely on argument passing / return that is outside of the ABI specification for the platform so I'd advise against such interfaces. Instead I'd suggest going with by-reference argument/return.
[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #7 from Andrew Pinski --- >Maybe the issue is only with arguments now. Actually I think this is still a subreg vs ra issue. (insn 8 5 9 2 (set (subreg:V16QI (reg/v:V2x16QI 100 [ __tab ]) 0) (reg/v:V16QI 102 [ lo ])) -1 (nil)) (insn 9 8 10 2 (set (subreg:V16QI (reg/v:V2x16QI 100 [ __tab ]) 16) (reg/v:V16QI 103 [ hi ])) -1 (nil)) (insn 10 9 11 2 (set (reg:V16QI 101 [ ]) (unspec:V16QI [ (reg/v:V2x16QI 100 [ __tab ]) (reg/v:V16QI 104 [ idx ]) ] UNSPEC_TBL)) "/opt/compiler-explorer/arm64/gcc-trunk-20240227/aarch64-unknown-linux-gnu/lib/gcc/aarch64-unknown-linux-gnu/14.0.1/include/arm_neon.h":19566:43 -1 (nil))
[Bug target/99161] Suboptimal SVE code for ld4/st4 MLA code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99161 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|--- |13.0 Status|UNCONFIRMED |RESOLVED --- Comment #2 from Andrew Pinski --- Fixed in GCC 13.
[Bug target/106694] Redundant move instructions in ARM SVE intrinsics use cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106694 Bug 106694 depends on bug 99161, which changed state. Bug 99161 Summary: Suboptimal SVE code for ld4/st4 MLA code https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99161 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug target/99195] Optimise away vec_concat of 64-bit AdvancedSIMD operations with zeroes in aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99195 --- Comment #20 from Andrew Pinski --- Are there any remaining patterns that need vczle/vczbe added to them? Otherwise please close this as fixed for GCC 14.
[Bug target/100165] fmov could be used to zero out the upper bits instead of movi/zip or movi/ins with __builtin_shuffle and zero vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100165 --- Comment #5 from Andrew Pinski --- For the ones which produce ins, it should be easy to modify the pattern to emit fmov for those cases, that is `elt == 0`: (define_insn "aarch64_simd_vec_set_zero" [(set (match_operand:VALLS_F16 0 "register_operand" "=w") (vec_merge:VALLS_F16 (match_operand:VALLS_F16 1 "aarch64_simd_imm_zero" "") (match_operand:VALLS_F16 3 "register_operand" "0") (match_operand:SI 2 "immediate_operand" "i")))] "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0" { int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2]))); operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt); return "ins\\t%0.[%p2], zr"; } )
[Bug target/110411] ICE on simple memcpy test case when allowing generation of vector pair load/store insns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110411 --- Comment #7 from GCC Commits --- The releases/gcc-11 branch has been updated by jeevitha : https://gcc.gnu.org/g:41af48a1750635a72c48a5809e713d9dd14d9655 commit r11-11257-g41af48a1750635a72c48a5809e713d9dd14d9655 Author: Jeevitha Date: Thu Aug 31 05:40:18 2023 -0500 rs6000: Don't allow AltiVec address in movoo & movxo pattern [PR110411] There are no instructions that do traditional AltiVec addresses (i.e. with the low four bits of the address masked off) for OOmode and XOmode objects. The solution is to modify the constraints used in the movoo and movxo pattern to disallow these types of addresses, which assists LRA in resolving this issue. Furthermore, the mode size 16 check has been removed in vsx_quad_dform_memory_operand to allow OOmode and XOmode, and quad_address_p already handles less than size 16. 2023-08-31 Jeevitha Palanisamy gcc/ PR target/110411 * config/rs6000/mma.md (define_insn_and_split movoo): Disallow AltiVec address operands. (define_insn_and_split movxo): Likewise. * config/rs6000/predicates.md (vsx_quad_dform_memory_operand): Remove redundant mode size check. gcc/testsuite/ PR target/110411 * gcc.target/powerpc/pr110411-1.c: New testcase. * gcc.target/powerpc/pr110411-2.c: New testcase. (cherry picked from commit 9ea1248604d7b65009af32103814332f35bd33e2)
[Bug tree-optimization/100745] GCC generates suboptimal assembly from vector extensions on AArch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100745 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-02-27 Component|target |tree-optimization Status|UNCONFIRMED |NEW --- Comment #3 from Andrew Pinski --- ``` # vsum$0_107 = PHI <_47(11), _29(10)> _232 = BIT_FIELD_REF ; _231 = .FMA (_100, _101, _232); _230 = BIT_FIELD_REF ; _229 = .FMA (_234, _235, _230); ... _47 = {_231, _229}; ... ``` Confirmed, I thought I saw this before, basically inside the loop we keep together the generic vector still and this causes stores IIRC.
[Bug rtl-optimization/114044] ICE: in expand_fn_using_insn, at internal-fn.cc:208 with _BitInt() and -O -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114044 --- Comment #5 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:c3c44c01d20b00ab5228f32596153b7f4cbc6036 commit r14-9192-gc3c44c01d20b00ab5228f32596153b7f4cbc6036 Author: Jakub Jelinek Date: Tue Feb 27 09:52:07 2024 +0100 expand: Add trivial folding for bit query builtins at expansion time [PR114044] While it seems a lot of places in various optimization passes fold bit query internal functions with INTEGER_CST arguments to INTEGER_CST when there is a lhs, when lhs is missing, all the removals of such dead stmts are guarded with -ftree-dce, so with -fno-tree-dce those unfolded ifn calls remain in the IL until expansion. If they have large/huge BITINT_TYPE arguments, there is no BLKmode optab and so expansion ICEs, and bitint lowering doesn't touch such calls because it doesn't know they need touching, functions only containing those will not even be further processed by the pass because there are no non-small BITINT_TYPE SSA_NAMEs + the 2 exceptions (stores of BITINT_TYPE INTEGER_CSTs and conversions from BITINT_TYPE INTEGER_CSTs to floating point SSA_NAMEs) and when walking there is no special case for calls with BITINT_TYPE INTEGER_CSTs either, those are for normal calls normally handled at expansion time. So, the following patch adjust the expansion of these 6 ifns, by doing nothing if there is no lhs, and also just in case and user disabled all possible passes that would fold this handles the case of setting lhs to ifn call with INTEGER_CST argument. 2024-02-27 Jakub Jelinek PR rtl-optimization/114044 * internal-fn.def (CLRSB, CLZ, CTZ, FFS, PARITY): Use DEF_INTERNAL_INT_EXT_FN macro rather than DEF_INTERNAL_INT_FN. * internal-fn.h (expand_CLRSB, expand_CLZ, expand_CTZ, expand_FFS, expand_PARITY): Declare. * internal-fn.cc (expand_bitquery, expand_CLRSB, expand_CLZ, expand_CTZ, expand_FFS, expand_PARITY): New functions. (expand_POPCOUNT): Use expand_bitquery. 
* gcc.dg/bitint-95.c: New test.
[Bug rtl-optimization/114044] ICE: in expand_fn_using_insn, at internal-fn.cc:208 with _BitInt() and -O -fno-tree-dce
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114044 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #6 from Jakub Jelinek --- Fixed.
[Bug target/102171] vget_low_*/vget_high_* intrinsics should become BIT_FIELD_REF during gimple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171 Andrew Pinski changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org --- Comment #2 from Andrew Pinski --- I think I am going to implement this (or assign it internally to someone else to implement).
[Bug target/102171] vget_low_*/vget_high_* intrinsics should become BIT_FIELD_REF during gimple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171 --- Comment #3 from Tamar Christina --- (In reply to Andrew Pinski from comment #2) > I think I am going to implement this (or assign it internally to someone else > to implement). If you do, please also remove them from arm_neon.h and use the new intrinsics framework. We're gradually trying to get this file empty.
[Bug target/102652] Unnecessary zeroing out of local ARM NEON arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102652 --- Comment #3 from Andrew Pinski --- The zeroing part was fixed in GCC 12.
[Bug fortran/114012] overloaded unary operator called twice
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114012 --- Comment #5 from Alexandre Poux --- Thanks for the quick fix !
[Bug tree-optimization/114074] [11/12/13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r8-343
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114074 --- Comment #8 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:a0b1798042d033fd2cc2c806afbb77875dd2909b commit r14-9193-ga0b1798042d033fd2cc2c806afbb77875dd2909b Author: Richard Biener Date: Mon Feb 26 13:33:21 2024 +0100 tree-optimization/114074 - CHREC multiplication and undefined overflow When folding a multiply CHRECs are handled like {a, +, b} * c is {a*c, +, b*c} but that isn't generally correct when overflow invokes undefined behavior. The following uses unsigned arithmetic unless either a is zero or a and b have the same sign. I've used simple early outs for INTEGER_CSTs and otherwise use a range-query since we lack a tree_expr_nonpositive_p and get_range_pos_neg isn't a good fit. PR tree-optimization/114074 * tree-chrec.h (chrec_convert_rhs): Default at_stmt arg to NULL. * tree-chrec.cc (chrec_fold_multiply): Canonicalize inputs. Handle poly vs. non-poly multiplication correctly with respect to undefined behavior on overflow. * gcc.dg/torture/pr114074.c: New testcase. * gcc.dg/pr68317.c: Adjust expected location of diagnostic. * gcc.dg/vect/vect-early-break_119-pr114068.c: Do not expect loop to be vectorized.
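The overflow problem described in the commit message can be sketched in plain C. The example below is an illustration under stated assumptions, not the code in tree-chrec.cc: the function names are hypothetical, and the values a = -2^30, b = 2^30, c = 2 are chosen so that every (a + b*i) * c for i in {0, 1} is representable, while the folded step b*c = 2^31 would overflow a signed 32-bit int. Evaluating the folded form {a*c, +, b*c} in unsigned arithmetic keeps the wraparound well defined.

```c
#include <stdint.h>

/* Value of {a, +, b} * c at iteration i, evaluated directly in signed
   arithmetic. The caller must choose inputs for which this does not
   overflow. */
int32_t chrec_mul_direct(int32_t a, int32_t b, int32_t c, int32_t i)
{
    return (a + b * i) * c;
}

/* Folded form {a*c, +, b*c} evaluated in unsigned arithmetic, where
   wraparound is well defined. The result is returned as uint32_t so the
   comparison with the direct form needs no implementation-defined
   conversion back to a signed type. */
uint32_t chrec_mul_folded_unsigned(int32_t a, int32_t b, int32_t c, int32_t i)
{
    uint32_t start = (uint32_t) a * (uint32_t) c;
    uint32_t step = (uint32_t) b * (uint32_t) c;
    return start + step * (uint32_t) i;
}
```

With the values above, the direct form yields INT32_MIN at i = 0 and 0 at i = 1, and the unsigned folded form produces the same bit patterns even though its step wraps.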
[Bug target/106106] SRA scalarizes structure copies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106106 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=102652, ||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=98877 Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-27 Component|tree-optimization |target
[Bug tree-optimization/114074] [11/12/13 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r8-343
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114074 Richard Biener changed: What|Removed |Added Summary|[11/12/13/14 Regression]|[11/12/13 Regression] wrong |wrong code at -O1 and above |code at -O1 and above on |on x86_64-linux-gnu since |x86_64-linux-gnu since |r8-343 |r8-343 Known to work||14.0 --- Comment #9 from Richard Biener --- Fixed on trunk so far.
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #28 from Jakub Jelinek --- (In reply to Lukas Grätz from comment #9) > Well it is not my testcase. But I added backtracing and observed that the > printed backtrace is unchanged with your patch. The new > no_return_to_caller(): You haven't tried hard enough. Consider the testcase I've posted to the mailing list, built with -Og -g. It is artificial in that register pressure is increased artificially rather than coming from meaningful code, noipa attribute is used heavily instead of functions being too large or in different TUs, and optimize attribute used instead of the noreturn function sitting in a different library, built there with -O2, while user program say with -Og. extern void abort (void); volatile unsigned v = 0xdeadbeefU; int w; __attribute__((noipa)) void corge (char *p) { (void) p; } __attribute__((noipa)) int foo (int x) { return x; } __attribute__((noipa, noreturn, optimize (2))) void bar (void) { unsigned a = v; unsigned b = v; unsigned c = v; unsigned d = v; unsigned e = v; unsigned f = v; unsigned g = v; unsigned h = v; int i = foo (50); v = a + b + c + d + e + f + g + h; abort (); } __attribute__((noipa)) void baz (int a, int b, int c, int d, int e, int f, int g, int h) { int i = foo (51); if (w) bar (); } __attribute__((noipa)) void qux (void) { int a = foo (42); int b = foo (43); int c = foo (44); int d = foo (45); int e = foo (46); int f = foo (47); int g = foo (48); int h = foo (49); corge (__builtin_alloca (foo (52))); baz (a, b, c, d, e, f, g, h); w++; baz (a, b, c, d, e, f, g, h); baz (a, b, c, d, e, f, g, h); } int main () { qux (); } Before the r14-8470 changes the backtrace on abort was #0 0x77dbd765 in abort () from /lib64/libc.so.6 #1 0x004011ca in bar () at /tmp/1.c:30 #2 0x004011f1 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=d@entry=45, e=e@entry=46, f=f@entry=47, g=48, h=49) at /tmp/1.c:38 #3 0x004012d8 in qux () at /tmp/1.c:55 #4 0x00401319 in main () at 
/tmp/1.c:62 The gcc trunk hits the backtrace not possible problem because rbp is clobbered and needed in upper frame CFA computation: #0 0x77dbd765 in abort () from /lib64/libc.so.6 #1 0x004011b0 in bar () at /tmp/1.c:30 #2 0x004011d1 in baz (a=, b=, c=, d=d@entry=-559038737, e=e@entry=-559038737, f=f@entry=-559038737, g=48, h=49) at /tmp/1.c:38 #3 0x004012a9 in qux () at /tmp/1.c:55 Backtrace stopped: previous frame inner to this frame (corrupt stack?) And in the patched gcc (with PR114116 patch to save bp register) backtrace works but several of the values are bogus: #0 0x77dbd765 in abort () from /lib64/libc.so.6 #1 0x004011b1 in bar () at /tmp/1.c:30 #2 0x004011d2 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=d@entry=-559038737, e=e@entry=-559038737, f=f@entry=-559038737, g=48, h=49) at /tmp/1.c:38 #3 0x004012aa in qux () at /tmp/1.c:55 #4 0x004012e4 in main () at /tmp/1.c:62 So, I think we should limit this to -fno-unwind-tables or maybe -mcmodel=kernel.
[Bug libquadmath/114126] New: A not infinite result of tanq of M_PI_2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114126 Bug ID: 114126 Summary: A not infinite result of tanq of M_PI_2 Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: libquadmath Assignee: unassigned at gcc dot gnu.org Reporter: www3.spl at gmail dot com Target Milestone: --- Hi there. I think that this 'bug' was not sent before. I'm obtaining an incorrect result for tanq( M_PI_2q ): tanq( M_PI_2q ) = +2.306323558737156172766198381637374e+34 when it should be infinite.
[Bug libquadmath/114126] A not infinite result of tanq of M_PI_2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114126 --- Comment #1 from Andrew Pinski --- Can you provide a full testcase? And also specify which target you are on?
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098 --- Comment #6 from GCC Commits --- The releases/gcc-11 branch has been updated by H.J. Lu : https://gcc.gnu.org/g:26b1012c26c4b4de0b4561e74b856a7f7d259a48 commit r11-11258-g26b1012c26c4b4de0b4561e74b856a7f7d259a48 Author: H.J. Lu Date: Sun Feb 25 10:21:04 2024 -0800 x86: Properly implement AMX-TILE load/store intrinsics ldtilecfg and sttilecfg take a 512-byte memory block. With _tile_loadconfig implemented as extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _tile_loadconfig (const void *__config) { __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config))); } GCC sees: (parallel [ (asm_operands/v ("ldtilecfg %X0") ("") 0 [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars) (const_int -64 [0xffc0])) [1 MEM[(const void * *)&tile_data]+0 S8 A128])] [(asm_input:DI ("m"))] (clobber (reg:CC 17 flags))]) and the memory operand size is 1 byte. As the result, the rest of 511 bytes is ignored by GCC. Implement ldtilecfg and sttilecfg intrinsics with a pointer to XImode to honor the 512-byte memory block. gcc/ChangeLog: PR target/114098 * config/i386/amxtileintrin.h (_tile_loadconfig): Use __builtin_ia32_ldtilecfg. (_tile_storeconfig): Use __builtin_ia32_sttilecfg. * config/i386/i386-builtin.def (BDESC): Add __builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg. * config/i386/i386-expand.c (ix86_expand_builtin): Handle IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG. * config/i386/i386.md (ldtilecfg): New pattern. (sttilecfg): Likewise. gcc/testsuite/ChangeLog: PR target/114098 * gcc.target/i386/amxtile-4.c: New test. (cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
[Bug target/114098] _tile_loadconfig doesn't work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098 H.J. Lu changed: What|Removed |Added Target Milestone|--- |11.5 Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #7 from H.J. Lu --- Fixed for 11.5, 12.4, 13.3 and 14.
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #8 from Richard Biener --- Created attachment 57549 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57549&action=edit prototype fix This is very similar to PR113831. We again have two refs looking seemingly the same: _80 = _109 + 1; _79 = VIEW_CONVERT_EXPR(y)[_80]; _77 = .USUBC (0, _79, _103); and _43 = _109 + 1; _42 = VIEW_CONVERT_EXPR(y)[_43]; _39 = .USUBC (0, _42, _103); so they are structurally entered in the same way into the expression hash table. But since _80 and _43 have different ranges what get_ref_base_and_extent will compute differs - in the case of _109 <= 3 it will make the stmt walking hit the __builtin_memset and record a value number of zero for the expression. As we only after that (by bad luck) visit the other reference we successfully look up the existing value from the hashtable during the walk. In PR113831 the accesses degenerated to a single array element which allowed the fix to work (adjust the expression we put into the hash). But this shows (and I feared that ...) this doesn't work. We either have to make all ranges part of the expression (even if they make a difference in the end) or avoid using ranges altogether when computing a value for an expression during the walk, most definitely when we walk to different context (but that's hard to specify). Maybe a middle-ground would be to make the get_ref_base_and_extent computed info part of the expression. Like the attached. Lots of ??? to address though ...
[Bug libquadmath/114126] A not infinite result of tanq of M_PI_2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114126 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #2 from Jakub Jelinek --- Why do you think this is a bug? #include #include #include int main () { _Float128 f = tanf128 (M_PI_2f128); volatile _Float128 g = M_PI_2f128; g = tanf128 (g); char buf[128]; strfromf128 (buf, 128, "%.34a", f); printf ("%s\n", buf); strfromf128 (buf, 128, "%.34a", g); printf ("%s\n", buf); } also prints 0x1.1c46bd57277993a2ee60193c957b00p+114 0x1.1c46bd57277993a2ee60193c957b00p+114 M_PI_2q or M_PI_2f128 is 1.5707963267948966192313216916397513987... while pi/2 with larger precision is I think 1.5707963267948966192313216916397514420... so M_PI_2{q,f128} is rounded down, not up, so no wonder tanq/tanf128 is not inf.
[Bug libquadmath/114126] A not infinite result of tanq of M_PI_2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114126 --- Comment #3 from Jakub Jelinek --- Not to mention that if it would be rounded up (like it happens e.g. in the M_PI_f32 case), you wouldn't get inf either, nor -inf, but some large negative number.
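The rounding argument in the two comments above can be reproduced at double precision. This sketch swaps `_Float128` for plain `double` so that it builds without libquadmath; the macro `HALF_PI` and the function names are hypothetical. The nearest double to pi/2 lies slightly below the true value, so `tan` returns a large finite positive number, and the next representable double above it gives a large negative number rather than an infinity.

```c
#include <math.h>

/* pi/2 written with more digits than a double can hold. The compiler
   rounds it to the nearest double, which is slightly below pi/2. */
#define HALF_PI 1.57079632679489661923132169163975144

/* tan at the rounded-down value: large and positive, but finite. */
double tan_at_rounded_half_pi(void)
{
    return tan(HALF_PI);
}

/* tan at the next double above HALF_PI, which is already past the pole:
   large and negative, still finite. */
double tan_just_above_half_pi(void)
{
    return tan(nextafter(HALF_PI, 2.0));
}
```

Since no representable value equals pi/2 exactly, neither call can land on the pole, which matches the INVALID resolution above. Linking may require `-lm` depending on the platform.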
[Bug ada/114127] New: [14 regression] Assert_Failure in nlists.adb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114127 Bug ID: 114127 Summary: [14 regression] Assert_Failure in nlists.adb Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: ada Assignee: unassigned at gcc dot gnu.org Reporter: simon at pushface dot org CC: dkm at gcc dot gnu.org Target Milestone: --- Created attachment 57550 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57550&action=edit Reproducer This was originally found in the equivalent arm-eabi cross compiler. Sources in attached zip file. $ gcc -c framebuffer_ili9341.ads +===GNAT BUG DETECTED==+ | 14.0.1 20240218 (experimental) (x86_64-apple-darwin21) Assert_Failure nlists.adb:952| | Error detected at ili9341-device.adb:44:53 [framebuffer_ili9341.ads:59:4]| | Compiling framebuffer_ili9341.ads| | Please submit a bug report; see https://gcc.gnu.org/bugs/ . | | Use a subject line meaningful to you and us to track the bug.| | Include the entire contents of this bug box in the report. | | Include the exact command that you entered. | | Also include sources listed below. | +==+ Please include these source files with error report Note that list may not be accurate in some cases, so please double check that the problem can still be reproduced with the set of files listed. Consider also -gnatd.n switch (see debug.adb). 
framebuffer_ili9341.ads hal.ads hal-framebuffer.ads hal-bitmap.ads framebuffer_ltdc.ads stm32.ads stm32-dma2d_bitmap.ads stm32-dma2d.ads memory_mapped_bitmap.ads soft_drawing_bitmap.ads stm32-ltdc.ads stm32-device.ads stm32_svd.ads stm32_svd-sdio.ads stm32-dma.ads stm32_svd-dma.ads stm32-gpio.ads stm32_svd-gpio.ads stm32-exti.ads hal-gpio.ads stm32-adc.ads stm32_svd-adc.ads stm32-usarts.ads hal-uart.ads stm32_svd-usart.ads stm32-spi.ads stm32_svd-spi.ads hal-spi.ads stm32-spi-dma.ads stm32-dma-interrupts.ads stm32-i2s.ads hal-audio.ads stm32-i2c.ads stm32_svd-i2c.ads hal-i2c.ads stm32-i2c-dma.ads stm32-timers.ads stm32-dac.ads stm32_svd-dac.ads stm32-rtc.ads hal-real_time_clock.ads stm32-crc.ads stm32_svd-crc.ads stm32-sdmmc.ads sdmmc_svd_periph.ads hal-sdmmc.ads hal-block_drivers.ads stm32-sdmmc_interrupt.ads ili9341.ads ili9341-device.ads hal-time.ads ili9341-spi_connector.ads ili9341-device.adb ili9341-regs.ads
[Bug c++/114128] New: ice with -fstrub=internal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114128 Bug ID: 114128 Summary: ice with -fstrub=internal Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: ---
[Bug middle-end/112938] ice with -fstrub=internal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112938 --- Comment #8 from David Binderman --- (In reply to Alexandre Oliva from comment #7) > Fixed. Seems to have reappeared: $ ~/gcc/results/bin/gcc -c -fstrub=internal bug988.cc bt2_locks.cpp: In function ‘void mcs_lock::spin_while_eq(const volatile std::atomic_bool&, bool)’: bt2_locks.cpp:36:1: error: invalid address operand in ‘mem_ref’ *expected; # VUSE <.MEM_8> expected.8_3 ={v} *expected; during IPA pass: strub bt2_locks.cpp:36:1: internal compiler error: verify_gimple failed 0x11f4a92 verify_gimple_in_cfg(function*, bool, bool) /home/dcb38/gcc/working/gcc/../../trunk.20210101/gcc/tree-cfg.cc:5663 0x1065788 execute_function_todo(function*, void*) /home/dcb38/gcc/working/gcc/../../trunk.20210101/gcc/passes.cc:2088 I would be grateful if someone could confirm what I am seeing here.
[Bug middle-end/113988] during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988 --- Comment #27 from Jakub Jelinek --- Created attachment 57551 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57551&action=edit gcc14-pr113988.patch Untested fix.
[Bug ada/114127] Assert_Failure in nlists.adb on [] aggregate in generic with pragma Ada_2022
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114127 Eric Botcazou changed: What|Removed |Added Summary|[14 regression] |Assert_Failure in |Assert_Failure in |nlists.adb on [] aggregate |nlists.adb |in generic with pragma ||Ada_2022 CC||ebotcazou at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2024-02-27 Status|UNCONFIRMED |NEW --- Comment #1 from Eric Botcazou --- Compile with -gnat2022 or use pragma Ada_2022 consistently, but that's not a regression.
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #29 from Lukas Grätz --- (In reply to Jakub Jelinek from comment #28) > (In reply to Lukas Grätz from comment #9) > > Well it is not my testcase. But I added backtracing and observed that the > > printed backtrace is unchanged with your patch. The new > > no_return_to_caller(): > > You haven't tried hard enough. That might be true. > Consider the testcase I've posted to the mailing list, built with -Og -g. > The gcc trunk hits the backtrace not possible problem because rbp is > > clobbered and needed in upper frame CFA computation: Yes, when a backtrace is based on rbp, one needs -fno-omit-frame-pointer. I trusted comment #10 here, as it made sense. > And in the patched gcc (with PR114116 patch to save bp register) backtrace > works but several of the values are bogus: > #2 0x004011d2 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, > d=d@entry=-559038737, e=e@entry=-559038737, f=f@entry=-559038737, g=48, > h=49) at /tmp/1.c:38 glibc's backtrace() function and friends only report function names and addresses. This looks like the gdb bt command. I admit, I did not take a proper look into that before. I believe this could and should somehow be fixed by adding DWARF info that certain callee-saved registers (= the function parameter values) were overwritten. The corrected backtrace could look something like this: #2 0x004011d2 in baz (a=42, b=43, c=44, d=, e=, f=, g=48, h=49) at /tmp/1.c:38 Some parameters would be , and this would be fine because the code was partially compiled with -O2. It is not unusual to have parameter values in gdb's bt. > So, I think we should limit this to -fno-unwind-tables or maybe > -mcmodel=kernel. Now I am confused. The optimization is limited to -fexceptions. And the documentation of -funwind-tables says "Similar to -fexceptions, except". So shouldn't -funwind-tables behave similarly to -fexceptions? I don't see anything kernel-specific here.
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #9 from Richard Biener --- Which of course would regress something like int a[16]; int foo (int i) { if (i > 7) return a[i]; else return a[i]; } where we'd no longer hoist as we no longer would value-number the refs the same (that extends to any ref using SSA names with eventually differing ranges). Reverting r9-398 likely isn't the best answer either, it would of course also regress the two valid replacements with zero (the prototype patch preserves those). There's the PR100923 fix (r12-1295-g7a56d3d3e99cc7) which targeted a similar (but even more odd) case with "contextual" PTA info (though there's really no such thing). But it didn't really fix the contextual thing but added re-valueization which in case of vn_reference_lookup_pieces works on value-numbered refs where failure mode is keeping the value-number. The prototype (after fixing it a bit) passes bootstrap but regresses quite some number of testcases (maybe due to ???s present). 
FAIL: g++.dg/ipa/devirt-20.C -std=gnu++98 scan-tree-dump-not release_ssa "abor t" FAIL: g++.dg/ipa/devirt-20.C -std=gnu++14 scan-tree-dump-not release_ssa "abor t" FAIL: g++.dg/ipa/devirt-20.C -std=gnu++17 scan-tree-dump-not release_ssa "abor t" FAIL: g++.dg/ipa/devirt-20.C -std=gnu++20 scan-tree-dump-not release_ssa "abort" FAIL: g++.dg/opt/pr110879.C -std=gnu++14 scan-tree-dump-not optimized "=s*S*res_(?!S*_M_end_of_storage;)" FAIL: g++.dg/opt/pr110879.C -std=gnu++17 scan-tree-dump-not optimized "=s*S*res_(?!S*_M_end_of_storage;)" FAIL: g++.dg/opt/pr110879.C -std=gnu++20 scan-tree-dump-not optimized "=s*S*res_(?!S*_M_end_of_storage;)" FAIL: g++.dg/pr99966.C -std=gnu++17 scan-tree-dump-not vrp1 "throw" FAIL: g++.dg/pr99966.C -std=gnu++20 scan-tree-dump-not vrp1 "throw" FAIL: g++.dg/vect/pr112961.cc -std=c++98 scan-tree-dump vect "LOOP VECTORIZED" FAIL: g++.dg/vect/pr112961.cc -std=c++14 scan-tree-dump vect "LOOP VECTORIZED" FAIL: g++.dg/vect/pr112961.cc -std=c++17 scan-tree-dump vect "LOOP VECTORIZED" FAIL: g++.dg/vect/pr112961.cc -std=c++20 scan-tree-dump vect "LOOP VECTORIZED" FAIL: g++.dg/vect/pr89653.cc -std=c++98 scan-tree-dump vect "vectorized 1 loops" FAIL: g++.dg/vect/pr89653.cc -std=c++14 scan-tree-dump vect "vectorized 1 loops" FAIL: g++.dg/vect/pr89653.cc -std=c++17 scan-tree-dump vect "vectorized 1 loops" FAIL: g++.dg/vect/pr89653.cc -std=c++20 scan-tree-dump vect "vectorized 1 loops" FAIL: g++.dg/vect/simd-10.cc -std=c++98 scan-tree-dump-times vect "vectorized [1-3] loops" 2 FAIL: g++.dg/vect/simd-10.cc -std=c++14 scan-tree-dump-times vect "vectorized [1-3] loops" 2 FAIL: g++.dg/vect/simd-10.cc -std=c++17 scan-tree-dump-times vect "vectorized [1-3] loops" 2 FAIL: g++.dg/vect/simd-10.cc -std=c++20 scan-tree-dump-times vect "vectorized [1-3] loops" 2 FAIL: gcc.dg/ira-loop-pressure.c scan-rtl-dump loop2_invariant "Decided to move invariant" FAIL: gcc.dg/pr41783.c scan-tree-dump pre "pretmp[^n]* = a_global_var;" FAIL: gcc.dg/pr78138.c (test for warnings, 
line 23) FAIL: gcc.dg/tree-ssa/ifc-pr69489-1.c scan-tree-dump-times ifcvt "Applying if-conversion" 1 FAIL: gcc.dg/tree-ssa/ifc-pr69489-1.c scan-tree-dump-times ifcvt "Invalid sum of outgoing probabilities 200.0" 1 FAIL: gcc.dg/tree-ssa/ifc-pr69489-1.c scan-tree-dump-times ifcvt "Invalid sum of incoming counts" 1 FAIL: gcc.dg/tree-ssa/ifc-pr69489-2.c scan-tree-dump-times ifcvt "Applying if-conversion" 1 FAIL: gcc.dg/tree-ssa/ifc-pr69489-2.c scan-tree-dump-times ifcvt "Invalid sum of outgoing probabilities 200.0" 1 FAIL: gcc.dg/tree-ssa/ifc-pr69489-2.c scan-tree-dump-times ifcvt "Invalid sum of incoming counts" 1 FAIL: gcc.dg/tree-ssa/loadpre1.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre10.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre11.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre12.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre13.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre14.c scan-tree-dump-times pre "Eliminated: 2" 1 FAIL: gcc.dg/tree-ssa/loadpre16.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre2.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre21.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre23.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre24.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre25.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre3.c scan-tree-dump-times pre "Eliminated: 2" 1 FAIL: gcc.dg/tree-ssa/loadpre4.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre6.c scan-tree-dump-times pre "Eliminated: 1" 1 FAIL: gcc.dg/tree-ssa/loadpre6.c scan-tree-dump-times pre "Insertions: 1" 1 FAIL: gcc.dg/tree-ssa/pr21417.c sc
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #30 from Jakub Jelinek --- (In reply to Lukas Grätz from comment #29) > Yes, when a backtrace is based on rbp, one needs -fno-omit-frame-pointer. I > trusted comment #10 here, as it made sense. See PR114116. > glibc's backtrace() function and friends only reports function names and > addresses. This looks like the gdb bt command. I admit, I did not take a > proper look into that before. Yes, it is gdb bt. And it is what people heavily rely on for debugging, if something fails an assertion or aborts etc., they want to figure out why. > I belief this could and should be somehow be fixed by adding DWARF info that > certain callee-saved registers (= the function parameter values) were > overwritten. The corrected backtrace could look something like this: That can be arranged by emitting those .cfi_undefined directives... > #2 0x004011d2 in baz (a=42, b=43, c=44, d=<optimized out>, > e=<optimized out>, f=<optimized out>, g=48, h=49) at /tmp/1.c:38 ... but really will not help users to debug/fix their code. > > So, I think we should limit this to -fno-unwind-tables or maybe > > -mcmodel=kernel. > Now I am confused. The optimization is limited to -fexceptions. And the > documentation of -funwind-tables says "Similar to -fexceptions, except". So > shouldn't -funwind-tables behave similar to -fexceptions? I don't see > anything kernel-specific here. Given that even with -fno-asynchronous-unwind-tables (or -fno-unwind-tables) gcc emits the unwind info, just not into .eh_frame but .debug_frame, we shouldn't disable it just when not emitting .eh_frame, but should just disable it always. There is a reason why it has been rejected years ago. If anything, guard it with some non-default -m* option and explain the consequences to users if they use it. Still, the guarding IMHO should be done on top of the PR114116 change, because having random crashes from backtrace or gdb bt even when user asked for it is a bad idea.
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #10 from Jakub Jelinek --- Could we for lookups if range isn't a subset of the found range pretend there was not a match, try to see through definitions again and only if it yields an equivalent result value range it the same? Perhaps even remember the range used in it and in case we find non-subset lookup having the same result union the remembered range?
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #11 from Jakub Jelinek --- Shall I try to construct a non-bitint testcase for this?
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #3 from Stefan Schulze Frielinghaus --- This seems to be a bug in the three way comparison introduced with C++20. The bug happens while deciding whether key v2 already exists in the map or not. template constexpr auto lexicographical_compare_three_way(_InputIter1 __first1, _InputIter1 __last1, _InputIter2 __first2, _InputIter2 __last2, _Comp __comp) -> decltype(__comp(*__first1, *__first2)) { // concept requirements __glibcxx_function_requires(_InputIteratorConcept<_InputIter1>) __glibcxx_function_requires(_InputIteratorConcept<_InputIter2>) __glibcxx_requires_valid_range(__first1, __last1); __glibcxx_requires_valid_range(__first2, __last2); using _Cat = decltype(__comp(*__first1, *__first2)); static_assert(same_as, _Cat>); if (!std::__is_constant_evaluated()) if constexpr (same_as<_Comp, __detail::_Synth3way> || same_as<_Comp, compare_three_way>) if constexpr (__is_byte_iter<_InputIter1>) if constexpr (__is_byte_iter<_InputIter2>) { const auto [__len, __lencmp] = _GLIBCXX_STD_A:: __min_cmp(__last1 - __first1, __last2 - __first2); if (__len) { const auto __c = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0; if (__c != 0) return __c; } return __lencmp; } __len equals 1 since both vectors have length 1. However, memcmp should be called with the number of bytes and not the number of elements of the vector. That means memcmp is called with two pointers to MEMs of unsigned shorts 1 and 2 where the high-bytes equal 0 and therefore memcmp returns with 0 on big-endian targets. Ultimately __lencmp is returned which itself equals std::strong_ordering::equal rendering v2 replacing v1. 
Fixed by diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h index d534e02871f..6ebece315f7 100644 --- a/libstdc++-v3/include/bits/stl_algobase.h +++ b/libstdc++-v3/include/bits/stl_algobase.h @@ -1867,8 +1867,10 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO __min_cmp(__last1 - __first1, __last2 - __first2); if (__len) { + const auto __len_bytes = __len * sizeof (*first1); const auto __c - = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0; + = __builtin_memcmp(&*__first1, &*__first2, __len_bytes) + <=> 0; if (__c != 0) return __c; } Can you give the patch a try?
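The effect described above (a length in elements passed where memcmp expects a length in bytes) can be reproduced on any host with a small sketch. The names `kV1`, `kV2` and `compare_prefix` are hypothetical and not part of libstdc++; the two 16-bit values 1 and 2 are written out by hand as big-endian bytes so the result does not depend on the machine running it.

```cpp
#include <cstddef>
#include <cstring>

// Two one-element "vectors" of 16-bit values, spelled as big-endian bytes:
// value 1 -> {0x00, 0x01}, value 2 -> {0x00, 0x02}.
static const unsigned char kV1[] = {0x00, 0x01};
static const unsigned char kV2[] = {0x00, 0x02};

// Hypothetical helper: compare `count` elements of `elem_size` bytes each.
// With use_byte_length == false the element count is passed straight to
// memcmp, which mirrors the behaviour discussed in this report.
int compare_prefix(const unsigned char* a, const unsigned char* b,
                   std::size_t count, std::size_t elem_size,
                   bool use_byte_length)
{
  const std::size_t n = use_byte_length ? count * elem_size : count;
  return std::memcmp(a, b, n);
}
```

Calling `compare_prefix(kV1, kV2, 1, 2, false)` only looks at the two high bytes, which are both zero, so the values compare equal; with `true` the low bytes are compared as well and the first value orders before the second.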
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #31 from Lukas Grätz --- (In reply to Jakub Jelinek from comment #30) > (In reply to Lukas Grätz from comment #29) > > Yes, when a backtrace is based on rbp, one needs -fno-omit-frame-pointer. I > > trusted comment #10 here, as it made sense. > > See PR114116. > > > glibc's backtrace() function and friends only reports function names and > > addresses. This looks like the gdb bt command. I admit, I did not take a > > proper look into that before. > > Yes, it is gdb bt. And it is what people heavily rely on for debugging, if > something fails an assertion or aborts etc., they want to figure out why. > True. > > I belief this could and should be somehow be fixed by adding DWARF info that > > certain callee-saved registers (= the function parameter values) were > > overwritten. The corrected backtrace could look something like this: > > That can be arranged by emitting those .cfi_undefined directives... > > > #2 0x004011d2 in baz (a=42, b=43, c=44, d=<optimized out>, > > e=<optimized out>, f=<optimized out>, g=48, h=49) at /tmp/1.c:38 > > ... but really will not help users to debug/fix their code. Even when I compile a simple program with gcc -O2 -g: #include <stdlib.h> int main(int argc, char** argv) { abort(); } I still get an "argc=<optimized out>": (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x77dcd859 in __GI_abort () at abort.c:79 #2 0x00401046 in main (argc=<optimized out>, argv=<optimized out>) at simple.c:4 Yes, for better debugging, it would be nice if optimised code would just not be optimised... But this goes against optimization. > > > So, I think we should limit this to -fno-unwind-tables or maybe > > > -mcmodel=kernel. > > Now I am confused. The optimization is limited to -fexceptions. And the > > documentation of -funwind-tables says "Similar to -fexceptions, except". So > > shouldn't -funwind-tables behave similar to -fexceptions? I don't see > > anything kernel-specific here. 
> > Given that even with -fno-asynchronous-unwind-tables (or -fno-unwind-tables) > gcc emits > the unwind info, just not into .eh_frame but .debug_frame, we shouldn't > disable it > just when not emitting .eh_frame, but should just disable it always. > There is a reason why it has been rejected years ago. > If anything, guard it with some non-default -m* option and explain the > consequences to users if they use it. Still, the guarding IMHO should be > done on top of the PR114116 > change, because having random crashes from backtrace or gdb bt even when > user asked for it is a bad idea. Yes, it is a bad idea to have crashes from backtrace or gdb. But when this is only about <optimized out>, I don't see the point of disabling it always.
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #32 from Jakub Jelinek --- (In reply to Lukas Grätz from comment #31) > Even when I compile a simple program with gcc -O2 -g: > > #include <stdlib.h> > int main(int argc, char** argv) { > abort(); > } > > > I still get an "argc=<optimized out>": Sure, debugging info in optimized code is best effort. > Yes, for a better debugging, it would be nice if optimised code would just > not be optimised... But this goes against optimization. The significant difference between other optimizations and this one is that normal optimizations affect the debuggability of the optimized function. This one affects the debuggability of all callers as well, even if they are compiled in a way that should make them more debuggable. Normally, if debugging optimized code doesn't work out, one can simply rebuild that code with -O0 or -Og to make it more debuggable. Here one would also need to rebuild all the shared libraries it uses.
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 Stefan Schulze Frielinghaus changed: What|Removed |Added CC||jwakely at redhat dot com --- Comment #4 from Stefan Schulze Frielinghaus --- On second thought, maybe something like const auto __len_bytes = __len * std::min (sizeof (*__first1), sizeof (*__first2)); would be more appropriate, since AFAICT the types _InputIter1 and _InputIter2 are not related to each other w.r.t. the size of the type they point to. Maybe Jonathan can shed some light on this?
[Bug libquadmath/114126] A not infinite result of tanq of M_PI_2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114126 --- Comment #4 from Sergio Peña --- (In reply to Jakub Jelinek from comment #2) > Why do you think this is a bug? > #include > #include > #include > > int > main () > { > _Float128 f = tanf128 (M_PI_2f128); > volatile _Float128 g = M_PI_2f128; > g = tanf128 (g); > char buf[128]; > strfromf128 (buf, 128, "%.34a", f); > printf ("%s\n", buf); > strfromf128 (buf, 128, "%.34a", g); > printf ("%s\n", buf); > } > also prints > 0x1.1c46bd57277993a2ee60193c957b00p+114 > 0x1.1c46bd57277993a2ee60193c957b00p+114 > > M_PI_2q or M_PI_2f128 is > 1.5707963267948966192313216916397513987... > while pi/2 with larger precision is I think > 1.5707963267948966192313216916397514420... > so M_PI_2{q,f128} is rounded down, not up, > so no wonder tanq/tanf128 is not inf. Ok. It is possible I was wrong.
[Bug c++/114129] New: Inaccurate error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114129 Bug ID: 114129 Summary: Inaccurate error message Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: Theodore.Papadopoulo at inria dot fr Target Milestone: --- Given the code below struct A { virtual void f() { } }; struct B: public A { void f() override() { } }; The g++ compiler gives the following error: -> g++ -O3 test.cpp test.cpp:6:5: error: ‘f’ declared as function returning a function 6 | void f() override() { } | ^~~~ Technically, it should be 'override' declared as function returning a function. or even maybe that override is a reserved name and cannot be used as a function name... Yet this is much better than clang: -> clang++ -O3 test.cpp test.cpp:6:22: error: expected ';' at end of declaration list 6 | void f() override() { } | ^ | ; 1 error generated.
[Bug libquadmath/114126] A not infinite result of tanq of M_PI_2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114126 --- Comment #5 from Sergio Peña --- (In reply to Jakub Jelinek from comment #2) > Why do you think this is a bug? > #include > #include > #include > > int > main () > { > _Float128 f = tanf128 (M_PI_2f128); > volatile _Float128 g = M_PI_2f128; > g = tanf128 (g); > char buf[128]; > strfromf128 (buf, 128, "%.34a", f); > printf ("%s\n", buf); > strfromf128 (buf, 128, "%.34a", g); > printf ("%s\n", buf); > } > also prints > 0x1.1c46bd57277993a2ee60193c957b00p+114 > 0x1.1c46bd57277993a2ee60193c957b00p+114 > > M_PI_2q or M_PI_2f128 is > 1.5707963267948966192313216916397513987... > while pi/2 with larger precision is I think > 1.5707963267948966192313216916397514420... > so M_PI_2{q,f128} is rounded down, not up, > so no wonder tanq/tanf128 is not inf. Ok. It is possible I was wrong. I have found this question: https://stackoverflow.com/questions/54287492/why-didnt-i-get-tanpi-2-infinty-in-c
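The same rounding argument can be checked in plain double precision, which avoids any dependency on libquadmath. This is only an analogue of the _Float128 case discussed above, and `kHalfPi` and `tan_at_rounded_half_pi` are names made up for the sketch. The nearest double to pi/2 lies slightly below the mathematical value, so the tangent is a very large but finite number.

```cpp
#include <cmath>

// Nearest double to pi/2. Like M_PI_2q in the report, it is rounded down,
// so it sits just below the pole of the tangent function.
constexpr double kHalfPi = 1.5707963267948966;

// Tangent evaluated at the rounded constant: large, positive and finite.
double tan_at_rounded_half_pi()
{
  return std::tan(kHalfPi);
}
```

The result is roughly the reciprocal of the rounding error in the constant, which is why it is huge rather than infinite.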
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #5 from Jonathan Wakely --- But it's guarded by: if constexpr (__is_byte_iter<_InputIter1>) if constexpr (__is_byte_iter<_InputIter2>) This condition is only supposed to be true when sizeof(*__first1) == 1 and sizeof(*__first2) == 1 We can only use memcmp if we're comparing single bytes as unsigned values (and if the iterators are pointers to contiguous memory, not e.g. segmented iterators like std::deque's, or not even random access iterators, like std::list's). For std::vector we should not use this code at all.
[Bug analyzer/111881] [14 Regression] analyzer: ICE in ensure_closed, at analyzer/constraint-manager.cc:130 with -Ofast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111881 --- Comment #2 from GCC Commits --- The master branch has been updated by David Malcolm : https://gcc.gnu.org/g:43ad6ce60108acc822efcd394b75e270c1996cb5 commit r14-9195-g43ad6ce60108acc822efcd394b75e270c1996cb5 Author: David Malcolm Date: Tue Feb 27 08:36:58 2024 -0500 analyzer: fix ICE on floating-point bounds [PR111881] gcc/analyzer/ChangeLog: PR analyzer/111881 * constraint-manager.cc (bound::ensure_closed): Assert that m_constant has integral type. (range::add_bound): Bail out on floating point constants. gcc/testsuite/ChangeLog: PR analyzer/111881 * c-c++-common/analyzer/conditionals-pr111881.c: New test. Signed-off-by: David Malcolm
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #33 from Lukas Grätz --- (In reply to Jakub Jelinek from comment #32) > (In reply to Lukas Grätz from comment #31) > > Even when I compile a simple program with gcc -O2 -g: > > > > #include <stdlib.h> > > int main(int argc, char** argv) { > > abort(); > > } > > > > > > I still get an "argc=<optimized out>": > > Sure, debugging info in optimized code is best effort. > > > Yes, for a better debugging, it would be nice if optimised code would just > > not be optimised... But this goes against optimization. > > The significant difference between other optimizations and this one is > that normal optimizations affect the debuggability of the optimized function. > This one affects the debuggability of all callers as well, even if they are > compiled in a way that should make them more debuggable. > Normally, if debugging optimized code doesn't work out, one can simply > rebuild that code with -O0 or -Og to make it more debuggable. > Here one would also need to rebuild all the shared libraries it uses. When the debugger is inside the debuggable -O0 or -Og compiled function, we would see all parameters and current variable values. However, in the bt example, we are in another function. So the parameters are only available at best effort. I just noticed that for my simple.c example above, I get "argc=<optimized out>" even with -Og. However, when the breakpoint is somewhere else, (gdb) break main (gdb) run (gdb) bt I get the correct "argc=1". The same applies to your example with "break baz". It is just not guaranteed that gdb is able to reconstruct function parameters when we are in some other function.
[Bug target/114130] New: RISC-V: `__atomic_compare_exchange` does not use sign-extended value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114130 Bug ID: 114130 Summary: RISC-V: `__atomic_compare_exchange` does not use sign-extended value Product: gcc Version: 13.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: x at maxxsoft dot net Target Milestone: ---
GCC 13.2 does not generate a sign-extension for the value that is compared with the result of the `lr.w` instruction in `__atomic_compare_exchange`: https://godbolt.org/z/nafKhPa1Y
Code:
```
void foo(uint32_t *p) {
  uintptr_t x = *(uintptr_t *)p;
  uint32_t e = !p ? 0 : (uintptr_t)p >> 1;
  uint32_t d = (uintptr_t)x;
  __atomic_compare_exchange(p, &e, &d, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
```
Assembly generated by `gcc -O3`:
```
foo:
        ld      a4,0(a0)
        srli    a5,a0,1
        1: lr.w a3,0(a0); bne a3,a5,1f; sc.w a2,a4,0(a0); bnez a2,1b; 1:
        ret
```
Here `a5` should be sign-extended, since the RISC-V ISA manual says `lr.w` returns a sign-extended value in RV64. But `gcc -O3 -fno-delete-null-pointer-checks` generates correct code:
```
foo:
        ld      a4,0(a0)
        li      a5,0
        beq     a0,zero,.L2
        srli    a5,a0,1
        sext.w  a5,a5
.L2:
        1: lr.w a3,0(a0); bne a3,a5,1f; sc.w a2,a4,0(a0); bnez a2,1b; 1:
        ret
```
The output of `gcc -O3 -fno-tree-ter` is slightly different, but also sign-extended. `clang -O3` always generates correct code:
```
foo:                                    # @foo
        lw      a1, 0(a0)
        srli    a2, a0, 1
        sext.w  a2, a2
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        lr.w    a3, (a0)
        bne     a3, a2, .LBB0_3
        sc.w    a4, a1, (a0)
        bnez    a4, .LBB0_1
.LBB0_3:
        ret
```
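Why the missing `sext.w` matters can be modelled in portable C++. The sketch below is not RISC-V code; `load_word_sign_extended` and `registers_equal` are hypothetical helpers that imitate, under that assumption, what `lr.w` and a 64-bit `bne` do with a 32-bit word whose top bit is set.

```cpp
#include <cstdint>

// Models the value lr.w leaves in a 64-bit register on RV64: the 32-bit
// word sign-extended to 64 bits. The unsigned-to-signed cast is modular
// on the compilers relevant here (and by definition since C++20).
std::int64_t load_word_sign_extended(std::uint32_t word)
{
  return static_cast<std::int64_t>(static_cast<std::int32_t>(word));
}

// Models the full-width register comparison performed by bne.
bool registers_equal(std::int64_t lhs, std::int64_t rhs)
{
  return lhs == rhs;
}
```

For the word `0x80000000`, comparing the sign-extended load against a zero-extended copy of the same word reports a mismatch, so a compare-and-exchange written that way could never succeed for such values; comparing against a sign-extended copy matches as expected.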
[Bug analyzer/111881] [14 Regression] analyzer: ICE in ensure_closed, at analyzer/constraint-manager.cc:130 with -Ofast
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111881 David Malcolm changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from David Malcolm --- Should be fixed by above patch.
[Bug c++/114129] Inaccurate error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114129 Jonathan Wakely changed: What|Removed |Added Keywords||diagnostic --- Comment #1 from Jonathan Wakely --- (In reply to Theodore.Papadopoulo from comment #0) > Technically, it should be 'override' declared as function returning a > function. No, GCC is correct here, according to the grammar ... or as much as you can reason about how the grammar applies to code that doesn't conform to the grammar. The function declarator is `f() override` and the return type is `void()`. You get exactly the same error without the override: o.cc:6:5: error: ‘f’ declared as function returning a function 6 | void f()() { } | ^~~~ > or even maybe that override is a reserved name and cannot be used as a > function name... That would definitely be wrong though. It's not reserved, it's "an identifier with special meaning", and `void override() { }` is perfectly valid as a function declaration.
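A short sketch of the point made in this comment, with type and function names chosen for illustration only: `override` acts as a virt-specifier only in its position after a member function declarator, and is an ordinary identifier everywhere else, so a free function may be called `override`.

```cpp
struct A {
  virtual ~A() = default;
  virtual int f() { return 1; }
};

struct B : public A {
  // Here `override` is the virt-specifier following the declarator `f()`.
  int f() override { return 2; }
};

// `override` is not reserved; in this position it is a plain function name.
int override() { return 3; }
```

Both declarations are accepted together, which is why a diagnostic claiming that `override` cannot be used as a function name would be incorrect.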
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #12 from Richard Biener --- (In reply to Jakub Jelinek from comment #11) > Shall I try to construct a non-bitint testcase for this? That would be nice, more coverage is always good.
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #34 from Jakub Jelinek --- Best effort are the whatever@entry values, which are used if an argument is no longer used across the function call and isn't stored in any call-saved register or stack slot. There can also be automatic variables which are live across the call (note, even if you have a noreturn function lower in the call stack, its caller e.g. could just call the noreturn function conditionally or could be from a different translation unit in which the callee is not declared noreturn, and any caller up in the call stack then won't have noreturn calls). If you have int x = fn1 (...); fn2 (...); // This function conditionally calls a noreturn function fn3 (x); then typically x will be in a callee-saved register (unless we run out of them). It isn't best effort in there; the debug info just says that, say, x lives in the %ebx register, it doesn't say it might be in that register. Now, when you go up from the noreturn function to this frame, gdb won't be able to restore the register (if .cfi_undefined is emitted), or right now can just have completely bogus values.
[Bug libgcc/114131] New: std::isinf(std::float128_t) generates superfluous nan-checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114131 Bug ID: 114131 Summary: std::isinf(std::float128_t) generates superfluous nan-checks Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: libgcc Assignee: unassigned at gcc dot gnu.org Reporter: g.peterh...@t-online.de Target Milestone: --- please see https://godbolt.org/z/djc9q1vcv test1(default): includes nan-checks (__unordtf2) test2: no nan-checks, but calls __eqtf2 test3: only checks for inf (via bit_cast); no additional function calls + branchfree. Of course, this only works if (unsigned) __int128 is available. thx Gero
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #13 from Richard Biener --- (In reply to Jakub Jelinek from comment #10) > Could we for lookups if range isn't a subset of the found range pretend > there was not a match, try to see through definitions again and only if it > yields an equivalent result value range it the same? Perhaps even remember > the range used in it and in case we find non-subset lookup having the same > result union the remembered range? So pretend we record the first match with using a range that improves the result in the hashtable. Then, when looking up the second ref we hit the hashtable entry, see it has an incompatible range so we can't use the recorded value. We can then easily only ignore the entry (the prototype patch does this). As we can't easily tell whether we used any (or even which) range without doing multiple lookups for each ref and comparing the result "re-doing" things wouldn't work. But for determining two refs are equivalent it might be enough to avoid recording any kind of range for when the value was "varying". The value of such hashtable entry would be usable even by lookups with narrower range (but also not yielding any better "constant" value). I'm trying to improve things this way.
[Bug libgcc/114131] std::isinf(std::float128_t) generates superfluous nan-checks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114131 Jonathan Wakely changed: What|Removed |Added Last reconfirmed||2024-02-27 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Keywords||missed-optimization
[Bug target/114004] GCC emits a superfluous instruction for simple test case on ppc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114004 Surya Kumari Jangala changed: What|Removed |Added Status|NEW |ASSIGNED
[Bug target/114132] New: [avr] Code sets up a frame pointer without need
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132 Bug ID: 114132 Summary: [avr] Code sets up a frame pointer without need Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- $ avr-gcc -S -Os -mmcu=attiny40 of void funcab_c (long x, char c) { } sets up a frame-pointer without need. Arguments x and c occupy all of the argument registers R25..R20, so that no arg registers are left. Then there is this implementation of TARGET_FRAME_POINTER_REQUIRED in avr.cc: static bool avr_frame_pointer_required_p (void) { return (cfun->calls_alloca || cfun->calls_setjmp || cfun->has_nonlocal_label || crtl->args.info.nregs == 0 || get_frame_size () > 0); } Problem is that crtl->args.info.nregs == 0 does not discriminate between need for arg pointer and no need for arg pointer (but all arg regs are used up, like in the example).
[Bug target/114132] [avr] Code sets up a frame pointer without need
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132 Georg-Johann Lay changed: What|Removed |Added Target Milestone|--- |14.0 Priority|P3 |P4 Target||avr
[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103 Jonathan Wakely changed: What|Removed |Added Keywords||patch Last reconfirmed||2024-02-27 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #7 from Jonathan Wakely --- Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646619.html
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #35 from Jakub Jelinek --- If I hand edit the gcc trunk + PR114116 patch assembly, add to bar + .cfi_undefined 3 + .cfi_undefined 12 + .cfi_undefined 13 + .cfi_undefined 14 + .cfi_undefined 15 then bt in gdb shows #2 0x004011d2 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=, e=, f=, g=48, h=49) at /tmp/1.c:38 and everything in qux live across the call is as well, (gdb) p $r12 $10 = etc. while without that (gdb) p a $1 = (gdb) p b $2 = (gdb) p c $3 = (gdb) p d $4 = -559038737 (gdb) p e $5 = -559038737 (gdb) p f $6 = -559038737 (gdb) p g $7 = -559038737 (gdb) p h $8 = -559038737 (gdb) p $r12 $9 = 3735928559
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #6 from Stefan Schulze Frielinghaus --- Guard __is_byte_iter checks for contiguous bytes which I guess is fine for std::vector and then checks for __is_memcmp_ordered which is fine for big-endian targets in conjunction with unsigned integers. From cpp_type_traits.h we have: // Whether memcmp can be used to determine ordering for a type // e.g. in std::lexicographical_compare or three-way comparisons. // True for unsigned integer-like types where comparing each byte in turn // as an unsigned char yields the right result. This is true for all // unsigned integers on big endian targets, but only unsigned narrow // character types (and std::byte) on little endian targets. template::__value #else __is_byte<_Tp>::__value #endif Thus using memcmp here is fine, however, I'm still a bit unsure whether we really have to take the minimum of *__first1 and *__first2 since I haven't found any size-relation between those types.
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #7 from Jonathan Wakely --- Ohhh, I forgot I did that, sorry! Yeah, the memcmp code wasn't updated to match the different behaviour of __is_byte_iter for BE. We can't use memcmp if the sizes are different. We don't want to use the min, we want to guard that code with the sizes being the same, then we can just use len*sizeof(*first1) because we know it's the same as sizeof(*first2).
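The guard described here can be sketched outside libstdc++ as follows. This is a simplified illustration and not the library's implementation: `memcmp_ordered_with`, `kBigEndian` and `compare_common_prefix` are hypothetical names, endianness is detected with the GCC/Clang predefined macros, and only raw pointers are handled.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Assumption: GCC/Clang predefined macros describe the target byte order.
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
constexpr bool kBigEndian = true;
#else
constexpr bool kBigEndian = false;
#endif

// memcmp gives the right ordering only for unsigned integers of the SAME
// size that are either single bytes or stored big-endian.
template <typename T, typename U>
constexpr bool memcmp_ordered_with =
    std::is_unsigned_v<T> && std::is_unsigned_v<U>
    && !std::is_same_v<T, bool> && !std::is_same_v<U, bool>
    && sizeof(T) == sizeof(U)
    && (sizeof(T) == 1 || kBigEndian);

// Compare the first `len` elements; negative, zero or positive like memcmp.
template <typename T, typename U>
int compare_common_prefix(const T* a, const U* b, std::size_t len)
{
  if constexpr (memcmp_ordered_with<T, U>) {
    // Sizes are known to be equal, so len * sizeof(T) is the byte length.
    return std::memcmp(a, b, len * sizeof(T));
  } else {
    for (std::size_t i = 0; i < len; ++i) {
      if (a[i] < b[i]) return -1;
      if (b[i] < a[i]) return 1;
    }
    return 0;
  }
}
```

With the size check folded into the trait, mixed-width ranges such as `unsigned char` against `unsigned short` always take the element-wise loop, and the memcmp path can multiply by `sizeof(T)` without considering a minimum of the two sizes.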
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #8 from Jonathan Wakely --- --- a/libstdc++-v3/include/bits/stl_algobase.h +++ b/libstdc++-v3/include/bits/stl_algobase.h @@ -1824,8 +1824,9 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO } #if __cpp_lib_three_way_comparison - // Iter points to a contiguous range of unsigned narrow character type - // or std::byte, suitable for comparison by memcmp. + // Iter points to a contiguous range of unsigned narrow character type, + // or std::byte, or big-endian unsigned integers, suitable for comparison + // by memcmp. template concept __is_byte_iter = contiguous_iterator<_Iter> && __is_memcmp_ordered>::__value; @@ -1879,14 +1880,16 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO if constexpr (same_as<_Comp, __detail::_Synth3way> || same_as<_Comp, compare_three_way>) if constexpr (__is_byte_iter<_InputIter1>) - if constexpr (__is_byte_iter<_InputIter2>) + if constexpr (__is_byte_iter<_InputIter2> + && sizeof(*__first1) == sizeof(*__first2)) { const auto [__len, __lencmp] = _GLIBCXX_STD_A:: __min_cmp(__last1 - __first1, __last2 - __first2); if (__len) { + const auto __blen = __len * sizeof(*__first1); const auto __c - = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0; + = __builtin_memcmp(&*__first1, &*__first2, __blen) <=> 0; if (__c != 0) return __c; }
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #9 from Stefan Schulze Frielinghaus --- (In reply to Jonathan Wakely from comment #7) > We can't use memcmp if the sizes are different. We don't want to use the > min, we want to guard that code with the sizes being the same, then we can > just use len*sizeof(*first1) because we know it's the same as > sizeof(*first2). Hehe I was about to add another comment. I just confused myself with taking the minimum but we rather need to ensure that we are walking over same sized integers. LGTM
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #10 from Jonathan Wakely --- Oh I already defined a __is_memcmp_ordered_with trait, which does the same-size check. I think that's what should be used here.
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960

--- Comment #11 from Jonathan Wakely ---
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -1824,11 +1824,14 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
     }
 
 #if __cpp_lib_three_way_comparison
-  // Iter points to a contiguous range of unsigned narrow character type
-  // or std::byte, suitable for comparison by memcmp.
-  template<typename _Iter>
-    concept __is_byte_iter = contiguous_iterator<_Iter>
-      && __is_memcmp_ordered<iter_value_t<_Iter>>::__value;
+  // Both iterators refer to contiguous ranges of unsigned narrow characters,
+  // or std::byte, or big-endian unsigned integers, suitable for comparison
+  // using memcmp.
+  template<typename _Iter1, typename _Iter2>
+    concept __memcmp_ordered_with
+      = (__is_memcmp_ordered_with<iter_value_t<_Iter1>,
+                                  iter_value_t<_Iter2>>::__value)
+        && contiguous_iterator<_Iter1> && contiguous_iterator<_Iter2>;
 
   // Return a struct with two members, initialized to the smaller of x and y
   // (or x if they compare equal) and the result of the comparison x <=> y.
@@ -1878,20 +1881,20 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
       if (!std::__is_constant_evaluated())
        if constexpr (same_as<_Comp, __detail::_Synth3way>
                      || same_as<_Comp, compare_three_way>)
-         if constexpr (__is_byte_iter<_InputIter1>)
-           if constexpr (__is_byte_iter<_InputIter2>)
-             {
-               const auto [__len, __lencmp] = _GLIBCXX_STD_A::
-                 __min_cmp(__last1 - __first1, __last2 - __first2);
-               if (__len)
-                 {
-                   const auto __c
-                     = __builtin_memcmp(&*__first1, &*__first2, __len) <=> 0;
-                   if (__c != 0)
-                     return __c;
-                 }
-               return __lencmp;
-             }
+         if constexpr (__memcmp_ordered_with<_InputIter1, _InputIter2>)
+           {
+             const auto [__len, __lencmp] = _GLIBCXX_STD_A::
+               __min_cmp(__last1 - __first1, __last2 - __first2);
+             if (__len)
+               {
+                 const auto __blen = __len * sizeof(*__first1);
+                 const auto __c
+                   = __builtin_memcmp(&*__first1, &*__first2, __blen) <=> 0;
+                 if (__c != 0)
+                   return __c;
+               }
+             return __lencmp;
+           }
 
        while (__first1 != __last1)
          {
[Bug modula2/114133] New: problem passing a string pointer to a C function on solaris 32 bit and 64 bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114133

Bug ID: 114133
Summary: problem passing a string pointer to a C function on solaris 32 bit and 64 bit
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: modula2
Assignee: gaius at gcc dot gnu.org
Reporter: gaius at gcc dot gnu.org
Target Milestone: ---

This is a follow-on from:

Bug 114026 - incorrect location during for loop type check

which occurred when I added a new testcase and was reported as failing:

Two of the new tests FAIL on 32 and 64-bit Solaris/SPARC:

+FAIL: gm2/extensions/run/pass/callingc10.mod execution, -O
+FAIL: gm2/extensions/run/pass/callingc10.mod execution, -O -g
+FAIL: gm2/extensions/run/pass/callingc10.mod execution, -O3 -fomit-frame-pointer
+FAIL: gm2/extensions/run/pass/callingc10.mod execution, -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/extensions/run/pass/callingc10.mod execution, -Os
+FAIL: gm2/extensions/run/pass/callingc10.mod execution, -g
+FAIL: gm2/extensions/run/pass/callingc11.mod execution, -O
+FAIL: gm2/extensions/run/pass/callingc11.mod execution, -O -g
+FAIL: gm2/extensions/run/pass/callingc11.mod execution, -O3 -fomit-frame-pointer
+FAIL: gm2/extensions/run/pass/callingc11.mod execution, -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/extensions/run/pass/callingc11.mod execution, -Os
+FAIL: gm2/extensions/run/pass/callingc11.mod execution, -g

The failure mode is the same for both:

parameter is hello and length 0
executed /var/gcc/regression/master/11.4-gcc/build/gcc/testsuite/gm2/callingc10.x0 with result fail
[Bug modula2/114026] incorrect location during for loop type check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114026 Gaius Mulley changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED|RESOLVED --- Comment #6 from Gaius Mulley --- Ah, I'll open a new PR, as this is now a bug relating to passing a string pointer to a C function. For clarity and future searching, the follow-on PR is: Bug 114133 - problem passing a string pointer to a C function on solaris 32 bit and 64 bit. Marking the original FOR loop issue as resolved.
[Bug other/89863] [meta-bug] Issues in gcc that other static analyzers (cppcheck, clang-static-analyzer, PVS-studio) find that gcc misses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863 Bug 89863 depends on bug 106907, which changed state. Bug 106907 Summary: gcc/config/rs6000/rs6000.cc:23155: strange expression ? https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106907 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug target/106907] gcc/config/rs6000/rs6000.cc:23155: strange expression ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106907 Jeevitha changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #11 from Jeevitha --- Fixed
[Bug target/110320] ELFv2 pc-rel ABI extension allows using r2 as a volatile register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110320 Jeevitha changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #3 from Jeevitha --- Fixed
[Bug target/110411] ICE on simple memcpy test case when allowing generation of vector pair load/store insns
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110411 Jeevitha changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #8 from Jeevitha --- Fixed
[Bug target/100799] Stackoverflow in optimized code on PPC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100799 --- Comment #31 from Peter Bergner --- (In reply to Jakub Jelinek from comment #30) > Either tree parmdef = ssa_default_def (cfun, parm) is NULL, or has_zero_uses > (parmdef). > Not sure if has_zero_uses will work properly after some bbs are converted > from GIMPLE to RTL, but maybe it will, I think the expansion generally > doesn't gsi_remove statements it expands nor calls update_stmt on them. One > could always also just compute in generic code at the start of expansion the > number of unused DECL_HIDDEN_STRING_LENGTH PARM_DECLs at the end of the > argument list, save that as a flag in struct function or where and let the > backends use it from there. Ok, I think that gives us some idea what needs to be done. I'll look for someone in the team to have a look at implementing this workaround. Thanks.
[Bug modula2/114133] problem passing a string pointer to a C function on solaris 32 bit and 64 bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114133 Gaius Mulley changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Last reconfirmed||2024-02-27 --- Comment #1 from Gaius Mulley --- The gimple IR looks correct, given the input code: MODULE callingc10 ; FROM cvararg IMPORT funcptr ; FROM SYSTEM IMPORT ADR ; BEGIN IF funcptr (1, "hello", 5) = 1 THEN END ; IF funcptr (1, "hello" + " ", 6) = 1 THEN END ; IF funcptr (1, "hello" + " " + "world", 11) = 1 THEN END END callingc10. $ gm2 -g callingc10.mod -c -fdump-ipa-all $ cat callingc10.mod.095i.comdats ... PROC _M2_callingc10_init (INTEGER argc, PROC * argv, PROC * envp) { INTEGER D.670; INTEGER D.669; INTEGER D.668; PROC * _T34.0_1; INTEGER _2; INTEGER _T35.1_3; PROC * _T36.2_4; INTEGER _5; INTEGER _T37.3_6; PROC * _T38.4_7; INTEGER _8; INTEGER _12; INTEGER _16; INTEGER _20; : _T34 = "hello"; _T34.0_1 = _T34; _12 = funcptr (1, _T34.0_1, 5); _2 = _12; _T35 = _2; _T35.1_3 = _T35; ...
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #14 from Jakub Jelinek --- Tried __attribute__((noipa)) unsigned long foo (unsigned long x) { unsigned long y[128], z = 0, w = 0; y[127] = x; __builtin_memset (&y, 0, 127 * sizeof (long)); for (unsigned long i = 0; i < 128; i += 2) { unsigned long a = y[i], b, c, d; b = __builtin_subcl (0, a, z, &c); z = c; if (i >= 64) { if (i == 64) w = c != 0; else w = (c != 0) | w; } d = i + 1; a = y[d]; b = __builtin_subcl (0, a, z, &c); z = c; if (d > 64) w = (c != 0) | w; } return w; } but that doesn't reproduce it unfortunately.
[Bug modula2/114133] problem passing a string pointer to a C function on solaris 32 bit and 64 bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114133 --- Comment #2 from Gaius Mulley --- Created attachment 57552 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57552&action=edit Query proposed fix Does this patch fix the problem?
[Bug c++/114134] New: Extra mov instructions for simple function compared with GCC13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114134

Bug ID: 114134
Summary: Extra mov instructions for simple function compared with GCC13
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: pilarlatiesa at gmail dot com
Target Milestone: ---

In the example below, the function `Key` has some extra (useless?) mov instructions that are not generated with GCC 13.

$ cat borrar.cpp
#include <cmath>

struct TVec3D { double x, y, z; };
struct TKey { int i, j, k; };

extern double const BinSize;

inline int Index(double const x)
{ return static_cast<int>(std::floor(static_cast<float>(x / BinSize + 1.0) - 1.0f)); };

TKey Key(TVec3D const &r)
{ return {Index(r.x), Index(r.y), Index(r.z)}; }

$ ./gcc-13/bin/g++ -O3 -march=skylake -fno-trapping-math -S borrar.cpp -o-
        .file "borrar.cpp"
        .text
        .p2align 4
        .globl _Z3KeyRK6TVec3D
        .type _Z3KeyRK6TVec3D, @function
_Z3KeyRK6TVec3D:
.LFB993:
        .cfi_startproc
        vmovsd BinSize(%rip), %xmm1
        vmovupd (%rdi), %xmm3
        vmovddup .LC1(%rip), %xmm2
        vmovddup %xmm1, %xmm0
        vdivpd %xmm0, %xmm3, %xmm0
        vaddpd %xmm2, %xmm0, %xmm0
        vmovq .LC2(%rip), %xmm2
        vcvtpd2psx %xmm0, %xmm0
        vaddps %xmm2, %xmm0, %xmm0
        vroundps $9, %xmm0, %xmm0
        vcvttps2dq %xmm0, %xmm4
        vmovsd 16(%rdi), %xmm0
        vmovq %xmm4, %rax
        vdivsd %xmm1, %xmm0, %xmm0
        vaddsd .LC1(%rip), %xmm0, %xmm0
        vcvtsd2ss %xmm0, %xmm0, %xmm0
        vsubss .LC3(%rip), %xmm0, %xmm0
        vroundss $9, %xmm0, %xmm0, %xmm0
        vcvttss2sil %xmm0, %edx
        movl %edx, %edx
        ret
        .cfi_endproc
.LFE993:
        .size _Z3KeyRK6TVec3D, .-_Z3KeyRK6TVec3D
        .section .rodata.cst8,"aM",@progbits,8
        .align 8
.LC1:
        .long 0
        .long 1072693248
        .align 8
.LC2:
        .long -1082130432
        .long -1082130432
        .section .rodata.cst4,"aM",@progbits,4
        .align 4
.LC3:
        .long 1065353216
        .ident "GCC: (GNU) 13.1.0"
        .section .note.GNU-stack,"",@progbits

$ ./gcc-14/bin/g++ -O3 -march=skylake -fno-trapping-math -S borrar.cpp -o-
        .file "borrar.cpp"
        .text
        .p2align 4
        .globl _Z3KeyRK6TVec3D
        .type _Z3KeyRK6TVec3D, @function
_Z3KeyRK6TVec3D:
.LFB1032:
        .cfi_startproc
        vmovsd BinSize(%rip), %xmm2
        vmovupd (%rdi), %xmm0
        vmovddup %xmm2, %xmm1
        vdivpd %xmm1, %xmm0, %xmm0
        vmovddup .LC1(%rip), %xmm1
        vaddpd %xmm1, %xmm0, %xmm0
        vmovq .LC2(%rip), %xmm1
        vcvtpd2psx %xmm0, %xmm0
        vaddps %xmm1, %xmm0, %xmm0
        vroundps $9, %xmm0, %xmm0
        vcvttps2dq %xmm0, %xmm0
        vmovq %xmm0, %rdx
        vmovsd 16(%rdi), %xmm0
        vdivsd %xmm2, %xmm0, %xmm0
        vaddsd .LC1(%rip), %xmm0, %xmm0
        vcvtsd2ss %xmm0, %xmm0, %xmm0
        vsubss .LC3(%rip), %xmm0, %xmm0
        vroundss $9, %xmm0, %xmm0, %xmm0
        vcvttss2sil %xmm0, %eax
        movl %eax, %eax
        movq %rax, %rdi
        movq %rdx, %rax
        movq %rdi, %rdx
        ret
        .cfi_endproc
.LFE1032:
        .size _Z3KeyRK6TVec3D, .-_Z3KeyRK6TVec3D
        .section .rodata.cst8,"aM",@progbits,8
        .align 8
.LC1:
        .long 0
        .long 1072693248
        .align 8
.LC2:
        .long -1082130432
        .long -1082130432
        .section .rodata.cst4,"aM",@progbits,4
        .align 4
.LC3:
        .long 1065353216
        .ident "GCC: (GNU) 14.0.0 20240112 (experimental)"
        .section .note.GNU-stack,"",@progbits
[Bug modula2/114133] problem passing a string pointer to a C function on solaris 32 bit and 64 bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114133 --- Comment #3 from Gaius Mulley --- At a guess the problem was the ZTyped constant (1 and 5). Now the gimple IR shows these constants as integers: $ cat callingc10.mod.095i.comdats PROC _M2_callingc10_init (INTEGER argc, PROC * argv, PROC * envp) { INTEGER D.676; INTEGER D.675; INTEGER D.674; INTEGER _T35.0_1; PROC * _T36.1_2; INTEGER _T34.2_3; INTEGER _4; INTEGER _T37.3_5; INTEGER _T39.4_6; PROC * _T40.5_7; INTEGER _T38.6_8; INTEGER _9; INTEGER _T41.7_10; INTEGER _T43.8_11; PROC * _T44.9_12; INTEGER _T42.10_13; INTEGER _14; INTEGER _20; INTEGER _26; INTEGER _32; : _T34 = 1; _T35 = 5; _T36 = "hello"; _T35.0_1 = _T35; _T36.1_2 = _T36; _T34.2_3 = _T34; _20 = funcptr (_T34.2_3, _T36.1_2, _T35.0_1); _4 = _20; _T37 = _4; _T37.3_5 = _T37;
[Bug c++/101443] [9/10 Regression] internal compiler error: in wide_int_to_tree_1, at tree.c:1519
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101443 Rafi Wiener changed: What|Removed |Added Status|RESOLVED|CLOSED --- Comment #14 from Rafi Wiener --- thanks
[Bug c++/114013] [14 Regression] Specializations of var templates no longer emitted since r14-8987
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114013 --- Comment #3 from Enrico Seiler --- For -O0 and -O1, this also does not link: template int value; template <> inline int value<1>; void bar(int) { bar(value<1>); } https://godbolt.org/z/Wxv7PE8ob
[Bug tree-optimization/114041] wrong code with _BitInt() and -O -fgraphite-identity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 --- Comment #5 from Jakub Jelinek --- Reduced testcase: unsigned a[24], b[24]; __attribute__((noipa)) unsigned foo (unsigned _BitInt(4) x) { for (int i = 0; i < 24; ++i) a[i] = i; unsigned e = __builtin_stdc_bit_ceil (x); for (int i = 0; i < 24; ++i) b[i] = i; return e; } int main () { if (foo (0) != 1) __builtin_abort (); } I have to confirm Andrew's comment, before the graphite dump there was if (x_14(D) > 1) goto ; [59.00%] else goto ; [41.00%] [local count: 17609365]: goto ; [100.00%] [local count: 25340307]: _2 = x_14(D) + 15; _3 = (unsigned int) _2; _4 = __builtin_clz (_3); _5 = 31 - _4; _6 = 2 << _5; iftmp.1_15 = (unsigned int) _6; [local count: 42949672]: # iftmp.1_10 = PHI This isn't part of any kind of loop, it is in between 2 different loops. Graphite hoists some of the statements to bb 2 where it is unconditional: _32 = x_14(D) + 15; _33 = (unsigned int) _32; the rest of it remains after the first loop, but is now unconditional: [count: 0]: _47 = 1; _31 = __builtin_clz (_33); _34 = 31 - _31; _35 = 2 << _34; iftmp.1_36 = (unsigned int) _35; _48 = iftmp.1_36; iftmp.1_37 = _48; In the testcase x is 0, so __builtin_stdc_bit_ceil returns 1, but when we take the > 1 path, it is 2 << (31 - 24) instead. The above feels like what ifcvt would do, if that _47 in there stands for one of the phi arguments and _48 for the other. Except __builtin_clz invokes UB when run on 0 (which is one of the reasons why it was guarded) and there is no conditional merging at the end.
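To make the role of the guard concrete, here is a minimal sketch of a bit-ceil computation with the same shape as the GIMPLE quoted above: a branch for small inputs, and `2 << (31 - clz(x - 1))` otherwise. It is not the expansion GCC uses for `__builtin_stdc_bit_ceil`; it assumes a 32-bit `unsigned int` and the GCC/Clang builtin `__builtin_clz`, and the names `bit_ceil_guarded` and `demo_bit_ceil` are hypothetical.

```cpp
#include <cassert>

// Smallest power of two >= x, for x <= 2^31 (assumes 32-bit unsigned int).
// The early return is what keeps __builtin_clz away from a zero argument:
// for x == 1 the expression x - 1 is 0, and __builtin_clz(0) is undefined.
inline unsigned bit_ceil_guarded(unsigned x) {
    if (x <= 1)
        return 1;  // covers x == 0 and x == 1 without evaluating clz
    return 2u << (31 - __builtin_clz(x - 1));
}

inline void demo_bit_ceil() {
    assert(bit_ceil_guarded(0) == 1);
    assert(bit_ceil_guarded(1) == 1);
    assert(bit_ceil_guarded(2) == 2);
    assert(bit_ceil_guarded(3) == 4);
    assert(bit_ceil_guarded(5) == 8);
    assert(bit_ceil_guarded(8) == 8);
    assert(bit_ceil_guarded(9) == 16);
}
```

If the shift expression is evaluated unconditionally and its result used regardless of the comparison, as in the graphite output described above, the small-input case no longer yields 1.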
[Bug c++/114135] New: Diagnostic missing useful information for ranges code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114135 Bug ID: 114135 Summary: Diagnostic missing useful information for ranges code Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: barry.revzin at gmail dot com Target Milestone: --- This is an example using Ranges: #include #include using namespace std; int main() { auto rng = views::iota(0, 3); const auto [a, b] = * ranges::min_element(views::cartesian_product(rng, rng)); return 0; } This is an ill-formed program, the error given by gcc trunk is: :7:25: error: no match for 'operator*' (operand type is 'std::ranges::borrowed_iterator_t, std::ranges::iota_view > >') 7 | const auto [a, b] = * ranges::min_element(views::cartesian_product(rng, rng)); | ^ This is all correct. However, it would be more helpful in this case for the reader to also note that the type std::ranges::borrowed_iterator_t is actually the type std::ranges::dangling. Seeing "dangling" in the error message makes it a lot easier to understand what the issue here actually is.
[Bug tree-optimization/114041] wrong code with _BitInt() and -O -fgraphite-identity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 --- Comment #6 from Jakub Jelinek --- unsigned a[24], b[24]; __attribute__((noipa)) unsigned foo (unsigned char x) { for (int i = 0; i < 24; ++i) a[i] = i; unsigned e = __builtin_stdc_bit_ceil (x); for (int i = 0; i < 24; ++i) b[i] = i; return e; } int main () { if (foo (0) != 1) __builtin_abort (); } works right, but s/unsigned char/unsigned _BitInt(8)/ does not, so it must be something in graphite that handles INTEGER_TYPE and not all integral types.
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #36 from Lukas Grätz --- (In reply to Jakub Jelinek from comment #35) > If I hand edit the gcc trunk + PR114116 patch assembly, add to bar > + .cfi_undefined 3 > + .cfi_undefined 12 > + .cfi_undefined 13 > + .cfi_undefined 14 > + .cfi_undefined 15 > then bt in gdb shows > #2 0x004011d2 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, > d=, > e=, f= reading variable: value has been optimized out>, g=48, h=49) at /tmp/1.c:38 I can confirm that. What bothers me, is the wording "d=" and not just "d=". (gdb) run Starting program: bar-artificial-mod Program received signal SIGABRT, Aborted. (gdb) bt #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x77dcd859 in __GI_abort () at abort.c:79 #2 0x004011b1 in bar () at bar-artificial.c:30 #3 0x004011d2 in baz (a=a@entry=42, b=b@entry=43, c=c@entry=44, d=, e=, f=, g=48, h=49) at bar-artificial.c:38 #4 0x004012aa in qux () at bar-artificial.c:55 #5 0x004012e4 in main () at bar-artificial.c:62 (gdb) p a No symbol "a" in current context. (gdb) p b No symbol "b" in current context. > and everything in qux live across the call is as well, > (gdb) p $r12 > $10 = > etc. while without that > (gdb) p a > $1 = > (gdb) p b > $2 = > (gdb) p c > $3 = > (gdb) p d > $4 = -559038737 > (gdb) p e > $5 = -559038737 > (gdb) p f > $6 = -559038737 > (gdb) p g > $7 = -559038737 > (gdb) p h > $8 = -559038737 > (gdb) p $r12 > $9 = 3735928559 Where did you set the breakpoint? When I set it somewhere in qux (after a,b,c,... were initialized), I get conclusive results: (gdb) break bar-artificial.c:52 Breakpoint 1 at 0x40124a: file bar-artificial.c, line 52. (gdb) run Breakpoint 1, qux () at bar-artificial.c:52 52corge (__builtin_alloca (foo (52))); (gdb) p a $1 = 42 (gdb) p b $2 = 43 (gdb) p c $3 = 44 (gdb) p d $4 = 45 (gdb) p e $5 = 46 (gdb) p f $6 = 47 (gdb) p g $7 = 48 (gdb) p h $8 = 49 (gdb) p $r12 $9 = 46
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #37 from Jakub Jelinek --- Nowhere, just run and when it stops due to abort, just up several times until reaching the appropriate frame.
[Bug middle-end/114136] New: wrong code for c23 fully anonymous arg lists on arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114136

Bug ID: 114136
Summary: wrong code for c23 fully anonymous arg lists on arm
Product: gcc
Version: 13.1.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: rearnsha at gcc dot gnu.org
Target Milestone: ---
Target: arm

On arm, a fully anonymous c23-style function is called incorrectly. All arguments are passed on the stack, while the receiving function expects r0-r3 to be used for the initial arguments. For example,

void f (...);

void g()
{
  f (1, 2, 3, 4);
}

With gcc this compiles to:

g:
        push {lr}
        movs r0, #1
        movs r1, #2
        sub sp, sp, #20
        movs r2, #3
        movs r3, #4
        stm sp, {r0, r1, r2, r3} // Arguments pushed to stack (wrong)
        bl f
        add sp, sp, #20
        ldr pc, [sp], #4

whereas the correct code (e.g. as produced by clang) is something like

g:
        mov r0, #1
        mov r1, #2
        mov r2, #3
        mov r3, #4
        b f

Compile with, e.g., arm-none-eabi-gcc -O2 -std=c23
[Bug middle-end/114136] wrong code for c23 fully anonymous arg lists on arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114136 Richard Earnshaw changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-02-27
[Bug middle-end/114136] wrong code for c23 fully anonymous arg lists on arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114136 Andrew Pinski changed: What|Removed |Added Keywords||testsuite-fail --- Comment #1 from Andrew Pinski --- The following testcases fail because of this: FAIL: gcc.dg/c23-stdarg-4.c execution test FAIL: gcc.dg/torture/c23-stdarg-split-1a.c -O0 execution test FAIL: gcc.dg/torture/c23-stdarg-split-1a.c -O1 execution test FAIL: gcc.dg/torture/c23-stdarg-split-1a.c -O2 execution test FAIL: gcc.dg/torture/c23-stdarg-split-1a.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test FAIL: gcc.dg/torture/c23-stdarg-split-1a.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test FAIL: gcc.dg/torture/c23-stdarg-split-1a.c -O3 -g execution test FAIL: gcc.dg/torture/c23-stdarg-split-1a.c -Os execution test
[Bug modula2/113768] gm2/extensions/run/pass/vararg2.mod FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113768 Gaius Mulley changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2024-02-27 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Gaius Mulley --- Thanks, this is a duplicate of Bug 114133 (or vice versa).
[Bug target/113871] psrlq is not used for PERM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113871 --- Comment #8 from GCC Commits --- The master branch has been updated by Uros Bizjak : https://gcc.gnu.org/g:15d1dae0d4d1be88d28ad7578a60fd3e36de36d8 commit r14-9198-g15d1dae0d4d1be88d28ad7578a60fd3e36de36d8 Author: Uros Bizjak Date: Tue Feb 27 18:41:24 2024 +0100 i386: psrlq is not used for PERM [PR113871] Also handle V2BF mode. PR target/113871 gcc/ChangeLog: * config/i386/mmx.md (V248FI): Add V2BF mode. (V24FI_32): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr113871-5a.c: New test. * gcc.target/i386/pr113871-5b.c: New test.