[Bug c++/109884] New: __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 Bug ID: 109884 Summary: __builtin_Xq returns _Float128 instead of __float128 Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: g.peterh...@t-online.de Target Milestone: --- #include #include #include #include #include template inline std::string nameof() { return boost::core::demangle(typeid(Type).name()); } int main() { std::cout << nameof() << std::endl; std::cout << nameof() << std::endl; std::cout << nameof() << std::endl; } compiled with 13 returns the incorrect type _Float128 _Float128 _Float128 with 12 or older gives the correct type __float128 __float128 __float128 regards Gero
[Bug target/109874] [SH] GCC 13's -Os code is 50% bigger than GCC 4's
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109874 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-05-17 Keywords||missed-optimization Status|UNCONFIRMED |NEW Target||sh* --- Comment #2 from Richard Biener --- It looks like the target cannot do arbitrary constant shifts so it benefits from shifting incrementally. Even if that is exposed early enough for CSE the optimal sequences for shifting by 10, 11, 12 and 13 could prevent CSE here. I'm not sure if there are other targets affected but this is a "global" optimization problem which for example also affects optimal power expansion. Generally strength-reduction techniques apply to improve these kinds of things, possibly in a machine dependent pass. The regression was likely introduced when merging the shifts at the GIMPLE level without considering the uses of the intermediate values (after the transform the values can be computed in parallel since the dependency chains are shortened)
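A minimal sketch of the trade-off described in comment #2, assuming a target where a shift by one is cheap and an arbitrary constant shift is not. The names `Shifted`, `shift_independent` and `shift_incremental` are hypothetical and do not come from the bug report or from GCC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the four shifted values mentioned in the comment.
struct Shifted {
    std::uint32_t s10, s11, s12, s13;
};

// Each value is computed directly from x. The four results are independent,
// which shortens dependency chains but needs four arbitrary constant shifts.
inline Shifted shift_independent(std::uint32_t x) {
    return {x << 10, x << 11, x << 12, x << 13};
}

// Each value reuses the previous one: one shift by 10, then three shifts by 1.
// This is the "shifting incrementally" form that can be smaller on a target
// without arbitrary constant shifts.
inline Shifted shift_incremental(std::uint32_t x) {
    Shifted r;
    r.s10 = x << 10;
    r.s11 = r.s10 << 1;
    r.s12 = r.s11 << 1;
    r.s13 = r.s12 << 1;
    return r;
}
```

Both functions return the same values; which form is cheaper depends on the target, which is why the comment suggests a machine dependent pass may be the right place to decide.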
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 --- Comment #1 from Andrew Pinski --- I think this is expected behavior now.
[Bug c++/109877] Support for clang-style attributes is needed to parse Darwin SDK headers properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109877 Richard Biener changed: What|Removed |Added Target||*-darwin Version|unknown |14.0 --- Comment #7 from Richard Biener --- can we fixinclude the headers?
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 --- Comment #2 from Andrew Pinski --- _Float128 is the standard specified way of defining these types in c++23 IIRC.
[Bug libgcc/109712] Segmentation fault in linear_search_fdes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109712 --- Comment #9 from Richard Biener --- Yes, using a newer libgcc_s.so.1 or libstdc++.so.6 should work fine - again, unless we end up with mixing static/dynamic parts of the unwinder of different versions.
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 --- Comment #3 from g.peterh...@t-online.de --- But these are different types (even if they are mathematically/behaviorally equivalent) std::is_same_v --> false
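The point about distinct types can be checked with a small probe. This is a sketch, assuming a compiler that defines `__SIZEOF_FLOAT128__` when `__float128` exists and the C++23 macro `__STDCPP_FLOAT128_T__` when `_Float128` is available as an extended floating-point type. The function name `float128_relation` is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Returns -1 if either type is unavailable in this configuration,
// 0 if __float128 and _Float128 are distinct types, 1 if they are the same.
constexpr int float128_relation() {
#if defined(__SIZEOF_FLOAT128__) && defined(__STDCPP_FLOAT128_T__)
    return std::is_same_v<__float128, _Float128> ? 1 : 0;
#else
    return -1;
#endif
}
```

On a configuration where both macros are defined, the report above implies the result is 0; elsewhere the probe reports -1 instead of failing to compile.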
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 --- Comment #4 from Andrew Pinski --- OK. And? Q specifies the _Float128 type now. I don't think we had any abi guarantees on the builtins nor on the q literals.
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 --- Comment #5 from Jonathan Wakely --- This changed with r13-2887 when adding _Float128 to C++
[Bug c/102989] Implement C2x's n2763 (_BitInt)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989 --- Comment #41 from Richard Biener --- (In reply to Jakub Jelinek from comment #40) > Created attachment 55094 [details] > gcc14-bitint-wip.patch > > So, on IRC we've agreed with Richi that given the limits we have in the > compiler > (what wide_int/widest_int can represent at most without making the types have > optional arbitrary length indirect payload, what INTEGER_CST can handle > (right > now 255 64-bit limbs) and TYPE_PRECISION limitation (max 65535 precision)) > it would be best to first try to implement _BitInt support with small > BITINT_MAXWIDTH (in particular, what fits into wide_int, which is e.g. on > x86_64 > 575 bits) and only when the implementation of that is complete, attempt to > lift > up some of the limits (start with the wide_int/widest_int one, INTEGER_CST > could > be handled by bumping the 2 counters from 8-bit to 16-bit and killing the > cache, > with that we'd be at 65535 as BITINT_MAXWIDTH and whether we'd want to grow > it > further is a question). > > This patch implements some WIP, as the testcases show, it can already do > something, but doesn't have any of the argument/return value passing code > implemented, nor middle-end needed changes (promoting as much as possible to > small INTEGER_TYPEs early for small BITINT_TYPEs and adding a lowering pass > which will turn the larger ones into loops etc.). Also, wb/uwb constants > aren't > really done yet. Another idea is to have a large BITINT_MAXWIDTH (up to what TYPE_PRECISION supports) but restrict constant folding to the cases we can represent in INTEGER_CST. For the cases where the language requires constant evaluation we'd then sorry (). I think we should be able to handle all-ones encoded and since constant initializers are restricted it should handle most practical cases already.
[Bug debug/109805] LTO affecting -fdebug-prefix-map
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109805 --- Comment #13 from rguenther at suse dot de --- On Tue, 16 May 2023, sergiodj at sergiodj dot net wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109805 > > --- Comment #12 from Sergio Durigan Junior --- > Sorry, I have been busy with other things, but I'm paying attention to the > developments here. > > I still have to test the workaround I suggested (passing -fdebug-prefix-map to > LDFLAGS) more broadly, because I think I may have found at least one scenario > where it doesn't work. Something else that's puzzling me is the fact that I > don't see this behaviour everywhere; some packages do have the expected > DW_AT_comp_dir even after being compiled with LTO enabled. Yeah, it's clearly odd and we lack testsuite coverage completely. Having small testcases that show cases that work and cases that do not would be very useful in understanding the bits and how they do (not) work together properly.
[Bug c++/109877] Support for clang-style attributes is needed to parse Darwin SDK headers properly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109877 --- Comment #8 from Iain Sandoe --- (In reply to Richard Biener from comment #7) > can we fixinclude the headers? 1. not yet (PR105719) - although let us hope we can find a way to do that for more limited cases (I've implemented the consumer code, but the generation and install side is more work). 2. In any event, (especially for 'availability') that would be a huge job (essentially re-writing a significant bunch of framework and /usr/include cases), and keeping up with frequent Xcode / SDK updates would be quite a maintenance burden***. 3. It does not help our downstream to use other projects which make use of these features (in non-SDK sources). *** other options considered: for "closed" SDKs for system versions out of vendor support, I suppose we could just have a script that sed'ed the headers into a replacement, but that is still some machinery to implement. It would be nice to have an open SDK - but that is a huge project in its own right, and likewise would need someone with time to maintain the bleeding edge version
[Bug tree-optimization/109885] New: gcc does not generate movmskps and testps instructions (clang does)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109885 Bug ID: 109885 Summary: gcc does not generate movmskps and testps instructions (clang does) Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: vincenzo.innocente at cern dot ch Target Milestone: --- in this simple code (on avx2) int sum(float const * x) { int ret = 0; for (int i=0; i<8; ++i) ret +=(0==x[i]); return ret; } int one(float const * x) { int ret = 0; for (int i=0; i<8; ++i) ret |=(0==x[i]); return ret; } int all(float const * x) { int ret = 1; for (int i=0; i<8; ++i) ret &=(0==x[i]); return ret; } clang uses movmskps and testps instructions, gcc does not see for instance https://godbolt.org/z/r11r8xoYz
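For readers unfamiliar with the instruction, here is a hand-written sketch of a movmskps-based formulation of the three loops. It is not the code clang emits: it uses two 128-bit SSE halves instead of the AVX2 form in the report so that it builds without extra flags, and it falls back to a scalar loop when SSE is not available. The names `zero_mask8`, `sum_zeros`, `any_zero` and `all_zero` are hypothetical.

```cpp
#include <cassert>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#endif

// Bit i of the result is set when x[i] == 0 (x must point to 8 floats).
inline int zero_mask8(const float* x) {
#if defined(__SSE__) || defined(_M_X64)
    const __m128 zero = _mm_setzero_ps();
    // Compare four lanes at a time; equal lanes become all-ones.
    const __m128 lo = _mm_cmpeq_ps(_mm_loadu_ps(x), zero);
    const __m128 hi = _mm_cmpeq_ps(_mm_loadu_ps(x + 4), zero);
    // movmskps collects the sign bit of each lane into an integer.
    return _mm_movemask_ps(lo) | (_mm_movemask_ps(hi) << 4);
#else
    int mask = 0;
    for (int i = 0; i < 8; ++i) mask |= (0 == x[i]) << i;
    return mask;
#endif
}

// Counterpart of sum(): number of elements equal to zero.
inline int sum_zeros(const float* x) {
    int mask = zero_mask8(x);
    int count = 0;
    for (; mask != 0; mask &= mask - 1) ++count;  // clear lowest set bit
    return count;
}

// Counterpart of one(): 1 if any element equals zero.
inline int any_zero(const float* x) { return zero_mask8(x) != 0; }

// Counterpart of all(): 1 if every element equals zero.
inline int all_zero(const float* x) { return zero_mask8(x) == 0xFF; }
```

Once the comparison result is reduced to an integer mask, each of the three reductions becomes a single scalar test on that mask.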
[Bug target/109885] gcc does not generate movmskps and testps instructions (clang does)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109885 Andrew Pinski changed: What|Removed |Added Keywords||missed-optimization Component|tree-optimization |target Severity|normal |enhancement
[Bug c++/100052] [11/12/13/14 regression] ICE in compiling g++.dg/modules/xtreme-header-3_b.C after r11-8118
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100052 --- Comment #13 from Jiu Fu Guo --- Pass on trunk, gcc-12, gcc-11 for xtreme-header-* cases: make check-gcc-c++ RUNTESTFLAGS="--target_board=unix'{-m64}' modules.exp=xtreme-header-*" === g++ Summary === # of expected passes72
[Bug c++/101853] [12/13/14 Regression] g++.dg/modules/xtreme-header-5_b.C ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101853 --- Comment #14 from Jiu Fu Guo --- Pass on trunk, gcc-12, gcc-11 for xtreme-header-* cases: make check-gcc-c++ RUNTESTFLAGS="--target_board=unix'{-m64}' modules.exp=xtreme-header-*" === g++ Summary === # of expected passes72
[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868 --- Comment #16 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:78327cf06e6b65fc9c614622c98f6a3f3bfb7784 commit r14-927-g78327cf06e6b65fc9c614622c98f6a3f3bfb7784 Author: Jakub Jelinek Date: Wed May 17 10:15:50 2023 +0200 c++: Don't try to initialize zero width bitfields in zero initialization [PR109868] My GCC 12 change to avoid removing zero-sized bitfields as they are important for ABI and are needed for layout compatibility traits apparently causes zero sized bitfields to be initialized in the IL, which at least in 13+ results in ICEs in the ranger which is upset about zero precision types. I think we could even avoid initializing other unnamed bitfields, but unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end clearing of padding bits and until we have some new flag that represents the request to clear padding bits, I think it is better to keep zeroing non-zero sized unnamed bitfields. In addition to skipping those fields, I have changed the logic how UNION_TYPEs are handled, the current code was a little bit weird in that e.g. if first non-static data member had error_mark_node type, we'd happily zero initialize the second non-static data member, etc. 2023-05-17 Jakub Jelinek PR c++/109868 * init.cc (build_zero_init_1): Don't initialize zero-width bitfields. For unions only initialize the first FIELD_DECL. * g++.dg/init/pr109868.C: New test.
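A small sketch of the kind of code the commit message is about: a class with an unnamed zero-width bit-field that gets value-initialized. This is not the test case added by the commit (g++.dg/init/pr109868.C); the type `WithZeroWidth` and the function `make_zeroed` are hypothetical.

```cpp
#include <cassert>

// The unnamed zero-width bit-field stores nothing. It only tells the compiler
// to start the next bit-field in a new allocation unit, so it matters for
// layout but has no bits to initialize.
struct WithZeroWidth {
    int a : 3;
    int : 0;
    int b : 5;
};

// Value-initialization of a class without a user-provided constructor
// zero-initializes its named members.
inline WithZeroWidth make_zeroed() {
    return WithZeroWidth();
}
```

With the fix described above, zero-initialization skips the zero-width field and still clears `a` and `b`.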
[Bug sanitizer/109882] sanitizer/common_interface_defs.h bogusly defines __has_feature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109882 --- Comment #5 from Jonathan Wakely ---
Libstdc++ itself does this:

#if __SANITIZE_THREAD__
# define _GLIBCXX_TSAN 1
#elif defined __has_feature
# if __has_feature(thread_sanitizer)
#  define _GLIBCXX_TSAN 1
# endif
#endif

The sanitizers could do something similar, although it looks like they don't actually need to. The only use of __has_feature in the public API is in asan_interface.h and that could easily be replaced. Then __has_feature can be redefined in the internal headers, which (I assume) aren't meant to be included by user code.

Something like this (untested):

diff --git a/libsanitizer/include/sanitizer/asan_interface.h b/libsanitizer/include/sanitizer/asan_interface.h
index 9bff21c117b..186269ad694 100644
--- a/libsanitizer/include/sanitizer/asan_interface.h
+++ b/libsanitizer/include/sanitizer/asan_interface.h
@@ -48,7 +48,15 @@ void __asan_poison_memory_region(void const volatile *addr, size_t size);
 void __asan_unpoison_memory_region(void const volatile *addr, size_t size);

 // Macros provided for convenience.
-#if __has_feature(address_sanitizer) || defined(__SANITIZE_ADDRESS__)
+#ifdef __has_feature
+#if __has_feature(address_sanitizer)
+#define ASAN_DEFINE_REGION_MACROS
+#endif
+#elif defined(__SANITIZE_ADDRESS__)
+#define ASAN_DEFINE_REGION_MACROS
+#endif
+
+#ifdef ASAN_DEFINE_REGION_MACROS
 /// Marks a memory region as unaddressable.
 ///
 /// \note Macro provided for convenience; defined as a no-op if ASan is not
@@ -74,6 +82,7 @@ void __asan_unpoison_memory_region(void const volatile *addr, size_t size);
 #define ASAN_UNPOISON_MEMORY_REGION(addr, size) \
   ((void)(addr), (void)(size))
 #endif
+#undef ASAN_DEFINE_REGION_MACROS

 /// Checks if an address is poisoned.
 ///
diff --git a/libsanitizer/include/sanitizer/common_interface_defs.h b/libsanitizer/include/sanitizer/common_interface_defs.h
index 2f415bd9e85..2f9c83ef74e 100644
--- a/libsanitizer/include/sanitizer/common_interface_defs.h
+++ b/libsanitizer/include/sanitizer/common_interface_defs.h
@@ -15,11 +15,6 @@
 #include
 #include

-// GCC does not understand __has_feature.
-#if !defined(__has_feature)
-#define __has_feature(x) 0
-#endif
-
 #ifdef __cplusplus
 extern "C" {
 #endif
diff --git a/libsanitizer/sanitizer_common/sanitizer_internal_defs.h b/libsanitizer/sanitizer_common/sanitizer_internal_defs.h
index 98186c429e9..7574dce7f4a 100644
--- a/libsanitizer/sanitizer_common/sanitizer_internal_defs.h
+++ b/libsanitizer/sanitizer_common/sanitizer_internal_defs.h
@@ -14,6 +14,11 @@

 #include "sanitizer_platform.h"

+// GCC does not understand __has_feature.
+#if !defined(__has_feature)
+#define __has_feature(x) 0
+#endif
+
 #ifndef SANITIZER_DEBUG
 # define SANITIZER_DEBUG 0
 #endif
[Bug sanitizer/109882] sanitizer/common_interface_defs.h bogusly defines __has_feature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109882 --- Comment #6 from Jakub Jelinek --- Looks ok to me. Now how to convince upstream to apply this? (Or we could keep it as LOCAL_PATCHES.)
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #6 from Jakub Jelinek --- Note, these builtins aren't standard builtins, but backend registered ones: grep '"__builtin_[a-z]*q["=]' gcc/config/*/* 2>/dev/null gcc/config/i386/i386-builtins.cc: def_builtin_const (0, 0, "__builtin_infq", gcc/config/i386/i386-builtins.cc: decl = add_builtin_function ("__builtin_nanq", ftype, IX86_BUILTIN_NANQ, gcc/config/i386/i386-builtins.cc: decl = add_builtin_function ("__builtin_nansq", ftype, IX86_BUILTIN_NANSQ, gcc/config/i386/i386-builtins.cc: decl = add_builtin_function ("__builtin_fabsq", ftype, IX86_BUILTIN_FABSQ, gcc/config/i386/i386-builtins.cc: decl = add_builtin_function ("__builtin_copysignq", ftype, gcc/config/ia64/ia64.cc: decl = add_builtin_function ("__builtin_infq", ftype, gcc/config/ia64/ia64.cc: decl = add_builtin_function ("__builtin_nanq", ftype, gcc/config/ia64/ia64.cc: decl = add_builtin_function ("__builtin_nansq", ftype, gcc/config/ia64/ia64.cc: decl = add_builtin_function ("__builtin_fabsq", ftype, gcc/config/ia64/ia64.cc: decl = add_builtin_function ("__builtin_copysignq", ftype, gcc/config/pa/pa.cc: decl = add_builtin_function ("__builtin_fabsq", ftype, gcc/config/pa/pa.cc: decl = add_builtin_function ("__builtin_copysignq", ftype, gcc/config/pa/pa.cc: decl = add_builtin_function ("__builtin_infq", ftype, gcc/config/rs6000/rs6000-c.cc: builtin_define ("__builtin_fabsq=__builtin_fabsf128"); gcc/config/rs6000/rs6000-c.cc: builtin_define ("__builtin_copysignq=__builtin_copysignf128"); gcc/config/rs6000/rs6000-c.cc: builtin_define ("__builtin_nanq=__builtin_nanf128"); gcc/config/rs6000/rs6000-c.cc: builtin_define ("__builtin_nansq=__builtin_nansf128"); gcc/config/rs6000/rs6000-c.cc: builtin_define ("__builtin_infq=__builtin_inff128"); and have been that way before as well. 
Given how they are defined on rs6000, at least there because it is just a macro for the f128 suffixed ones it really has to return _Float128.
[Bug target/109811] libxjl 0.7 is a lot slower in GCC 13.1 vs Clang 16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 Jan Hubicka changed: What|Removed |Added CC||hubicka at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2023-05-17 --- Comment #4 from Jan Hubicka ---
Confirmed. LTO is not necessary to reproduce the difference. I got libjxl and the test jpeg file from the Phoronix test suite and configured the clang build with:

CC=clang CXX=clang++ CFLAGS="-O3 -g-march=native -fno-exceptions" CXXFLAGS="-O3 -g -march=native -fno-exceptions" cmake -DCMAKE_C_FLAGS_RELEASE="$CFLAGS -DNDEBUG" -DCMAKE_CXX_FLAGS_RELEASE="$CXXFLAGS -DNDEBUG" -DBUILD_TESTING=OFF ..

and

CFLAGS="-O3 -g-march=native -fno-exceptions" CXXFLAGS="-O3 -g -march=native -fno-exceptions" cmake -DCMAKE_C_FLAGS_RELEASE="$CFLAGS -DNDEBUG" -DCMAKE_CXX_FLAGS_RELEASE="$CXXFLAGS -DNDEBUG" -DBUILD_TESTING=OFF ..

jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-gcc/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0
JPEG XL encoder v0.7.0 [AVX2]
No output file specified.
Encoding will be performed, but the result will be discarded.
Read 6000x4000 image, 7837694 bytes, 926.0 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288431 bytes including container (0.763 bpp).
6000 x 4000, 11.12 MP/s [11.12, 11.12], 1 reps, 16 threads.

jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-gcc/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 926.5 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288431 bytes including container (0.763 bpp).
6000 x 4000, 11.09 MP/s [11.09, 11.09], 1 reps, 16 threads.

jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-gcc/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 925.6 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288431 bytes including container (0.763 bpp).
6000 x 4000, 11.12 MP/s [11.12, 11.12], 1 reps, 16 threads.

jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-clang/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 924.6 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288430 bytes including container (0.763 bpp).
6000 x 4000, 15.17 MP/s [15.17, 15.17], 1 reps, 16 threads.

jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-clang/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 922.4 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288430 bytes including container (0.763 bpp).
6000 x 4000, 15.18 MP/s [15.18, 15.18], 1 reps, 16 threads.

So GCC does about 11 MP/s while clang does about 15 MP/s.
[Bug tree-optimization/109759] UBSAN error: shift exponent 64 is too large for 64-bit type 'long unsigned int'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109759 Martin Jambor changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #3 from Martin Jambor --- This test now passed with UBSAN instrumented compiler, so probably indeed a dup. *** This bug has been marked as a duplicate of bug 109788 ***
[Bug fortran/109788] [14 Regression] gcc/hwint.h:293:61: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int since r14-377-gc92b8be9b52b7e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109788 --- Comment #13 from Martin Jambor --- *** Bug 109759 has been marked as a duplicate of this bug. ***
[Bug other/63426] [meta-bug] Issues found with -fsanitize=undefined
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63426 Bug 63426 depends on bug 109759, which changed state. Bug 109759 Summary: UBSAN error: shift exponent 64 is too large for 64-bit type 'long unsigned int' https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109759 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE
[Bug ipa/109886] New: UBSAN error: shift exponent 64 is too large for 64-bit type when compiling gcc.c-torture/compile/pr96796.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109886 Bug ID: 109886 Summary: UBSAN error: shift exponent 64 is too large for 64-bit type when compiling gcc.c-torture/compile/pr96796.c Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: ipa Assignee: unassigned at gcc dot gnu.org Reporter: jamborm at gcc dot gnu.org CC: aldyh at gcc dot gnu.org, marxin at gcc dot gnu.org Target Milestone: --- Host: x86_64-linux-gnu Target: x86_64-linux-gnu Bootstrap with undefined behavior sanitizer and subsequent run of the testsuite (on revision ac3a5bbc629, so check for PR 109788 is included) reports a new error when compiling C torture testcase gcc/testsuite/gcc.c-torture/compile/pr96796.c: $ UBSAN_OPTIONS="halt_on_error=1 print_stacktrace=1" /home/mjambor/gcc/mine/b-obj/gcc/xgcc -B/home/mjambor/gcc/mine/b-obj/gcc/ -fdiagnostics-plain-output -O1 -w -fcommon -c -o pr96796.o /home/mjambor/gcc/mine/src/gcc/testsuite/gcc.c-torture/compile/pr96796.c /home/mjambor/gcc/mine/src/gcc/hwint.h:293:61: runtime error: shift exponent 64 is too large for 64-bit type 'long unsigned int' #0 0xbf8117 in sext_hwi(long, unsigned int) /home/mjambor/gcc/mine/src/gcc/hwint.h:293 #1 0xbf8117 in wi::hwi_with_prec::hwi_with_prec(long, unsigned int, signop) /home/mjambor/gcc/mine/src/gcc/wide-int.h:1622 #2 0xbf8117 in wi::shwi(long, unsigned int) /home/mjambor/gcc/mine/src/gcc/wide-int.h:1631 #3 0xbf8117 in wi::minus_one(unsigned int) /home/mjambor/gcc/mine/src/gcc/wide-int.h:1645 #4 0xbf8117 in irange::set_varying(tree_node*) /home/mjambor/gcc/mine/src/gcc/value-range.h:871 #5 0x2257e45 in range_cast(vrange&, tree_node*) /home/mjambor/gcc/mine/src/gcc/range-op.cc:4860 #6 0x1b119a6 in ipa_compute_jump_functions_for_edge /home/mjambor/gcc/mine/src/gcc/ipa-prop.cc:2325 #7 0x1b14f66 in ipa_compute_jump_functions_for_bb /home/mjambor/gcc/mine/src/gcc/ipa-prop.cc:2449 #8 0x1b14f66 in analysis_dom_walker::before_dom_children(basic_block_def*) 
/home/mjambor/gcc/mine/src/gcc/ipa-prop.cc:3035 #9 0x65a5ff3 in dom_walker::walk(basic_block_def*) /home/mjambor/gcc/mine/src/gcc/domwalk.cc:311 #10 0x1b0e601 in ipa_analyze_node(cgraph_node*) /home/mjambor/gcc/mine/src/gcc/ipa-prop.cc:3103 #11 0x1991487 in inline_indirect_intraprocedural_analysis /home/mjambor/gcc/mine/src/gcc/ipa-fnsummary.cc:4315 #12 0x1991487 in inline_analyze_function(cgraph_node*) /home/mjambor/gcc/mine/src/gcc/ipa-fnsummary.cc:4334 #13 0x1991afc in ipa_fn_summary_generate /home/mjambor/gcc/mine/src/gcc/ipa-fnsummary.cc:4378 #14 0x21351c1 in execute_ipa_summary_passes(ipa_opt_pass_d*) /home/mjambor/gcc/mine/src/gcc/passes.cc:2304 #15 0x10a2163 in ipa_passes /home/mjambor/gcc/mine/src/gcc/cgraphunit.cc:2235 [...]
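Why UBSAN objects to a shift exponent of 64: shifting a 64-bit value by 64 or more is undefined behavior in C++. The sketch below is not GCC's sext_hwi. It is a hypothetical helper, `sign_extend`, written under the assumption that the shift amount is derived as 64 minus the precision, so that a precision of 0 would produce exactly the exponent reported in the trace.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low `prec` bits of `src` to 64 bits.
// Hypothetical helper for illustration; prec must be in the range 1..64.
inline std::int64_t sign_extend(std::int64_t src, unsigned prec) {
    // A precision of 0 would give shift == 64 below, which is undefined.
    assert(prec >= 1 && prec <= 64);
    if (prec == 64)
        return src;  // nothing to extend, and avoids a shift by 0 bits of work
    const unsigned shift = 64 - prec;  // now guaranteed to be in 1..63
    // Shift left as unsigned to avoid signed overflow, then shift right as
    // signed so the sign bit is replicated. Before C++20 the conversion and
    // the right shift of a negative value are implementation-defined, though
    // mainstream compilers behave as expected.
    const std::uint64_t widened = static_cast<std::uint64_t>(src) << shift;
    return static_cast<std::int64_t>(widened) >> shift;
}
```

The trace above passes through wi::minus_one and irange::set_varying, which suggests that a type with an unexpected precision reached the range code; that reading is an inference from the trace, not something the report states.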
[Bug c++/109887] New: Different mangled name for template specialization for clang and gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109887 Bug ID: 109887 Summary: Different mangled name for template specialization for clang and gcc Product: gcc Version: 12.3.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: yedeng.yd at linux dot alibaba.com Target Milestone: --- (This is a duplication of https://github.com/llvm/llvm-project/issues/62765 since I don't know which one is worse) Reproducer: ``` #include namespace llvm { class StringRef { public: StringRef(const char*); }; template class Optional {}; } namespace n { struct S { template std::enable_if_t::value, llvm::Optional> get(llvm::StringRef) const { return {}; } }; template <> llvm::Optional S::get(llvm::StringRef) const; } void use() { n::S().get("hello"); } ``` For the specialization `S::get(llvm::StringRef)`, gcc will mangle it as: ``` _ZNK1n1S3getIbEENSt9enable_ifIXsrSt11is_integralIT_E5valueEN4llvm8OptionalIS4_EEE4typeENS6_9StringRefE ``` and clang will mangle it as: ``` _ZNK1n1S3getIbEENSt9enable_ifIXsr3std11is_integralIT_EE5valueEN4llvm8OptionalIS3_EEE4typeENS4_9StringRefE ``` Also the c++filt can only recognize the name mangled by gcc. And the llvm-cxxfilt can only recognize the name mangled by clang. So I am not sure if this is bug really or this is by design. But I think clang and gcc are trying to make ABI compatible.
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Last reconfirmed||2023-05-17 --- Comment #7 from Jakub Jelinek --- Created attachment 55098 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55098&action=edit gcc14-pr109884.patch Untested fix. For ia64, I think it already uses float128t_type_node, for rs6000 as I wrote it is more difficult because it doesn't have the builtins but macros and in pa case, __float128 is the same as long double.
[Bug c++/109884] __builtin_Xq returns _Float128 instead of __float128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109884 --- Comment #8 from Jakub Jelinek --- (In reply to Andrew Pinski from comment #4) > Q specifies the _Float128 type now. No, Q suffix specifies __float128 actually. F128 or f128 specify _Float128.
[Bug target/109811] libxjl 0.7 is a lot slower in GCC 13.1 vs Clang 16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 --- Comment #5 from Jan Hubicka ---
Also forgot to mention, I used a zen3 machine, so Raptor Lake is not necessary. Note that the build system appends -O2 after any CFLAGS specified, so it really is an -O2 build:

# Force build with optimizations in release mode.
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -O2")

For Clang other options are appended: -fnew-alignment=8 -fno-cxx-exceptions -fno-slp-vectorize -fno-vectorize -disable-free -disable-llvm-verifier

Perf profile mixing both GCC and clang build is:

8.36% cjxl libjxl.so.0.7.0 [.] jxl::(anonymous namespace)::FindTextLikePatches
5.74% cjxl libjxl.so.0.7.0 [.] jxl::FindBestPatchDictionary
4.51% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::EstimateEntropy
4.50% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::TransformFromPixels
4.25% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::QuantizeBlockAC
4.10% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::EstimateEntropy
3.77% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::TransformFromPixels
3.46% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::QuantizeBlockAC
3.08% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::FindBestMultiplier
3.04% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::FindBestMultiplier
2.98% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::DCT1DImpl<8ul, 8ul>::operator()
2.80% cjxl libjxl.so.0.7.0 [.] jxl::ThreadPool::RunCallState const&, jxl::RectT const&, jxl::WeightsSymmetric5 const&, jxl::ThreadPool*, jxl::Plane*)::{l
2.75% cjxl libjxl.so.0.7.0 [.] jxl::ThreadPool::RunCallState const&, jxl::RectT const&, jxl::WeightsSymmetric5 const&, jxl::ThreadPool*, jxl::Plane*)::$_
2.26% cjxl libjxl.so.0.7.0 [.] jxl::ThreadPool::RunCallState const&, float const*, jxl::ThreadPool*, jxl::Image3*)::$_0>::CallDataFunc
2.00% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::DCT1DWrapper<4ul, 4ul, jxl::N_AVX2::(anonymous namespace)::DCTFrom, jxl::N_AVX2::(anonymous namespace)::DCTTo>
1.95% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::DCT1DImpl<16ul, 8ul>::operator()
1.68% cjxl libjxl.so.0.7.0 [.] jxl::ThreadPool::RunCallState, unsigned long, unsigned long, jxl::ColorEncoding const&, unsigned long, bool, unsigned long, JxlEnd
1.68% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::DCT1DImpl<32ul, 8ul>::operator()
[Bug libgomp/109875] [OpenMP] nteams-var / OMP_NUM_TEAMS → ICV not passed to the device / default value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109875 --- Comment #1 from Tobias Burnus --- Tested it now also with true offloading. For AMD GCN, I get: host: max_teams: 2 tgt: max_teams: 3 num_teams: 120 For nvptx, I get: host: max_teams: 2 tgt: max_teams: 3 num_teams: 240 And for completeness, for host fallback: host: max_teams: 3 tgt: max_teams: 3 num_teams: 1 i.e. the ICV is handled correctly. However, the ICV is not honored for the target region. By contrast, an explicit 'num_teams(4)' is honored by GCN/nvptx/host fallback ... * * * The spec wording is: "If the *num_teams* clause is not specified on a construct then the effect is as if _upper-bound_ was specified as follows. If the value of the nteams-var ICV is greater than zero, the effect is as if upper-bound was specified to an implementation-defined value greater than zero but less than or equal to the value of the nteams-var ICV."
[Bug libgomp/109875] [OpenMP] nteams-var / OMP_NUM_TEAMS → ICV not passed to the device / default value
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109875 --- Comment #2 from Tobias Burnus ---
The host-fallback explicitly sets the number of teams to the lower_bound, if available, and otherwise to 1 - which is fine.

Regarding changing the default from 0 to the actually used number, the problem is on the device side it is only known at runtime → issue with OMP_DISPLAY_ENV. BTW, the "[host] OMP_NUM_TEAMS = '0'" should be "[all] OMP_NUM_TEAMS = '0'" for the default.

For the device side, I think we need (untested):

--- a/libgomp/config/gcn/target.c
+++ b/libgomp/config/gcn/target.c
@@ -51 +51,3 @@ GOMP_teams4 (unsigned int num_teams_lower, unsigned int
- num_teams_upper = num_workgroups;
+num_teams_upper = ((GOMP_ADDITIONAL_ICVS.nteams > 0
+ && num_workgroups > GOMP_ADDITIONAL_ICVS.nteams)
+ ? GOMP_ADDITIONAL_ICVS.nteams : num_workgroups);
diff --git a/libgomp/config/nvptx/target.c b/libgomp/config/nvptx/target.c
index f102d7d02d9..125d92a2ea9 100644
--- a/libgomp/config/nvptx/target.c
+++ b/libgomp/config/nvptx/target.c
@@ -58 +58,3 @@ GOMP_teams4 (unsigned int num_teams_lower, unsigned int
-num_teams_upper = num_blocks;
+num_teams_upper = ((GOMP_ADDITIONAL_ICVS.nteams > 0
+ && num_blocks > GOMP_ADDITIONAL_ICVS.nteams)
+ ? GOMP_ADDITIONAL_ICVS.nteams : num_blocks);
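The clamp in the untested patch can be read as a small pure function. The sketch below restates it outside libgomp; `default_num_teams_upper`, `nteams_icv` and `hw_limit` are hypothetical names standing in for the upper bound computation, GOMP_ADDITIONAL_ICVS.nteams, and num_workgroups or num_blocks respectively.

```cpp
#include <cassert>

// Upper bound on the number of teams when no num_teams clause is given.
// nteams_icv: value of the nteams-var ICV (0 or negative means "not set").
// hw_limit:   what the device would otherwise use (workgroups or blocks).
inline unsigned default_num_teams_upper(int nteams_icv, unsigned hw_limit) {
    // Only clamp when the ICV is positive and actually smaller than the
    // device limit, matching the condition in the proposed patch.
    if (nteams_icv > 0 && hw_limit > static_cast<unsigned>(nteams_icv))
        return static_cast<unsigned>(nteams_icv);
    return hw_limit;
}
```

With the values from comment #1 (ICV of 3, 120 workgroups on GCN) this yields 3, which satisfies the spec wording that the upper bound be greater than zero and at most the value of nteams-var.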
[Bug libstdc++/109883] Stack Overflow in functions with types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883 Xi Ruoyao changed: What|Removed |Added CC||xry111 at gcc dot gnu.org --- Comment #1 from Xi Ruoyao --- Cannot reproduce for me. Note that in this case GCC optimizes the entire function call away (see https://godbolt.org/z/968bPTvh9) even with -O0 so I can see no way how this will lead to a runtime error. And GCC for aarch64-darwin target (i. e. "macOS 13.3.1 on M1") is not a part of this project, so are you using another fork?
[Bug libstdc++/109883] Stack Overflow in functions with types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883 Xi Ruoyao changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-05-17 Status|UNCONFIRMED |WAITING
[Bug c++/109887] Different mangled name for template specialization for clang and gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109887 --- Comment #1 from Jonathan Wakely --- (In reply to Chuanqi Xu from comment #0) > _ZNK1n1S3getIbEENSt9enable_ifIXsrSt11is_integralIT_E5valueEN4llvm8OptionalIS4 > _EEE4typeENS6_9StringRefE > ``` > > and clang will mangle it as: > > ``` > _ZNK1n1S3getIbEENSt9enable_ifIXsr3std11is_integralIT_EE5valueEN4llvm8Optional > IS3_EEE4typeENS4_9StringRefE The difference is that GCC mangles std::is_integral as St11is_integral and Clang mangles it as 3std::is_integral. I think GCC is right. Clang uses St9enable_if for std::enable_if so I don't know why it doesn't use the St substitution for std::is_integral.
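To see the St abbreviation in isolation, a mangled type name can be passed to the demangler from the C++ ABI support library. This sketch assumes a toolchain that provides `<cxxabi.h>` and `abi::__cxa_demangle` (GCC and Clang do); the wrapper name `demangle` is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium C++ ABI name. Returns an empty string on failure.
inline std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments the function allocates the result with
    // malloc, so it must be released with free.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && out != nullptr) ? out : "";
    std::free(out);
    return result;
}
```

For example, `demangle("St11is_integralIbE")` gives `std::is_integral<bool>`: `St` stands for the `std` namespace, `11is_integral` is the length-prefixed name, and `IbE` is the template argument list containing `bool`.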
[Bug middle-end/97048] [meta-bug] bogus/missing -Wstringop-overread warnings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97048 --- Comment #3 from Tony Guilfoyle --- I jumped through enough hoops already, I think. You can take it from here if you want. All the best, Tony On 16/05/2023 18:28, redi at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97048 > > --- Comment #2 from Jonathan Wakely --- > Tony, this is just a meta-bug that has links to the real bugs. Please either > add that as a comment to an existing bug (if it's the same as one of them) or > file a new bug (and set "Blocks: 97048" so that it links back here). But since > your one seems to be about -Wstringop-overflow not -Wstringop-overread I don't > think it is actually related to this meta-bug at all. Maybe it's related to PR > 97185 instead. >
[Bug c++/109888] New: GCC 13 Fails to Compile Code with Explicit Constructor for std::array in Template Class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109888 Bug ID: 109888 Summary: GCC 13 Fails to Compile Code with Explicit Constructor for std::array in Template Class Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: vincent.lebourlot at starqube dot com Target Milestone: --- In a C++ codebase, a class String is defined with an explicit constructor that takes a variable number of arguments and constructs a std::array from them. This constructor is being called when creating a std::pair in a function call. While this code compiles successfully with GCC 12, it fails to compile with GCC 13. The error messages indicate that the std::array must be initialized with a brace-enclosed initializer, which is not what's happening when forwarding the arguments to the std::array's constructor. This issue seems to be specific to GCC 13 and does not occur in GCC 12. It's unclear whether this is due to changes in the C++ standard or in GCC's implementation of the standard. The code snippet that reproduces this issue is provided below: #include #include #include #include #include class String { public: templateString(const char(&s)[n])noexcept:value(){ if constexpr(n<=16){ std::memcpy(value.data(),s,n);} else{value=std::array{};}} template...>, int> = 0> explicit String(Args&&... args) : value(std::forward(args)...) {} private: std::arrayvalue; }; int main() { auto check=[&](std::vector>textInputs){}; check({{{"Hello", "World"}, {"Foo", "Bar"}}}); return 0; } And here's the error message: test.cpp: In function ‘int main()’: test.cpp:20:10: error: converting to ‘const String’ from initializer list would use explicit constructor ‘String::String(Args&& ...) 
[with Args = {const char (&)[6], const char (&)[6]}; typename std::enable_if, std::allocator >, Args>...>, int>::type = 0]’ 20 | check({{{"Hello", "World"}, {"Foo", "Bar"}}}); | ~^~~~ test.cpp:20:10: error: converting to ‘const String’ from initializer list would use explicit constructor ‘String::String(Args&& ...) [with Args = {const char (&)[4], const char (&)[4]}; typename std::enable_if, std::allocator >, Args>...>, int>::type = 0]’ test.cpp: In instantiation of ‘String::String(Args&& ...) [with Args = {const char (&)[6], const char (&)[6]}; typename std::enable_if, std::allocator >, Args>...>, int>::type = 0]’: test.cpp:20:10: required from here test.cpp:12:39: error: no matching function for call to ‘std::array::array(const char [6], const char [6])’ 12 | explicit String(Args&&... args) : value(std::forward(args)...) {} | ^~ In file included from test.cpp:3: /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate: ‘std::array::array()’ 94 | struct array |^ /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate expects 0 arguments, 2 provided /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate: ‘constexpr std::array::array(const std::array&)’ /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate expects 1 argument, 2 provided /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate: ‘constexpr std::array::array(std::array&&)’ /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate expects 1 argument, 2 provided test.cpp: In instantiation of ‘String::String(Args&& ...) [with Args = {const char (&)[4], const char (&)[4]}; typename std::enable_if, std::allocator >, Args>...>, int>::type = 0]’: test.cpp:20:10: required from here test.cpp:12:39: error: no matching function for call to ‘std::array::array(const char [4], const char [4])’ 12 | explicit String(Args&&... args) : value(std::forward(args)...) 
{} | ^~ /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate: ‘std::array::array()’ 94 | struct array |^ /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate expects 0 arguments, 2 provided /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate: ‘constexpr std::array::array(const std::array&)’ /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate expects 1 argument, 2 provided /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate: ‘constexpr std::array::array(std::array&&)’ /usr/local/gcc/gcc-13/include/c++/13.1.0/array:94:12: note: candidate expects 1 argument, 2 provided
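
The first diagnostic above ("converting to ... from initializer list would use explicit constructor") refers to the general rule that copy-list-initialization may not select an explicit constructor. The following is only a small illustration of that rule, not a reproduction of the GCC 13 regression; the Text type and length_of function are hypothetical stand-ins for the reporter's String class.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the reporter's String class.
struct Text {
    std::array<char, 16> value{};
    std::size_t count = 0;

    // explicit: usable for direct-initialization only.
    explicit Text(char a, char b) : value{a, b}, count(2) {}
};

inline std::size_t length_of(const Text& t) { return t.count; }

// length_of(Text{'a', 'b'})  direct-list-initialization: accepted.
// length_of({'a', 'b'})      copy-list-initialization of the parameter:
//                            ill-formed, because overload resolution
//                            picks the explicit constructor.
```

Whether the braces in the reporter's nested initializer should have reached the explicit constructor at all is the question tracked in the duplicate bug; this sketch only shows what the error message means.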
[Bug c++/109887] Different mangled name for template specialization for clang and gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109887 --- Comment #2 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #1) > Clang mangles it as 3std::is_integral. Oops, I mean 3std11is_integral of course.
[Bug c++/109887] Different mangled name for template specialization for clang and gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109887

--- Comment #3 from Jakub Jelinek ---
So, simpler testcase would be

#include <type_traits>
template <typename T>
std::enable_if_t <std::is_integral<T>::value, int> foo() { return 0; }
int a = foo <int> ();

GCC mangles this as
_Z3fooIiENSt9enable_ifIXsrSt11is_integralIT_E5valueEiE4typeEv
while clang as
_Z3fooIiENSt9enable_ifIXsr3std11is_integralIT_EE5valueEiE4typeEv
but c++filt is able to demangle both as
std::enable_if<std::is_integral<int>::value, int>::type foo<int>()

So, the difference between the two is that gcc uses substitution St for std:: while clang doesn't.

In https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling sr is
<unresolved-name> ::= [gs] <base-unresolved-name>                 # x or (with "gs") ::x
                  ::= sr <unresolved-type> <base-unresolved-name> # T::x / decltype(p)::x
...
and
<unresolved-type> ::= <template-param> [ <template-args> ]        # T:: or T<X,Y>::
                  ::= <decltype>                                  # decltype(p)::
                  ::= <substitution>
and
<substitution> ::= St # ::std::
among other things, so I think St is what should be used instead of 3std.
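
For anyone who wants to check the two manglings without c++filt, a minimal sketch using abi::__cxa_demangle from <cxxabi.h> could look like the following. The helper name demangle is my own; only the GCC mangled name quoted above is assumed to demangle, and no exact output string is claimed for it.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Returns the demangled form of `mangled`, or an empty string if the
// demangler rejects it. The helper name is illustrative only.
std::string demangle(const char* mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so it has to be released with free.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && text) ? text : "";
    std::free(text);
    return result;
}
```

Passing both names from this comment to demangle() and comparing the results is one way to confirm that the difference is confined to the St substitution.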
[Bug c++/109888] GCC 13 Fails to Compile Code with Explicit Constructor for std::array in Template Class
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109888

Jonathan Wakely changed:

What       |Removed     |Added
Resolution |---         |DUPLICATE
Status     |UNCONFIRMED |RESOLVED

--- Comment #1 from Jonathan Wakely ---
Another dup of Bug 109247

*** This bug has been marked as a duplicate of bug 109247 ***
[Bug c++/109247] [13/14 Regression] optional o; o = {x}; wants to use explicit optional(U) constructor since r13-6765-ga226590fefb35ed6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109247

Jonathan Wakely changed:

What |Removed |Added
CC   |        |vincent.lebourlot@starqube.com

--- Comment #13 from Jonathan Wakely ---
*** Bug 109888 has been marked as a duplicate of this bug. ***
[Bug libstdc++/109889] New: [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 Bug ID: 109889 Summary: [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026 Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: redi at gcc dot gnu.org Target Milestone: --- Target: powerpc64le-unknown-linux-gnu Created attachment 55099 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55099&action=edit Gzipped preprocessed source I'm seeing test failures on powerpc64le when using -D_GLIBCXX_DEBUG, which started with r13-5309-gc3c6c307792026. I don't see anything wrong with that library change, so if I'm not missing something silly, then it might be a latent compiler bug that was revealed by reducing the amount of code run in the library. The attached preprocessed source crashes when built with -O2 -ffunction-sections -Wl,--gc-sections It runs OK with -fno-lifetime-dse or with -fsanitize=undefined or if either of -ffunction-sections or -Wl,--gc-sections is removed. At the crash GDB shows: Program received signal SIGSEGV, Segmentation fault. 0x7765b7cc in __run_exit_handlers (status=, listp=0x77860ad0 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:62 62__exit_funcs_done = true; ─── Assembly ─ 0x7765b7b8 __run_exit_handlers+600 std r9,0(r24) 0x7765b7bc __run_exit_handlers+604 bne 0x7765b8d8 <__run_exit_handlers+888> 0x7765b7c0 __run_exit_handlers+608 li r10,1 0x7765b7c4 __run_exit_handlers+612 nop 0x7765b7c8 __run_exit_handlers+616 li r9,0 0x7765b7cc __run_exit_handlers+620 stb r10,-18040(r2) 0x7765b7d0 __run_exit_handlers+624 lwsync 0x7765b7d4 __run_exit_handlers+628 lwarx r10,0,r31 0x7765b7d8 __run_exit_handlers+632 stwcx. 
r9,0,r31 0x7765b7dc __run_exit_handlers+636 bne-0x7765b7d4 <__run_exit_handlers+628> ─── Registers r0 0x7765b700 r1 0x7fffe8b0 r2 0x r3 0x r4 0x r5 0x r6 0x r7 0x r8 0x r9 0x r10 0x0001 r11 0x2000 r12 0x77a30960 r13 0x77ffc320 r14 0x r15 0x r16 0x r17 0x r18 0x r19 0x r20 0x r21 0x r22 0x r23 0x0001 r24 0x77860ad0 r25 0x r26 0x r27 0x77862468 r28 0x0001 r29 0x r30 0x77862458 r31 0x77862868 pc 0x7765b7cc msr 0x9000d033 cr 0x24002422 lr 0x7765b700 ctr 0x xer 0x00dd fpscr 0xvscr 0x vrsave 0x ppr 0x000c dscr 0x0010 tar 0x mmcr0 0x mmcr2 0x siar 0xsdar 0x sier 0x orig_r3 0x7765b61c trap 0x0380 ─── Source ─── 57 58if (cur == NULL) 59 { 60/* Exit processing complete. We will not allow any more 61 atexit/on_exit registrations. */ 62__exit_funcs_done = true; 63break; 64 } 65 66while (cur->idx > 0) ─── Stack [0] from 0x7765b7cc in __run_exit_handlers+620 at exit.c:62 [1] from 0x7765b948 in __GI_exit+40 at exit.c:143 [2] from 0x77637fb8 in __libc_start_call_main+168 at ../sysdeps/nptl/libc_start_call_main.h:74 [3] from 0x776381ec in generic_start_main+252 at ../csu/libc-start.c:381 [4] from 0x776381ec in __libc_start_main_impl+428 at ../sysdeps/uni
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #1 from Jonathan Wakely --- Tulio found out that __gnu_debug::_Safe_iterator_base::_M_reset() is overwriting the stack where r2 (TOC pointer) was saved by __run_exit_handlers() (at address 0x7fffe8e8). This function was called with the wrong address of the object. He was able to track this value back from __gnu_debug::_Safe_sequence_base::_M_detach_all() at debug.cc:325 p *this $1 = { _M_iterators = 0x7fffe8e8, _M_const_iterators = 0x0, _M_version = 1 }
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889

--- Comment #2 from Jakub Jelinek ---
r2 is the toc pointer, so having it 0 is weird.
Looking at glibc-2.36-10.fc37 (not sure if you are using a different one), I see
0005b560 <__run_exit_handlers>:
   5b560: 21 00 4c 3c addis r2,r12,33
   5b564: a0 b9 42 38 addi r2,r2,-18016
...
   5b5a8: 18 00 41 f8 std r2,24(r1)
so wonder what x/1gx $r1+24 is.

Most likely some call from that function didn't restore r2 properly? Though, I believe in PowerPC ELFv2 it is the caller's responsibility to restore it and that is why it has the nops after bl (in case the call is guaranteed to be into code with the same TOC) and ld r2,24(r1) otherwise.
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889

--- Comment #3 from Jonathan Wakely ---
I wonder if we have a static destructor ordering problem.

The libstdc++ test code uses a local static std::map, which will be constructed on first use and destroyed on exit. When built with -D_GLIBCXX_DEBUG that is a __gnu_debug::map which uses checked iterators, so keeps a list of all constructed iterators. On destruction that map locks a mutex, which is another local static, and . Since r13-6282-gd70f49e98245f8 the mutexes are created in a char buffer and never destroyed:

    // Use a static buffer, so that the mutexes are not destructed
    // before potential users (or at all)
    static __attribute__ ((aligned(__alignof__(M))))
      char buffer[(sizeof (M)) * (mask + 1)];
    static M *m = new (buffer) M[mask + 1];
    return m[i];

But something could be wrong with lifetimes of those statics, causing an invalid 'this' pointer to be used somewhere.
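
As a rough illustration of the "static buffer, never destroyed" pattern quoted above, here is a self-contained sketch. It is not the libstdc++ code: pooled_mutex and pool_size are hypothetical names, std::mutex stands in for M, and the array placement new is assumed to need no array cookie (which holds for the Itanium C++ ABI).

```cpp
#include <cstddef>
#include <mutex>
#include <new>

// Hypothetical pool size; the quoted code derives it from a mask.
constexpr std::size_t pool_size = 16;

std::mutex& pooled_mutex(std::size_t i)
{
    // Raw storage with static storage duration. No destructor is
    // registered for it, so it stays valid while exit handlers run.
    alignas(std::mutex) static unsigned char
        buffer[sizeof(std::mutex) * pool_size];

    // Constructed once, on the first call, and never destroyed.
    static std::mutex* pool = new (buffer) std::mutex[pool_size];

    return pool[i % pool_size];
}
```

The point of the design is that a static object destroyed late during exit can still lock one of these mutexes, because nothing ever runs their destructors.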
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889

--- Comment #4 from Jonathan Wakely ---
(In reply to Jakub Jelinek from comment #2)
> r2 is the toc pointer, so having it 0 is weird.
> Looking at glibc-2.36-10.fc37 (not sure if you are using a different one), I

glibc-2.36-9.fc37.ppc64le

> see
> 0005b560 <__run_exit_handlers>:
>    5b560: 21 00 4c 3c addis r2,r12,33
>    5b564: a0 b9 42 38 addi r2,r2,-18016
> ...
>    5b5a8: 18 00 41 f8 std r2,24(r1)
> so wonder what x/1gx $r1+24 is.

(gdb) x/1gx $r1+24
0x7fffe8d8: 0x
[Bug sanitizer/109882] sanitizer/common_interface_defs.h bogusly defines __has_feature
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109882 --- Comment #7 from Jonathan Wakely --- I'll do a little more testing and submit it upstream.
[Bug libstdc++/109883] Stack Overflow in functions with types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883 --- Comment #2 from Matt Borland --- (In reply to Xi Ruoyao from comment #1) > Cannot reproduce for me. Note that in this case GCC optimizes the entire > function call away (see https://godbolt.org/z/968bPTvh9) even with -O0 so I > can see no way how this will lead to a runtime error. Here is an updated reproducer: #include #include #include int main() { auto val = std::pow(0.5F64, 2); std::cout << val << std::endl; } The failure can be seen godbolt here: https://godbolt.org/z/ej5nPn7o4. Running this same snippet locally with ASAN yields: AddressSanitizer:DEADLYSIGNAL = ==110879==ERROR: AddressSanitizer: stack-overflow on address 0x7fff6e2e7ff8 (pc 0x0040126e bp 0x7fff6e2e8010 sp 0x7fff6e2e8000 T0) #0 0x40126e in __gnu_cxx::__promote_2::__value>::__type)(0))+((__gnu_cxx::__promote_2<_Float64, std::__is_integer<_Float64>::__value>::__type)(0))), std::__is_integer::__value>::__type)(0))+((__gnu_cxx::__promote_2<_Float64, std::__is_integer<_Float64>::__value>::__type)(0)))>::__value>::__type std::pow<_Float64, _Float64>(_Float64, _Float64) (/home/mborland/Documents/boost/libs/math/test/so+0x40126e) (BuildId: 6f720390f8d2a24a6dabec3c85e9cf5bb4c192ea) SUMMARY: AddressSanitizer: stack-overflow (/home/mborland/Documents/boost/libs/math/test/so+0x40126e) (BuildId: 6f720390f8d2a24a6dabec3c85e9cf5bb4c192ea) in __gnu_cxx::__promote_2::__value>::__type)(0))+((__gnu_cxx::__promote_2<_Float64, std::__is_integer<_Float64>::__value>::__type)(0))), std::__is_integer::__value>::__type)(0))+((__gnu_cxx::__promote_2<_Float64, std::__is_integer<_Float64>::__value>::__type)(0)))>::__value>::__type std::pow<_Float64, _Float64>(_Float64, _Float64) ==110879==ABORTING For brevity I snipped out 245 more instances of the message next to #0. > And GCC for aarch64-darwin target (i. e. "macOS 13.3.1 on M1") is not a part > of this project, so are you using another fork? It is provided by homebrew as gcc@13. 
For this reply I am using my Fedora 38 system with "gcc version 13.1.1 20230511 (Red Hat 13.1.1-2) (GCC)"
[Bug libstdc++/109883] Stack Overflow in functions with types and -std=c++23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883

Xi Ruoyao changed:

What          |Removed                                |Added
Known to fail |                                       |13.1.0, 14.0
Status        |WAITING                                |NEW
Summary       |Stack Overflow in functions with types |Stack Overflow in functions with types and -std=c++23

--- Comment #3 from Xi Ruoyao ---
Confirmed. -std=c++23 is needed to reproduce.
[Bug libstdc++/109883] Stack Overflow in functions with types and -std=c++23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883 --- Comment #4 from Xi Ruoyao --- It seems the function __gnu_cxx::__promote_2::__value>::__type)(0))+((__gnu_cxx::__promote_2<_Float64, std::__is_integer<_Float64>::__value>::__type)(0))), std::__is_integer::__value>::__type)(0))+((__gnu_cxx::__promote_2<_Float64, std::__is_integer<_Float64>::__value>::__type)(0)))>::__value>::__type std::pow<_Float64, _Float64>(_Float64, _Float64) is recursing infinitely.
[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 --- Comment #6 from Jan Hubicka --- hottest loop in clang's profile is: for (size_t y = 0; y < opsin.ysize(); y++) { for (size_t x = 0; x < opsin.xsize(); x++) { if (is_background_row[y * is_background_stride + x]) continue; cc.clear(); stack.clear(); stack.emplace_back(x, y); size_t min_x = x; size_t max_x = x; size_t min_y = y; size_t max_y = y; std::pair reference; bool found_border = false; bool all_similar = true; while (!stack.empty()) { std::pair cur = stack.back(); stack.pop_back(); if (visited_row[cur.second * visited_stride + cur.first]) continue; ^^^ closed by this continue. visited_row[cur.second * visited_stride + cur.first] = 1; if (cur.first < min_x) min_x = cur.first; if (cur.first > max_x) max_x = cur.first; if (cur.second < min_y) min_y = cur.second; if (cur.second > max_y) max_y = cur.second; if (paint_ccs) { cc.push_back(cur); } for (int dx = -kSearchRadius; dx <= kSearchRadius; dx++) { for (int dy = -kSearchRadius; dy <= kSearchRadius; dy++) { if (dx == 0 && dy == 0) continue; int next_first = static_cast(cur.first) + dx; int next_second = static_cast(cur.second) + dy; if (next_first < 0 || next_second < 0 || static_cast(next_first) >= opsin.xsize() || static_cast(next_second) >= opsin.ysize()) { continue; } std::pair next{next_first, next_second}; if (!is_background_row[next.second * is_background_stride + next.first]) { stack.push_back(next); } else { if (!found_border) { reference = next; found_border = true; } else { if (!is_similar_b(next, reference)) all_similar = false; } } } } } if (!found_border || !all_similar || max_x - min_x >= kMaxPatchSize || max_y - min_y >= kMaxPatchSize) { continue; } size_t bpos = background_stride * reference.second + reference.first; float ref[3] = {background_rows[0][bpos], background_rows[1][bpos], background_rows[2][bpos]}; bool has_similar = false; for (size_t iy = std::max( static_cast(min_y) - kHasSimilarRadius, 0); iy < std::min(max_y + 
kHasSimilarRadius + 1, opsin.ysize()); iy++) { for (size_t ix = std::max( static_cast(min_x) - kHasSimilarRadius, 0); ix < std::min(max_x + kHasSimilarRadius + 1, opsin.xsize()); ix++) { size_t opos = opsin_stride * iy + ix; float px[3] = {opsin_rows[0][opos], opsin_rows[1][opos], opsin_rows[2][opos]}; if (pci.is_similar_v(ref, px, kHasSimilarThreshold)) { has_similar = true; } } } if (!has_similar) continue; info.emplace_back(); info.back().second.emplace_back(min_x, min_y); QuantizedPatch& patch = info.back().first; patch.xsize = max_x - min_x + 1; patch.ysize = max_y - min_y + 1; int max_value = 0; for (size_t c : {1, 0, 2}) { for (size_t iy = min_y; iy <= max_y; iy++) { for (size_t ix = min_x; ix <= max_x; ix++) { size_t offset = (iy - min_y) * patch.xsize + ix - min_x; patch.fpixels[c][offset] = opsin_rows[c][iy * opsin_stride + ix] - ref[c]; int val = pci.Quantize(patch.fpixels[c][offset], c); patch.pixels[c][offset] = val; if (std::abs(val) > max_value) max_value = std::abs(val); } } } if (max_value < kMinPeak) { info.pop_back(); continue; } if (paint_ccs) { float cc_color = rng.UniformF(0.5, 1.0); for (std::pair p : cc) { ccs.Row(p.second)[p.first] = cc_color; } } } } I guess such a large loop nest with hottest loop not being the innermost is bad for register pressure. Clangs code is : 0.02 │1196:┌─→cmp %r10,-0xb8(%rbp) ▒ │ │jxl::FindBestPatchDictionary(jxl::Image3 const&, jxl::PassesEncoderState*, JxlCmsInterface co▒ │ │while (!stack.empty()) { ◆ 1.39 │ │↓ je 1690 ▒ │ │std::pair cur = stack.back(); ▒ │11a3:│ mov -0x8(%r10),%rbx
[Bug libstdc++/109883] Stack Overflow in functions with types and -std=c++23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883

--- Comment #5 from Xi Ruoyao ---
1203 <_ZSt3powIDF64_DF64_EN9__gnu_cxx11__promote_2IDTplcvNS1_IT_XsrSt12__is_integerIS2_E7__valueEE6__typeELi0EcvNS1_IT0_XsrS3_IS7_E7__valueEE6__typeELi0EEXsrS3_ISB_E7__valueEE6__typeES2_S7_>:
    1203: 55                push   %rbp
    1204: 48 89 e5          mov    %rsp,%rbp
    1207: 48 83 ec 10       sub    $0x10,%rsp
    120b: f2 0f 11 45 f8    movsd  %xmm0,-0x8(%rbp)
    1210: f2 0f 11 4d f0    movsd  %xmm1,-0x10(%rbp)
    1215: f2 0f 10 45 f0    movsd  -0x10(%rbp),%xmm0
    121a: 48 8b 45 f8       mov    -0x8(%rbp),%rax
    121e: 66 0f 28 c8       movapd %xmm0,%xmm1
    1222: 66 48 0f 6e c0    movq   %rax,%xmm0
    1227: e8 d7 ff ff ff    call   1203 <_ZSt3powIDF64_DF64_EN9__gnu_cxx11__promote_2IDTplcvNS1_IT_XsrSt12__is_integerIS2_E7__valueEE6__typeELi0EcvNS1_IT0_XsrS3_IS7_E7__valueEE6__typeELi0EEXsrS3_ISB_E7__valueEE6__typeES2_S7_>
    122c: 66 48 0f 7e c0    movq   %xmm0,%rax
    1231: 66 48 0f 6e c0    movq   %rax,%xmm0
    1236: c9                leave
    1237: c3                ret

This is just stupid...
[Bug libstdc++/109883] Stack Overflow in functions with types and -std=c++23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883 --- Comment #6 from Matt Borland --- (In reply to Xi Ruoyao from comment #4) > It seems the function > > __gnu_cxx::__promote_2 std::__is_integer<_Float64>::__value>::__type)(0))+((__gnu_cxx:: > __promote_2<_Float64, std::__is_integer<_Float64>::__value>::__type)(0))), > std::__is_integer std::__is_integer<_Float64>::__value>::__type)(0))+((__gnu_cxx:: > __promote_2<_Float64, > std::__is_integer<_Float64>::__value>::__type)(0)))>::__value>::__type > std::pow<_Float64, _Float64>(_Float64, _Float64) > > is recursing infinitely. For Boost.Math's implementation of promote_2 we found template specializations to be effective: https://github.com/boostorg/math/pull/978/files#diff-2463d99030329b154489b8b34ce1068a34e736cab268c3421b058ca0e516680cR189.
[Bug libstdc++/109883] Stack Overflow in functions with types and -std=c++23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883

--- Comment #7 from Jakub Jelinek ---
I think we need to move those __promote_{2,3} using templates for atan2, fmod, pow, copysign, fdim, fmax, fmin, hypot, nextafter, remainder, remquo and fma later, because right now we have the overloads with float, double and long double arguments, then these templates and later on _Float{16,32,64,128} and bfloat16_t overloads, and as those __promote_{2,3} templates call themselves with promoted arguments, they self-recurse if the promoted arguments are _Float{16,32,64,128} or bfloat16_t.
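
The ordering issue can be shown with a small sketch that does not use libstdc++ internals. Everything in namespace demo is hypothetical: power stands in for std::pow, promoted2_t for __promote_2, and long double plays the role of a type whose overload was added late. For fundamental types there is no argument-dependent lookup, so the call inside the template only sees overloads declared before the template.

```cpp
#include <cmath>
#include <type_traits>

namespace demo {

// Non-template overloads, all declared BEFORE the generic template.
inline float       power(float b, float e)             { return std::pow(b, e); }
inline double      power(double b, double e)           { return std::pow(b, e); }
inline long double power(long double b, long double e) { return std::pow(b, e); }

// Simplified stand-in for the promotion trait: integers become double.
template <typename T>
using promoted_t =
    std::conditional_t<std::is_integral<T>::value, double, T>;
template <typename T, typename U>
using promoted2_t = std::common_type_t<promoted_t<T>, promoted_t<U>>;

// Generic fallback declared LAST: it promotes both arguments and
// forwards. If the long double overload were declared after this
// template, power(P, P) with P = long double would pick this template
// again (exact match) and recurse forever.
template <typename T, typename U>
promoted2_t<T, U> power(T b, U e)
{
    using P = promoted2_t<T, U>;
    return power(static_cast<P>(b), static_cast<P>(e));
}

} // namespace demo
```

With this declaration order, demo::power(2.0L, 3) goes through the template once and then lands on the long double overload, which is the behaviour the proposed reordering aims for with the _Float{16,32,64,128} and bfloat16_t overloads.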
[Bug c++/109532] -fshort-enums does not pick smallest underlying type for scoped enum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109532

--- Comment #7 from CVS Commits ---
The master branch has been updated by Jonathan Wakely:

https://gcc.gnu.org/g:d8a656d5b6246457e84934bc35115c134bc38def

commit r14-932-gd8a656d5b6246457e84934bc35115c134bc38def
Author: Jonathan Wakely
Date:   Thu Apr 27 12:02:38 2023 +0100

    doc: Describe behaviour of enums with fixed underlying type [PR109532]

    gcc/ChangeLog:

            PR c++/109532
            * doc/invoke.texi (Code Gen Options): Note that -fshort-enums
            is ignored for a fixed underlying type.
            (C++ Dialect Options): Likewise for -fstrict-enums.

    Reviewed-by: Marek Polacek
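
A short sketch of what "fixed underlying type" means for the size of an enum, which is the reason the option has nothing to shrink here. The enum names are illustrative only; the checks rely solely on the C++ rule that a scoped enum without an explicit base has underlying type int.

```cpp
#include <cstdint>
#include <type_traits>

// A scoped enum with no explicit base has the fixed underlying type int.
enum class Plain { a, b };

// An explicit base fixes the underlying type for any kind of enum.
enum class Tiny : std::uint8_t { a, b };
enum Wide : std::int32_t { wide_a, wide_b };

static_assert(std::is_same<std::underlying_type_t<Plain>, int>::value,
              "scoped enums default to int");
static_assert(sizeof(Tiny) == sizeof(std::uint8_t),
              "size follows the fixed underlying type");
static_assert(sizeof(Wide) == sizeof(std::int32_t),
              "size follows the fixed underlying type");
```

To get a one-byte scoped enum, the underlying type has to be spelled out as in Tiny; the size is then the same with or without -fshort-enums.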
[Bug libstdc++/109883] Stack Overflow in functions with types and -std=c++23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883

Jakub Jelinek changed:

What |Removed |Added
CC   |        |jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek ---
Created attachment 55100
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55100&action=edit
gcc14-pr109883.patch

Untested fix. Still need to add some testsuite coverage.
[Bug c++/100052] [11/12/13/14 regression] ICE in compiling g++.dg/modules/xtreme-header-3_b.C after r11-8118
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100052 --- Comment #14 from seurer at gcc dot gnu.org --- The failures occur erratically so one clean run doesn't mean much. Scanning the test results mailing list I see failures for this just today in trunk.
[Bug c++/98202] C++ cannot parse F128 suffix for float128 literals
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98202 --- Comment #5 from Jonathan Wakely --- Q can't be used with -std=c++NN strict modes, as noted in bug 87274 limits:2085: error: unable to find numeric literal operator 'operator""Q'
[Bug libstdc++/109890] New: vector's constructor doesn't start object lifetimes during constant evaluation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109890

Bug ID: 109890
Summary: vector's constructor doesn't start object lifetimes during constant evaluation
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: barry.revzin at gmail dot com
Target Milestone: ---

From StackOverflow (https://stackoverflow.com/q/76269606/2069064), clang rejects this code when compiling with libstdc++:

#include <vector>

consteval auto bar(int n){
    std::vector<int> v(n);
    return v[0];
}
constexpr auto m = bar(5);

This is because libstdc++ basically does something like this:

#include <memory>

class V {
    int* p;
    int n;
    std::allocator<int> alloc;
public:
    constexpr V(int n) : n(n) {
        p = alloc.allocate(n);
        // fill with 0s?
        for (int i = 0; i != n; ++i) {
            p[i] = 0;
        }
    }
    constexpr ~V() { alloc.deallocate(p, n); }
};

consteval auto bar(int n) {
    V v(n);
    return n;
}
static_assert(bar(5) == 5);

And clang is more picky about the assignment there - it doesn't like just writing p[0] = 0, because the int's lifetime hasn't started yet. gcc accepts the above though.

I think that's... technically correct (if pedantic) and libstdc++'s path needs to do a construct_at somewhere.
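
To spell out the distinction between assigning into raw storage and starting an object's lifetime, here is a minimal runtime sketch. It is not the libstdc++ implementation: first_after_zero_fill is a hypothetical function and n is assumed to be positive. It uses placement new so that it also builds before C++20; during constant evaluation in C++20 the equivalent step would be std::construct_at(p + i, 0).

```cpp
#include <memory>
#include <new>

// Hypothetical helper: allocate n ints, start each one's lifetime
// explicitly, read the first and release the storage again.
int first_after_zero_fill(int n)
{
    std::allocator<int> alloc;
    int* p = alloc.allocate(n);  // storage only, no int objects yet

    for (int i = 0; i != n; ++i)
        ::new (static_cast<void*>(p + i)) int(0);  // starts the lifetime

    int first = p[0];
    alloc.deallocate(p, n);      // int is trivially destructible
    return first;
}
```

The loop is the part that the report says is missing on the constant-evaluation path: a plain p[i] = 0 writes to an int that, pedantically, does not exist yet.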
[Bug c++/109532] -fshort-enums does not pick smallest underlying type for scoped enum
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109532 --- Comment #8 from Jonathan Wakely --- I've updated the docs to make this clear.
[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 JuzheZhong changed: What|Removed |Added CC||juzhe.zhong at rivai dot ai --- Comment #7 from JuzheZhong --- (In reply to Jan Hubicka from comment #6) > hottest loop in clang's profile is: > for (size_t y = 0; y < opsin.ysize(); y++) { > for (size_t x = 0; x < opsin.xsize(); x++) { > if (is_background_row[y * is_background_stride + x]) continue; > cc.clear(); > stack.clear(); > stack.emplace_back(x, y); > size_t min_x = x; > size_t max_x = x; > size_t min_y = y; > size_t max_y = y; > std::pair reference; > bool found_border = false; > bool all_similar = true; > while (!stack.empty()) { > std::pair cur = stack.back(); > stack.pop_back(); > if (visited_row[cur.second * visited_stride + cur.first]) continue; > ^^^ > closed by this continue. > visited_row[cur.second * visited_stride + cur.first] = 1; > if (cur.first < min_x) min_x = cur.first; > if (cur.first > max_x) max_x = cur.first; > if (cur.second < min_y) min_y = cur.second; > if (cur.second > max_y) max_y = cur.second; > if (paint_ccs) { > cc.push_back(cur); > } > for (int dx = -kSearchRadius; dx <= kSearchRadius; dx++) { > for (int dy = -kSearchRadius; dy <= kSearchRadius; dy++) { > if (dx == 0 && dy == 0) continue; > int next_first = static_cast(cur.first) + dx; > int next_second = static_cast(cur.second) + dy; > if (next_first < 0 || next_second < 0 || > static_cast(next_first) >= opsin.xsize() || > static_cast(next_second) >= opsin.ysize()) { > continue; > } > std::pair next{next_first, next_second}; > if (!is_background_row[next.second * is_background_stride + >next.first]) { > stack.push_back(next); > } else { > if (!found_border) { > reference = next; > found_border = true; > } else { > if (!is_similar_b(next, reference)) all_similar = false; > } > } > } > } > } > if (!found_border || !all_similar || max_x - min_x >= kMaxPatchSize || > max_y - min_y >= kMaxPatchSize) { > continue; > } > size_t bpos = background_stride * reference.second + reference.first; 
> float ref[3] = {background_rows[0][bpos], background_rows[1][bpos], > background_rows[2][bpos]}; > bool has_similar = false; > for (size_t iy = std::max( >static_cast(min_y) - kHasSimilarRadius, 0); >iy < std::min(max_y + kHasSimilarRadius + 1, opsin.ysize()); > iy++) { > for (size_t ix = std::max( > static_cast(min_x) - kHasSimilarRadius, 0); > ix < std::min(max_x + kHasSimilarRadius + 1, opsin.xsize()); > ix++) { > size_t opos = opsin_stride * iy + ix; > float px[3] = {opsin_rows[0][opos], opsin_rows[1][opos], > opsin_rows[2][opos]}; > if (pci.is_similar_v(ref, px, kHasSimilarThreshold)) { > has_similar = true; > } > } > } > if (!has_similar) continue; > info.emplace_back(); > info.back().second.emplace_back(min_x, min_y); > QuantizedPatch& patch = info.back().first; > patch.xsize = max_x - min_x + 1; > patch.ysize = max_y - min_y + 1; > int max_value = 0; > for (size_t c : {1, 0, 2}) { > for (size_t iy = min_y; iy <= max_y; iy++) { > for (size_t ix = min_x; ix <= max_x; ix++) { > size_t offset = (iy - min_y) * patch.xsize + ix - min_x; > patch.fpixels[c][offset] = > opsin_rows[c][iy * opsin_stride + ix] - ref[c]; > int val = pci.Quantize(patch.fpixels[c][offset], c); > patch.pixels[c][offset] = val; > if (std::abs(val) > max_value) max_value = std::abs(val); > } > } > } > if (max_value < kMinPeak) { > info.pop_back(); > continue; > } > if (paint_ccs) { > float cc_color = rng.UniformF(0.5, 1.0); > for (std::pair p : cc) { > ccs.Row(p.second)[p.first] = cc_color; > } > } > } > } > > I guess such a large loop nest with hottest loop not being the innermost is > bad for register pressure. > Clangs code is : > 0.02 │1196:┌─→cmp %r10,-0xb8(%rbp) > ▒ >│ │jxl::FindBestPatchDictionary(jxl::Image3
[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 --- Comment #8 from Jan Hubicka --- Created attachment 55101 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55101&action=edit hottest loop jpegxl build machinery adds -fno-vectorize and -fno-slp-vectorize to clang flags. Adding -fno-tree-vectorize -fno-tree-slp-vectorize makes GCC generated code more similar. With this most difference is caused by FindBestPatchDictionary or FindTextLikePatches if that function is not inlined. 15.22% cjxl libjxl.so.0.7.0 [.] jxl::(anonymous namespace)::FindTextLikePatches 10.19% cjxl libjxl.so.0.7.0 [.] jxl::FindBestPatchDictionary 5.27% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::QuantizeBlockAC 5.06% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::EstimateEntropy 4.82% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::EstimateEntropy 4.35% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::QuantizeBlockAC 4.21% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::TransformFromPixels 3.87% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::(anonymous namespace)::TransformFromPixels 3.78% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::FindBestMultiplier 3.27% cjxl libjxl.so.0.7.0 [.] jxl::N_AVX2::FindBestMultiplier I think it is mostly register allocation not handling well the internal loop quoted above. I am adding preprocessed sources.
[Bug libstdc++/109890] vector's constructor doesn't start object lifetimes during constant evaluation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109890

Jonathan Wakely changed:

What             |Removed     |Added
Ever confirmed   |0           |1
Last reconfirmed |            |2023-05-17
Status           |UNCONFIRMED |NEW

--- Comment #1 from Jonathan Wakely ---
For trivial types the std::uninitialized_xxx algos elide the constructors and just do something like memcpy/memset. We need to use std::is_constant_evaluated() to elide the elision in this case.
[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811

Jakub Jelinek changed:

What |Removed |Added
CC   |        |jakub at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek ---
(In reply to JuzheZhong from comment #7)
> It seems that Clang has better performance than GCC in case of no vectorizer?

That is a very general statement. On some particular code, some particular arch, with some particular flags Clang performs better than GCC, on others it is the other way around, on some it is a wash. How it performs on larger amounts of code can be seen from standard benchmarks like SPEC, the Phoronix benchmark suite is known not to be a very good benchmark for various reasons, but that doesn't mean it isn't worth looking at it.
[Bug libstdc++/109891] New: Null pointer special handling in ostream's operator << for C-strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891 Bug ID: 109891 Summary: Null pointer special handling in ostream's operator << for C-strings Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: mimomorin at gmail dot com Target Milestone: ---

This code

#include <iostream>
int main() { std::cout << (char*)nullptr; }

does not cause any bad things (like SEGV), because libstdc++'s operator<<(ostream, char const*) has special handling of null pointers:

template<typename _CharT, typename _Traits>
  inline basic_ostream<_CharT, _Traits>&
  operator<<(basic_ostream<_CharT, _Traits>& __out, const _CharT* __s)
  {
    if (!__s)
      __out.setstate(ios_base::badbit);
    else
      __ostream_insert(...);
    return __out;
  }

Passing a null pointer to this operator is a precondition violation, so the current implementation perfectly conforms to the C++ standard. But why don't we remove this special handling? By doing so, we get
- better interoperability with tooling (i.e. sanitizers can find the bug easily)
- an unnoticeable performance improvement
and we lose
- deterministic behavior (of poor code) on a particular stdlib

I believe the first point carries more weight than the last point.

It seems that the old special handling `if (s == NULL) s = "(null)";` (https://github.com/gcc-mirror/gcc/blob/6599da0/libio/iostream.cc#L638) was removed in GCC 3.0, but reintroduced (in the current form) in GCC 3.2 in response to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=6518 .
[Bug libstdc++/109883] Stack Overflow in functions with types and -std=c++23
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109883 Jakub Jelinek changed: What|Removed |Added Attachment #55100|0 |1 is obsolete|| Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #9 from Jakub Jelinek --- Created attachment 55102 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55102&action=edit gcc14-pr109883.patch Updated patch including testsuite coverage.
[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891 --- Comment #1 from Jonathan Wakely --- Adding more UB to the library doesn't seem wise. We could make it abort in debug mode, instead of setting badbit, but I don't think we should just make it UB.
[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891 --- Comment #2 from Jonathan Wakely ---
--- a/libstdc++-v3/include/bits/ostream.tcc
+++ b/libstdc++-v3/include/bits/ostream.tcc
@@ -306,6 +306,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     basic_ostream<_CharT, _Traits>&
     operator<<(basic_ostream<_CharT, _Traits>& __out, const char* __s)
     {
+      _GLIBCXX_DEBUG_PEDANTIC(__s != 0);
       if (!__s)
        __out.setstate(ios_base::badbit);
       else
[Bug tree-optimization/109892] New: SLP failure with explicit fma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109892 Bug ID: 109892 Summary: SLP failure with explicit fma Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: amonakov at gcc dot gnu.org Target Milestone: ---

At -O2 -mfma (x86) or -O3 (arm64) we fail to SLP-vectorize 'f', but succeed in 'g':

double f(double x[], long n)
{
    double r0 = 0, r1 = 0;
    for (; n; x += 2, n--) {
        r0 = __builtin_fma(x[0], x[0], r0);
        r1 = __builtin_fma(x[1], x[1], r1);
    }
    return r0 + r1;
}

static double muladd(double x, double y, double z)
{
    return x * y + z;
}

double g(double x[], long n)
{
    double r0 = 0, r1 = 0;
    for (; n; x += 2, n--) {
        r0 = muladd(x[0], x[0], r0);
        r1 = muladd(x[1], x[1], r1);
    }
    return r0 + r1;
}

It seems we are calling vectorizable_reduction for __builtin_fma even though it would not participate in a reduction when vectorizing for 16-byte vectors?
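For anyone wanting to check the two variants against each other before looking at the vectorizer dumps, a small harness along these lines might help. The function names `sum_sq_fma` and `sum_sq_muladd` are hypothetical, and `std::fma` is swapped in for `__builtin_fma` as an assumption of this sketch. In general the two variants can differ in the last bit, since fma rounds once; the sample input is chosen so that both are exact.

```cpp
#include <cmath>

// Variant 'f' from the report, spelled with std::fma instead of __builtin_fma.
double sum_sq_fma(const double* x, long n) {
  double r0 = 0, r1 = 0;
  for (; n; x += 2, n--) {
    r0 = std::fma(x[0], x[0], r0);
    r1 = std::fma(x[1], x[1], r1);
  }
  return r0 + r1;
}

static double muladd(double x, double y, double z) { return x * y + z; }

// Variant 'g' from the report: plain multiply-add that the compiler may contract.
double sum_sq_muladd(const double* x, long n) {
  double r0 = 0, r1 = 0;
  for (; n; x += 2, n--) {
    r0 = muladd(x[0], x[0], r0);
    r1 = muladd(x[1], x[1], r1);
  }
  return r0 + r1;
}
```

Here n counts pairs, so an array of four doubles is walked with n = 2.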
[Bug fortran/109684] compiling failure: complaining about a final subroutine of a type being not PURE (while it is indeed PURE)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109684 Mianzhi Wang changed: What|Removed |Added Attachment #54964|0 |1 is obsolete|| CC||wangmianzhi1 at linuxmail dot org --- Comment #1 from Mianzhi Wang --- Created attachment 55103 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55103&action=edit slightly more simplified build with cmake
[Bug c++/97340] Spurious rejection of member variable template of reference type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97340 Patrick Palka changed: What|Removed |Added Keywords||rejects-valid CC||ppalka at gcc dot gnu.org See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=108848 --- Comment #3 from Patrick Palka --- We accept the original testcase (where A is not a template) since r13-6380-gd3d205ab440886, but we still incorrectly reject the version where A is a template: template struct A { template static constexpr const int &x=0; }; template struct B { static constexpr int y=A::template x; }; template struct B;
[Bug libstdc++/46906] istreambuf_iterator is late?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46906 --- Comment #13 from Jonathan Wakely --- This seems related to https://cplusplus.github.io/LWG/issue2366 and the changes I'm proposing there.
[Bug rtl-optimization/109858] [14 Regression] r14-172 caused some SPEC2017 bmk to degrade on Power
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109858 --- Comment #10 from Segher Boessenkool ---
(In reply to Hongtao.liu from comment #8)
> (In reply to Segher Boessenkool from comment #7)
> > > The patch will still use GENERAL_REGS when hard_regno_mode_ok for mode and
> > > GENERAL_REGS(which is the case in PR109610), hope it can also fix this
> > > regression.
> >
> > That sounds more reasonable. But, why use any heuristics like this? Can't you
> > just look at the actual costs of using mem and regs?
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109610#c2

That is not an answer to my question at all?
[Bug target/109885] gcc does not generate movmskps and testps instructions (clang does)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109885 --- Comment #1 from Andrew Pinski ---
Just FYI, GCC does better on aarch64 with sum. GCC:

        ldp     q29, q30, [x0]
        movi    v31.4s, 0x1
        fcmeq   v29.4s, v29.4s, 0
        fcmeq   v30.4s, v30.4s, 0
        and     v31.16b, v31.16b, v29.16b
        sub     v31.4s, v31.4s, v30.4s
        addv    s31, v31.4s
        fmov    w0, s31
        ret

vs this mess:

        sub     sp, sp, #16
        ldp     q1, q0, [x0]
        adrp    x8, .LCPI0_0
        fcmeq   v1.4s, v1.4s, #0.0
        fcmeq   v0.4s, v0.4s, #0.0
        uzp1    v0.8h, v1.8h, v0.8h
        ldr     q1, [x8, :lo12:.LCPI0_0]
        and     v0.16b, v0.16b, v1.16b
        addv    h0, v0.8h
        fmov    w8, s0
        and     w8, w8, #0xff
        fmov    s0, w8
        cnt     v0.8b, v0.8b
        uaddlv  h0, v0.8b
        fmov    w0, s0
        add     sp, sp, #16
        ret

The reason is that clang/LLVM looks to be tuned to try to use movmskps/testps, while GCC is tuned to do just a sum reduction in general. Though I think GCC could be slightly better here too.

        ldp     q29, q30, [x0]
        fcmeq   v29.4s, v29.4s, 0
        fcmeq   v30.4s, v30.4s, 0
        add     v31.16b, v29.16b, v30.16b
        addv    s31, v31.4s
        fmov    w0, s31
        neg     w0, w0
        ret

I think this might be the best code for an aarch64 reduction of bools.
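The last sequence relies on each comparison producing an all-ones lane (that is, -1) on a match, so adding the masks and negating the sum yields the count. A scalar model of that identity is sketched below; `count_zeros_via_masks` is a hypothetical name, and the assumption that the original function counts floats equal to zero is mine, since the testcase is not quoted here.

```cpp
#include <cstddef>
#include <cstdint>

// Scalar model of "compare, add the masks, negate the total".
int count_zeros_via_masks(const float* v, std::size_t n) {
  std::int32_t acc = 0;
  for (std::size_t i = 0; i < n; ++i) {
    // fcmeq-style lane mask: -1 (all bits set) when v[i] == 0, else 0.
    std::int32_t mask = (v[i] == 0.0f) ? -1 : 0;
    acc += mask;  // corresponds to the add / addv steps
  }
  return -acc;    // corresponds to the final neg
}
```

For eight inputs of which four are zero, the masks sum to -4 and the negation gives 4.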
[Bug middle-end/109849] suboptimal code for vector walking loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 Jan Hubicka changed: What|Removed |Added Blocks||109811 CC||mjambor at suse dot cz --- Comment #6 from Jan Hubicka ---
Here is a slightly improved testcase which actually pushes onto the stack and measures something. The test loops 1000 times and returns. It also makes the stack a local variable so race conditions are not a problem.

#include <vector>
typedef unsigned int uint32_t;
std::pair<uint32_t, uint32_t> pair;
void
test()
{
        std::vector<std::pair<uint32_t, uint32_t>> stack;
        stack.push_back (pair);
        while (!stack.empty()) {
                std::pair<uint32_t, uint32_t> cur = stack.back();
                stack.pop_back();
                if (!cur.first)
                {
                        cur.second++;
                        stack.push_back (cur);
                }
                if (cur.second > 1)
                        break;
        }
}
int
main()
{
        for (int i = 0; i < 1; i++)
                test();
}

Clang code is about twice as fast:

jan@localhost:/tmp> clang++ -O2 tt.C -fno-exceptions
jan@localhost:/tmp> g++ -O2 tt.C -fno-exceptions -o a.out-gcc
jan@localhost:/tmp> perf stat ./a.out

 Performance counter stats for './a.out':

            434.24 msec task-clock:u               #    0.997 CPUs utilized
                 0      context-switches:u         #    0.000 /sec
                 0      cpu-migrations:u           #    0.000 /sec
               129      page-faults:u              #  297.073 /sec
     1,003,191,657      cycles:u                   #    2.310 GHz
            68,927      stalled-cycles-frontend:u  #    0.01% frontend cycles idle
       800,792,619      stalled-cycles-backend:u   #   79.82% backend cycles idle
     1,904,682,933      instructions:u             #    1.90  insn per cycle
                                                   #    0.42  stalled cycles per insn
       500,912,196      branches:u                 #    1.154 G/sec
            23,144      branch-misses:u            #    0.00% of all branches

       0.435340389 seconds time elapsed

       0.431409000 seconds user
       0.003994000 seconds sys

jan@localhost:/tmp> perf stat ./a.out-gcc

 Performance counter stats for './a.out-gcc':

          1,197.28 msec task-clock:u               #    0.999 CPUs utilized
                 0      context-switches:u         #    0.000 /sec
                 0      cpu-migrations:u           #    0.000 /sec
               131      page-faults:u              #  109.415 /sec
     2,903,995,656      cycles:u                   #    2.425 GHz
            86,204      stalled-cycles-frontend:u  #    0.00% frontend cycles idle
     2,690,907,052      stalled-cycles-backend:u   #   92.66% backend cycles idle
     2,005,212,311      instructions:u             #    0.69  insn per cycle
                                                   #    1.34  stalled cycles per insn
       401,007,320      branches:u                 #  334.932 M/sec
            23,290      branch-misses:u            #    0.01% of all branches

       1.198388186 seconds time elapsed

       1.19845 seconds user
       0.0 seconds sys

The problem seems to be, as in the first example, that we keep updating the in-memory stack in the main loop.

.L39:
        movl    12(%rsp), %ebx
.L30:
        movq    16(%rsp), %rax
        cmpl    $1, %ebx
        ja      .L33
.L40:
        movq    24(%rsp), %rdi
        cmpq    %rdi, %rax
        je      .L28
.L34:
        movq    -8(%rdi), %rax
        leaq    -8(%rdi), %rsi
        movq    %rsi, 24(%rsp)
        movq    %rax, 8(%rsp)
        testl   %eax, %eax
        jne     .L39

While clang does:

.LBB0_1:                                # in Loop: Header=BB0_4 Depth=1
        movq    %rax, %r14
.LBB0_2:                                # in Loop: Header=BB0_4 Depth=1
        movq    %rbx, %r12
        movq    %r12, %rbx
        cmpl    $10001, %r13d           # imm = 0x2711
        jae     .LBB0_27
.LBB0_4:                                # =>This Loop Header: Depth=1
                                        # Child Loop BB0_16 Depth 2
                                        # Child Loop BB0_21 Depth 2
        cmpq    %r14, %rbx
        je      .LBB0_26
# %bb.5:                                # in Loop: Header=BB0_4 Depth=1
        leaq    -8(%r14), %rax
        movq    -8(%r14), %rcx
        movq    %rcx, %r13
        shrq    $32, %r13
        testl   %ecx, %ecx
        jne     .LBB0_1

Referenced Bugs: https://gcc.gnu.org/bugzilla/sho
[Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang 16
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 --- Comment #10 from Jan Hubicka --- Actually vectorization hurts on both compilers, and a bit more with clang. It seems that all important loops are hand vectorized, and since register pressure is a problem, vectorizing other loops causes enough collateral damage to register allocation to regress performance. I believe the core of the problem (or at least one of them) is simply the way we compile loops popping data from a std::vector based stack. See PR109849. We keep updating the stack data structure in the innermost loop because, in the not too common case, reallocation needs to be done, and that is done by offlined code.
[Bug tree-optimization/106900] Regression after memchr optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106900 --- Comment #6 from CVS Commits ---
The trunk branch has been updated by Andrew Pinski:

https://gcc.gnu.org/g:f65af1eeef670f2c249b1896726ef57bbf65fe2f

commit r14-937-gf65af1eeef670f2c249b1896726ef57bbf65fe2f
Author: Andrew Pinski
Date:   Tue May 16 14:34:05 2023 -0700

    Fix PR 106900: array-bounds warning inside simplify_builtin_call

    The problem here is that VRP cannot figure out isize could not be 0
    due to using integer_zerop. This patch removes the use of integer_zerop
    and instead checks for 0 directly after converting the tree to
    an unsigned HOST_WIDE_INT. This allows VRP to figure out isize is not 0
    and `isize - 1` will always be >= 0.

    This patch is just to avoid the warning that GCC could produce sometimes
    and does not change any code generation or even VRP.

    OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

    gcc/ChangeLog:

            * tree-ssa-forwprop.cc (simplify_builtin_call): Check
            against 0 instead of calling integer_zerop.
[Bug analyzer/109570] detect fclose on unopened or NULL files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570 --- Comment #5 from Christophe Lyon --- Not sure how to update/fix the testcases though? Since they get the declaration of fclose from stdio.h, we'd need to make dg-error conditional on the glibc version in use, which seems impractical. Should we instead remove #include <stdio.h> and provide suitable declarations in the testcase?
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #5 from Tulio Magno Quites Machado Filho ---
(In reply to Jonathan Wakely from comment #3)
> I wonder if we have a static destructor ordering problem.

I'm afraid the issue is happening earlier, when these iterators are being initialized. Look at this backtrace taken during initialization:

#0 0x77b536e4 in __gnu_debug::_Safe_sequence_base::_M_attach_single (this=0x100414c8 <__gnu_cxx::annotate_base::map_alloc()::_S_map>, __it=0x7fffe8f8, __constant=false) at /home/test/src/gcc/libstdc++-v3/src/c++11/debug.cc:396
#1 0x77b5376c in __gnu_debug::_Safe_sequence_base::_M_attach (this=0x100414c8 <__gnu_cxx::annotate_base::map_alloc()::_S_map>, __it=0x7fffe8f8, __constant=false) at /home/test/src/gcc/libstdc++-v3/src/c++11/debug.cc:383
#2 0x77b53cd8 in __gnu_debug::_Safe_iterator_base::_M_attach (this=0x7fffe8f8, __seq=0x100414c8 <__gnu_cxx::annotate_base::map_alloc()::_S_map>, __constant=false) at /home/test/src/gcc/libstdc++-v3/src/c++11/debug.cc:430
#3 0x10012244 in __gnu_debug::_Safe_iterator_base::_Safe_iterator_base (__constant=false, __seq=0x100414c8 <__gnu_cxx::annotate_base::map_alloc()::_S_map>, this=) at /home/test/gcc-14/include/c++/14.0.0/debug/safe_base.h:91
#4 __gnu_debug::_Safe_iterator > >, std::__debug::map, std::less, std::allocator > > >, std::forward_iterator_tag>::_Safe_iterator (__seq=0x100414c8 <__gnu_cxx::annotate_base::map_alloc()::_S_map>, __i=..., this=0x7fffe8f0) at /home/test/gcc-14/include/c++/14.0.0/debug/safe_iterator.h:162
#5 __gnu_debug::_Safe_iterator > >, std::__debug::map, std::less, std::allocator > > >, std::bidirectional_iterator_tag>::_Safe_iterator (__seq=0x100414c8 <__gnu_cxx::annotate_base::map_alloc()::_S_map>, __i=..., this=0x7fffe8f0) at /home/test/gcc-14/include/c++/14.0.0/debug/safe_iterator.h:539
#6 std::__debug::map, std::less, std::allocator > > >::find (__x=: 0x0, this=0x100414c8 <__gnu_cxx::annotate_base::map_alloc()::_S_map>) at /home/test/gcc-14/include/c++/14.0.0/debug/map.h:583
#7 __gnu_cxx::annotate_base::check_allocated (this=, size=4, p=0x0) at /home/test/gcc-14/include/c++/14.0.0/ext/throw_allocator.h:177
#8 __gnu_cxx::annotate_base::erase (p=p@entry=0x0, size=size@entry=4, this=) at /home/test/gcc-14/include/c++/14.0.0/ext/throw_allocator.h:146
#9 0x10010474 in __gnu_cxx::throw_allocator_base::deallocate (this=, __n=1, __p=0x0) at /home/test/gcc-14/include/c++/14.0.0/ext/throw_allocator.h:888
#10 __gnu_test::check_deallocate_null<__gnu_cxx::throw_allocator_random > () at /home/test/src/gcc/libstdc++-v3/testsuite/util/testsuite_allocator.h:255
#11 main () at /home/test/src/gcc/libstdc++-v3/testsuite/ext/throw_allocator/check_deallocate_null.cc:30

Frame #2 references 0x7fffe8f8, which is part of the stack. Frame #5 is also referencing an object in the stack. After these functions return, these objects shouldn't be used anymore.
[Bug tree-optimization/106900] Regression after memchr optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106900 Andrew Pinski changed: What|Removed |Added Resolution|--- |FIXED Target Milestone|--- |14.0 Status|ASSIGNED|RESOLVED --- Comment #7 from Andrew Pinski --- Fixed on the trunk; not really worth backporting since it is only an issue with --enable-werror-always which almost nobody uses.
[Bug tree-optimization/56456] [meta-bug] bogus/missing -Warray-bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456 Bug 56456 depends on bug 106900, which changed state. Bug 106900 Summary: Regression after memchr optimization https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106900 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #6 from Tulio Magno Quites Machado Filho --- Let me elaborate my previous comment... When initializing the object at 0x100414c8, one of its members points to an address in the stack (0x7fffe8f8). All these functions return and when __run_exit_handlers() is called, the address 0x7fffe8f8 is used to save the TOC pointer (r2) before calling the destructors of the library. The destructors manipulate the object at 0x100414c8, zeroing all its members, including the address where the TOC pointer was saved.
[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891 --- Comment #3 from Michel Morin --- From the safety point of view, I agree with you. But, at the same time, I thought that detectable UB (with the help of sanitizers) is more useful than a silent bug. How about `throw`ing as in std::string's constructor?
[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891 --- Comment #4 from Andrew Pinski --- IIRC this was added to be similar to glibc's nullptr handling for %s: printf("xyza %s\n", nullptr);
[Bug analyzer/109570] detect fclose on unopened or NULL files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109570 --- Comment #6 from Xi Ruoyao ---
(In reply to Christophe Lyon from comment #5)
> Not sure how to update/fix the testcases though?
> Since they get the declaration of fclose from stdio.h, we'd need to make
> dg-error conditional to the glibc version in use, which seems unpractical.
>
> Should we instead remove #include <stdio.h> and provide suitable
> declarations in the testcase?

I guess we need to change

  return ferror (f) || fclose (f) != 0;

to

  return !f || ferror (f) || fclose (f) != 0;

because "failing to check if the file is opened successfully" is definitely a bug, and these tests are intended not to raise warnings for a bug-free program.

BTW ferror(f) segfaults as well when f is NULL, so IMO we should mark it nonnull in Glibc as well.
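The proposed return expression could be wrapped as in the sketch below. `close_failed` is a hypothetical helper name and not part of the GCC testsuite; the point is only that the null check has to come first, so that neither ferror nor fclose ever sees a null stream.

```cpp
#include <cstdio>

// Hypothetical helper: returns true when the stream was never opened,
// is in an error state, or fails to close.
bool close_failed(std::FILE* f) {
  // Short-circuit evaluation keeps ferror/fclose away from a null pointer.
  return !f || std::ferror(f) || std::fclose(f) != 0;
}
```

A caller would pass the result of fopen straight in, for example `close_failed(std::fopen(path, "r"))`, and a failed open is then reported instead of being dereferenced.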
[Bug tree-optimization/109893] New: [14 Regression] Missed Dead Code Elimination when using __builtin_unreachable since r14-160-gf828503eeb79ad1f1ada6db7deccc5abcc2f3ca3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109893 Bug ID: 109893 Summary: [14 Regression] Missed Dead Code Elimination when using __builtin_unreachable since r14-160-gf828503eeb79ad1f1ada6db7deccc5abcc2f3ca3 Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: theodort at inf dot ethz.ch Target Milestone: ---

void foo(void);
void bar(void);
static char a;
static int b, e, f;
static int *c = &b, *g;
int main() {
    int *j = 0;
    if (a) {
        g = 0;
        if (c)
            bar();
    } else {
        j = &e;
        c = 0;
    }
    if (c == &f == b || c == &e)
        ;
    else
        __builtin_unreachable();
    if (g || e) {
        if (j == &e || j == 0)
            ;
        else
            foo();
    }
    a = 4;
}

gcc -O3:

main:
        cmpb    $0, a(%rip)
        je      .L2
        xorl    %esi, %esi
        cmpq    $0, c(%rip)
        movq    %rsi, g(%rip)
        je      .L7
        pushq   %rdx
        call    bar
        movb    $4, a(%rip)
        xorl    %eax, %eax
        popq    %rcx
        ret
.L2:
        xorl    %eax, %eax
        movq    %rax, c(%rip)
.L7:
        movb    $4, a(%rip)
        xorl    %eax, %eax
        ret
c:
        .quad   b

gcc-trunk -O3

main:
        subq    $8, %rsp
        cmpb    $0, a(%rip)
        je      .L2
        xorl    %edx, %edx
        cmpq    $0, c(%rip)
        movq    %rdx, g(%rip)
        je      .L6
        call    bar
        xorl    %eax, %eax
.L4:
        cmpq    $0, g(%rip)
        je      .L9
.L6:
        movb    $4, a(%rip)
        xorl    %eax, %eax
        addq    $8, %rsp
        ret
.L2:
        xorl    %eax, %eax
        movq    %rax, c(%rip)
        movl    $e, %eax
        jmp     .L4
.L9:
        cmpl    $0, e(%rip)
        je      .L6
        testq   %rax, %rax
        je      .L6
        cmpq    $e, %rax
        je      .L6
        call    foo
        jmp     .L6
c:
        .quad   b

Bisects to: https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=f828503eeb79ad1f1ada6db7deccc5abcc2f3ca3
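For readers less familiar with the idiom in the testcase, here is a much smaller sketch of the same `if (cond) ; else __builtin_unreachable();` shape. `describe_kind` is a hypothetical function, unrelated to the reduced testcase; the comments describe what an optimizer is allowed to assume, not what any particular GCC version actually emits.

```cpp
// Hypothetical example: the caller promises kind is 0 or 1.
int describe_kind(int kind) {
  if (kind == 0 || kind == 1)
    ;                         // the only cases that can occur
  else
    __builtin_unreachable();  // GCC/Clang builtin: reaching this is undefined

  if (kind > 1)
    return -1;                // dead under the assumption above; may be removed

  return kind + 10;
}
```

Calling it with any value other than 0 or 1 is undefined behaviour, which is exactly what lets the compiler drop the `kind > 1` branch.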
[Bug tree-optimization/109893] [14 Regression] Missed Dead Code Elimination when using __builtin_unreachable since r14-160-gf828503eeb79ad1f1ada6db7deccc5abcc2f3ca3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109893 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Keywords||missed-optimization Last reconfirmed||2023-05-17 Status|UNCONFIRMED |NEW Target Milestone|--- |14.0 --- Comment #1 from Andrew Pinski --- Confirmed. A minor regression I suspect.
[Bug tree-optimization/109892] SLP failure with explicit fma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109892 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-05-17 Status|UNCONFIRMED |NEW Severity|normal |enhancement --- Comment #1 from Andrew Pinski --- Confirmed. I notice that clang/LLVM does not vectorize the __builtin_fma either. I also noticed that for aarch64, GCC does not use faddp for the final reduction (I saw there was a patch submitted for that in 2021, but it had not been updated for the comments on it ...).
[Bug target/109885] gcc does not generate movmskps and testps instructions (clang does)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109885 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-05-17 Ever confirmed|0 |1 --- Comment #2 from Andrew Pinski --- Confirmed.
[Bug middle-end/90663] [10/11/12/13/14 Regression] strcmp (&a[i], a + i) not folded for arrays and constant index
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90663 Andrew Pinski changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot gnu.org --- Comment #11 from Andrew Pinski ---
(In reply to Andrew Pinski from comment #10)
> Created attachment 55097 [details]
> Patch which I am testing

Hmm, one failure:
+FAIL: c-c++-common/Wrestrict.c -Wc++-compat memcpy (test for warnings, line 120)

GCC's code says:
/* Avoid diagnosing exact overlap in calls to __builtin_memcpy.
   It's safe and may even be emitted by GCC itself (see bug 32667). */

Reduced testcase for the missing warning:
```
/* PR 35503 - Warn about restricted pointers
   { dg-do compile }
   { dg-options "-O2 -Wrestrict -ftrack-macro-expansion=0" } */

void sink (void*, ...);

/* Exercise memcpy with constant or known arguments. */

void test_memcpy_cst (void *d, const void *s)
{
  struct {
    char a[7];
    char b[7];
    char c[7];
  } x;
  sink (&x);
  d = x.a + 7;
  s = x.b;
  __builtin_memcpy (d, s, 3); /* { dg-warning "\\\[-Wrestrict" "memcpy" } */
  sink (&x);
}
```

I am no longer working on this because I am not 100% sure if we want to still warn here or not ...
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #7 from Jonathan Wakely ---
When the function returns the iterator's destructor should detach itself from the sequence's list of iterators, so that it doesn't outlive the stack frame containing the iterator.

The commit that caused the regression included this change:

        _GLIBCXX_DEBUG_VERIFY(this->_M_incrementable(),
                              _M_message(__msg_bad_inc)
                              ._M_iterator(*this, "this"));
-       _Safe_iterator __ret = *this;
+       _Safe_iterator __ret(*this, _Unchecked());
        ++*this;
        return __ret;
       }

Maybe this affects how/when the __ret object gets destroyed, so it fails to detach itself.
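The attach/detach invariant described in the first sentence can be modelled with a toy registry, as in the sketch below. Every name here (`SequenceSketch`, `IteratorSketch`, `UncheckedTag`, `attached_after_scope`) is hypothetical and the code is not how libstdc++'s debug mode is implemented; it only shows that, whether or not copies are elided on return, every constructed object must attach once and detach once.

```cpp
#include <cstddef>
#include <set>

// Toy stand-in for a sequence that tracks the iterators pointing into it.
struct SequenceSketch {
  std::set<const void*> attached;
};

struct UncheckedTag {};  // hypothetical analogue of a "skip checks" tag

class IteratorSketch {
 public:
  explicit IteratorSketch(SequenceSketch& seq) : seq_(&seq) { attach(); }
  IteratorSketch(const IteratorSketch& other) : seq_(other.seq_) { attach(); }
  // Tagged copy: in this sketch it still attaches, it just models "no checks".
  IteratorSketch(const IteratorSketch& other, UncheckedTag) : seq_(other.seq_) { attach(); }
  IteratorSketch& operator=(const IteratorSketch&) = delete;
  // Detach on destruction so the sequence never holds a dangling address.
  ~IteratorSketch() { seq_->attached.erase(this); }

  // Same shape as the post-increment in the quoted diff.
  IteratorSketch post_increment_like() {
    IteratorSketch ret(*this, UncheckedTag{});
    // (advancing *this would happen here)
    return ret;  // may be elided or copied depending on the language mode
  }

 private:
  void attach() { seq_->attached.insert(this); }
  SequenceSketch* seq_;
};

// Number of iterators still registered after all of them went out of scope.
std::size_t attached_after_scope() {
  SequenceSketch seq;
  {
    IteratorSketch it(seq);
    IteratorSketch old = it.post_increment_like();
    (void)old;
  }
  return seq.attached.size();
}
```

If any constructor path skipped `attach()` while the destructor still ran, or the reverse, the registry would end up out of sync with the live objects, which is the kind of mismatch the comment is speculating about.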
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #8 from Jonathan Wakely --- With -std=c++14 there's no crash, with -std=c++17, so that confirms it's something related to copy elision.
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #9 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #8) > With -std=c++14 there's no crash, with -std=c++17, Should have said "only with -std=c++17" (and later, of course).
[Bug libstdc++/109889] [13/14 Regression] Segfault in __run_exit_handlers since r13-5309-gc3c6c307792026
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109889 --- Comment #10 from Jonathan Wakely --- (In reply to Jonathan Wakely from comment #9) > Should have said "only with -std=c++17" (and later, of course). Actually, that's wrong, *only* with C++17, not earlier *or* later. So the further changes to elision rules after C++17 changed the behaviour again.
[Bug modula2/109894] New: WriteInt in the ISO libraries should not emit the '+' when writing positive values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109894 Bug ID: 109894 Summary: WriteInt in the ISO libraries should not emit the '+' when writing positive values Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: modula2 Assignee: gaius at gcc dot gnu.org Reporter: gaius at gcc dot gnu.org Target Milestone: --- As reported on the gm2 mailing list. WriteInt in the ISO libraries should not emit the '+' when writing positive values.
[Bug modula2/109894] WriteInt in the ISO libraries should not emit the '+' when writing positive values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109894 Gaius Mulley changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-05-17 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Gaius Mulley --- Confirmed.
[Bug modula2/109894] WriteInt in the ISO libraries should not emit the '+' when writing positive values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109894 Gaius Mulley changed: What|Removed |Added CC||gaius at gcc dot gnu.org --- Comment #2 from Gaius Mulley --- Created attachment 55104 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55104&action=edit Proposed fix Proposed fix for WriteInt in m2iso.
[Bug modula2/109894] WriteInt in the ISO libraries should not emit the '+' when writing positive values
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109894 Gaius Mulley changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #3 from Gaius Mulley --- Closing now that the patch has been applied.
[Bug fortran/109865] different results when routine moved inside the contains statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865 GARY.WHITE at ColoState dot edu changed: What|Removed |Added Resolution|--- |FIXED Status|WAITING |RESOLVED --- Comment #16 from GARY.WHITE at ColoState dot edu --- I resolved the issue. The parameter ir was declared intent(out) in subroutine mc11ad, but there was a check in an if statement to see if ir == 0, meaning ir was expected to be defined on input. This check followed code that set ir when n == 1, and that code was never executed in the runs that did not produce correct answers. Anyway, changing intent(out) to intent(in out) resolved the -O3 optimization issue and the code works as expected. I guess it's too much to expect that the compiler would detect that a parameter was actually being accessed before being set if the parameter is declared intent(out) only.
[Bug libstdc++/109891] Null pointer special handling in ostream's operator << for C-strings
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109891 --- Comment #5 from Jonathan Wakely ---
(In reply to Michel Morin from comment #3)
> From the safety point of view, I agree with you. But, at the same time, I
> thought that detectable UB (with the help of sanitizers) is useful than
> silent bug.

Detectable UB doesn't guarantee detection. Sanitizers are not suitable for production code. Introducing UB here would be strictly less safe, full stop. And the bug isn't silent, it makes the stream unusable.

> How about `throw`ing as in std::string's constructor?

Set the exception flag on the stream and you get an exception when badbit is set.
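The last suggestion can be tried out with standard facilities only, as in the sketch below. `badbit_throws_sketch` is a hypothetical name. To stay clear of the precondition violation itself, the sketch calls `setstate(badbit)` directly as a stand-in for what the quoted libstdc++ operator does on a null pointer; other standard libraries may behave differently for the null case.

```cpp
#include <ios>
#include <sstream>

// Returns true if setting badbit throws once badbit is in the exception mask.
bool badbit_throws_sketch() {
  std::ostringstream os;
  os.exceptions(std::ios_base::badbit);  // ask for an exception on badbit
  try {
    // Stand-in for `os << (const char*)nullptr` under libstdc++.
    os.setstate(std::ios_base::badbit);
  } catch (const std::ios_base::failure&) {
    return true;
  }
  return false;
}
```

Without the call to `exceptions()`, the same `setstate` only marks the stream as bad and later insertions fail quietly.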
[Bug fortran/109865] different results when routine moved inside the contains statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865 Andrew Pinski changed: What|Removed |Added Resolution|FIXED |INVALID --- Comment #17 from Andrew Pinski --- Since there is no GCC bug, changing the issue status to invalid.