[Bug c/78352] GCC lacks support for the Apple "blocks" extension to the C family of languages
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78352

--- Comment #14 from Fabian Groffen ---
(In reply to Eric Gallager from comment #13)
> If we could get in touch with an actual lawyer to review which laws
> specifically are getting in the way here, that could be helpful. I won my
> election to the New Hampshire State Legislature so if there's any
> legislation I could pass to make it legal to apply those patches here in NH,
> I'd love to know how to write it.

FWIW: if Iain wrote a new patch, then we don't need Apple's original work which from my experience, frankly is messy. There's lots of stuff in there intertwined, so going by a specification e.g. Clang's (https://clang.llvm.org/docs/BlockLanguageSpec.html) is probably the best way forward in any case.
[Bug c/78352] GCC lacks support for the Apple "blocks" extension to the C family of languages
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78352

--- Comment #15 from Iain Sandoe ---
(In reply to Fabian Groffen from comment #14)
> (In reply to Eric Gallager from comment #13)
> > If we could get in touch with an actual lawyer to review which laws
> > specifically are getting in the way here,

I would expect that the determination has been made by the FSF lawyers (but I am not an authority here, just repeating the policy put to me when I started work on the Darwin port, years ago).

> > that could be helpful. I won my
> > election to the New Hampshire State Legislature

congrats!

> > so if there's any
> > legislation I could pass to make it legal to apply those patches here in NH,
> > I'd love to know how to write it.

IMO the technical issues with reusing 4.2.1 code are so significant that it would be a poor use of your time chasing a way to include stuff that we'd need to rewrite anyway (see below)

> FWIW: if Iain wrote a new patch, then we don't need Apple's original work
> which from my experience, frankly is messy.

Indeed, it isn't suitable for the current source base - there have been a lot of changes since 4.2.1. As a secondary consideration, I also want to move Objective-C style metadata generation until after LTO has run (and Apple blocks also makes use of that style meta-data).

> There's lots of stuff in there
> intertwined, so going by a specification e.g. Clang's
> (https://clang.llvm.org/docs/BlockLanguageSpec.html) is probably the best
> way forward in any case.

Which is what I was doing + 1:1 comparison with clang's output (on the grounds that the ABI is defined by the actual output regardless of what the documentation says ;) )

Sorry that there hasn't been much progress on this - it *was* top of my GCC11 TODO list, and then Apple Si. came along and torpedoed that...
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #14 from Thomas Koenig --- Created attachment 49520 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49520&action=edit Numbers a, b so that 2^b ≡ 1 mod a up to b=64, larger b taken if several solutions exist, plus the multiplicative inverse for 2^128 I've added the multiplicative inverse to the table, calculated with maxima by inv_mod(x,2^128). Output is in hex, to make it easier to break down into two numbers. Is there any more info that I could provide?
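For readers without maxima at hand, the inverse modulo 2^128 can also be computed with a Newton (Hensel) iteration. The following is only a minimal sketch, not the code behind the attachment: the name inv_mod_2_128 is made up for illustration, and it assumes a compiler that provides __uint128_t (GCC or Clang on a 64-bit target).

```c
#include <assert.h>

typedef __uint128_t u128;

/* Hypothetical helper: multiplicative inverse of an odd a modulo 2^128.
   If a * x == 1 + e, then a * x * (2 - a * x) == 1 - e * e, so every
   Newton step doubles the number of correct low bits.  The start value
   x = a is already correct to 3 bits because a * a == 1 (mod 8) for
   every odd a.  All arithmetic wraps modulo 2^128.  */
static u128
inv_mod_2_128 (u128 a)
{
  assert (a & 1);               /* Even numbers have no inverse mod 2^128.  */
  u128 x = a;                   /* 3 correct bits.  */
  for (int i = 0; i < 6; i++)   /* 3 -> 6 -> 12 -> 24 -> 48 -> 96 -> 192.  */
    x *= 2 - a * x;
  return x;
}
```

For a = 13 this yields 0xC4EC4EC4EC4EC4EC4EC4EC4EC4EC4EC5, the value that appears as a comment in the div_rem_13 example in comment #16.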
[Bug testsuite/97680] [11 Regression] new test case c-c++-common/zero-scratch-regs-10.c in r11-4578 has excess errors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97680

--- Comment #6 from Iain Sandoe ---
(In reply to Iain Sandoe from comment #5)
> I added xfail-if for powerpc-darwin (8,9, 10 and 11).
>
> https://gcc.gnu.org/pipermail/gcc-cvs/2020-November/336720.html
>
> Since i don't think I will have time this cycle to implement it (there are
> much more pressing demands on the time) - at least the tests will then XPASS
> if/when the impl. is done.

Unfortunately, that's not enough; the XFAIL only covers the run and we have to skip the tests completely to avoid testsuite output noise (which I've done for powerpc-darwin).
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #15 from Jakub Jelinek --- I plan to work on this early in stage3. And we really shouldn't use any tables, GCC should figure it all out. So, for double-word modulo by constant that would be expanded using a libcall, go for x from the word bitsize to double-word bitsize and check if (1max << x) % cst is 1 (and prefer what we've agreed on for 3), and fall back to multiplications (see #c8) if there aren't any other options and the costs don't say it is too costly.
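The search described above can be sketched in plain C. This is an illustration only: the real expansion would work on GCC's internal wide-integer types, and the helper name find_unit_power and its interface are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the first x in [lo, hi) for which
   2^x mod cst == 1, or -1 if no such x exists in that range.
   The scan runs upward from 0 so that the running power stays exact.  */
static int
find_unit_power (uint64_t cst, int lo, int hi)
{
  assert (cst > 1 && cst < (UINT64_C (1) << 63) && lo >= 0);
  uint64_t p = 1;               /* Invariant: p == 2^x mod cst.  */
  for (int x = 0; x < hi; x++)
    {
      if (x >= lo && p == 1)
        return x;
      p = (p * 2) % cst;        /* No overflow: p < cst < 2^63.  */
    }
  return -1;
}
```

With a 64-bit word, find_unit_power (3, 64, 128) returns 64 and find_unit_power (13, 64, 128) returns 72, since the powers of two repeat modulo 13 with period 12. Even constants never qualify, so the function returns -1 for them.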
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589

--- Comment #3 from Thomas Koenig ---
Simplified test case:

program main
  type foo
    real, allocatable, dimension(:) :: a[:]
  end type foo
  type (foo) :: x
  sync all
  allocate (x%a(10)[*])
end program main
[Bug c++/97755] New: Explicit default constructor is called during copy-list-initialization with a warning only
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97755

            Bug ID: 97755
           Summary: Explicit default constructor is called during copy-list-initialization with a warning only
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: egor_suvorov at mail dot ru
  Target Milestone: ---

Consider the following test case: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/testsuite/g%2B%2B.dg/cpp0x/initlist40.C

// PR c++/54835, DR 1518
// { dg-do compile { target c++11 } }

struct A
{
  explicit A(int = 42);
};

int main()
{
  A a1 = { };   // { dg-error "explicit" }
  A a2 = { 24 };// { dg-error "explicit" }
}

GCC fails to compile it, but the line with 'a1' emits only a warning: "converting to 'A' from initializer list would use explicit constructor 'A::A(int)'". Hence, if I comment out the line with 'a2', compilation succeeds.

However, if I modify the test case slightly:

struct A
{
  explicit A();
  explicit A(int);
};

int main()
{
  A a1 = { };   // { dg-error "explicit" }
  A a2 = { 24 };// { dg-error "explicit" }
}

Both messages become errors.

I believe it's a regression between GCC 5 (correctly fails both test cases) and GCC 6 (emits warning instead of error): https://godbolt.org/z/1o81h1

Looks like the change was brought by this commit: https://gcc.gnu.org/git/?p=gcc.git&a=commit;h=e7838ec9d2ea06e844ef23660862781b81a26329 from this bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54835

I'm suspicious that the code says "When converting from an init list we consider explicit constructors, but actually trying to call one is an error.", but then proceeds to call `pedwarn` instead of `error` in some cases.
[Bug c++/51242] [C++11] Unable to use strongly typed enums as bit fields
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51242

Barry Revzin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |barry.revzin at gmail dot com

--- Comment #31 from Barry Revzin ---
Apparently this was fixed in 9.3?

enum class Color { Red, Green, Blue };
struct X { Color c : 2; };
auto x = X{.c=Color::Red};

warns on 9.2, but not anymore on 9.3 or 10.
[Bug c++/97755] Explicit default constructor is called during copy-list-initialization with a warning only
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97755

Harald van Dijk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |harald at gigawatt dot nl

--- Comment #1 from Harald van Dijk ---
This may be in order to ensure that the following valid C++03 code is accepted in C++11 mode as well, to limit the impact when the default language version was changed:

struct A {
  explicit A(int = 24);
};

int main() {
  A a[1] = {};
}

This did not get diagnosed in GCC 5 in any mode. GCC 6 accepts it without a warning in C++03 mode, and accepts it with a warning in C++11 mode.
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #16 from Thomas Koenig ---
(In reply to Jakub Jelinek from comment #15)
> I plan to work on this early in stage3.
> And we really shouldn't use any tables, GCC should figure it all out.
> So, for double-word modulo by constant that would be expanded using a
> libcall, go for x from the word bitsize to double-word bitsize and check if
> (1max << x) % cst
> is 1

It's probably better to search from high to low, to reduce the number of necessary shifts for division by constants like 9 or 13.

> (and prefer what we've agreed on for 3), and fall back to
> multiplications (see #c8) if there aren't any other options and the costs
> don't say it is too costly.

I think for variants where the constants aren't power of two,

#define ONE ((__uint128_t) 1)
#define TWO_64 (ONE << 64)
#define MASK60 ((1ul << 60) - 1)

void
div_rem_13 (mytype n, mytype *div, unsigned int *rem)
{
  const mytype magic = TWO_64 * 14189803133622732012u + 5675921253449092805u * ONE;
  /* 0xC4EC4EC4EC4EC4EC4EC4EC4EC4EC4EC5 */
  __uint64_t a, b, c;
  unsigned int r;

  a = n & MASK60;
  b = (n >> 60);
  b = b & MASK60;
  c = (n >> 120);
  r = (a+b+c) % 13;
  n = n - r;
  *div = n * magic;
  *rem = r;
}

should be pretty efficient; there is only one shift which spans two words. (The assembly generated from the function looks weird because of quite a few move instructions, but that should not be an issue for code generated inline).

Regarding the approach in comment #8, I think I'll run some benchmarks to see how well that works for other constants which don't fit the pattern of being divisors for 2^n-1.
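The function above relies on two facts: 2^60 ≡ 1 (mod 13), so the remainder of n equals the remainder of the sum of its 60-bit chunks, and n - r is an exact multiple of 13, so multiplying by the inverse of 13 modulo 2^128 yields the quotient. The following is a small check of both against the built-in operators, offered as a sketch only; the names u128, rem13_chunks and agrees_with_builtin are made up for illustration, and __uint128_t support is assumed.

```c
#include <assert.h>
#include <stdint.h>

typedef __uint128_t u128;

/* n mod 13 by summing 60-bit chunks; valid because 2^60 == 1 (mod 13).  */
static unsigned int
rem13_chunks (u128 n)
{
  const uint64_t mask60 = (UINT64_C (1) << 60) - 1;
  uint64_t a = (uint64_t) n & mask60;
  uint64_t b = (uint64_t) (n >> 60) & mask60;
  uint64_t c = (uint64_t) (n >> 120);
  return (unsigned int) ((a + b + c) % 13);   /* Sum < 2^62, no overflow.  */
}

/* Return 1 if the chunk/inverse method agrees with n % 13 and n / 13.  */
static int
agrees_with_builtin (u128 n)
{
  /* Inverse of 13 modulo 2^128, the constant quoted in the comment above.  */
  const u128 inv13 = ((u128) UINT64_C (0xC4EC4EC4EC4EC4EC) << 64)
                     | UINT64_C (0x4EC4EC4EC4EC4EC5);
  unsigned int r = rem13_chunks (n);
  assert (r < 13);
  u128 q = (n - r) * inv13;     /* Exact division: n - r is a multiple of 13.  */
  return r == (unsigned int) (n % 13) && q == n / 13;
}
```

Calling agrees_with_builtin on boundary values such as 0, 12, 13, 2^60, 2^120 and 2^128 - 1 is a quick way to convince oneself that the chunk boundaries and the magic constant are right.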
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459

--- Comment #17 from Thomas Koenig ---
To be compilable, my previous code lacks

typedef __uint128_t mytype;

> #define ONE ((__uint128_t) 1)
[Bug rtl-optimization/97756] New: Inefficient handling of 128-bit arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756

            Bug ID: 97756
           Summary: Inefficient handling of 128-bit arguments
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

This is an offshoot from PR 97459.

The code

#define ONE ((__uint128_t) 1)
#define TWO_64 (ONE << 64)
#define MASK60 ((1ul << 60) - 1)

typedef __uint128_t mytype;

void
div_rem_13_v2 (mytype n, mytype *div, unsigned int *rem)
{
  const mytype magic = TWO_64 * 14189803133622732012u + 5675921253449092805u * ONE;
  unsigned long a, b, c;
  unsigned int r;

  a = n & MASK60;
  b = (n >> 60);
  b = b & MASK60;
  c = (n >> 120);
  r = (a+b+c) % 13;
  n = n - r;
  *div = n * magic;
  *rem = r;
}

when compiled on x86_64 on Zen with -O3 -march=native has quite some register shuffling at the beginning:

   0:   49 89 f0                mov    %rsi,%r8
   3:   48 89 fe                mov    %rdi,%rsi
   6:   49 89 d1                mov    %rdx,%r9
   9:   48 ba ff ff ff ff ff    movabs $0xfffffffffffffff,%rdx
  10:   ff ff 0f
  13:   4c 89 c7                mov    %r8,%rdi
  16:   48 89 f0                mov    %rsi,%rax
  19:   49 89 c8                mov    %rcx,%r8
  1c:   48 89 f1                mov    %rsi,%rcx
  1f:   49 89 fa                mov    %rdi,%r10
  22:   48 0f ac f8 3c          shrd   $0x3c,%rdi,%rax
  27:   48 21 d1                and    %rdx,%rcx
  2a:   41 56                   push   %r14
  2c:   49 c1 ea 38             shr    $0x38,%r10
  30:   48 21 d0                and    %rdx,%rax
  33:   53                      push   %rbx
  34:   48 bb c5 4e ec c4 4e    movabs $0x4ec4ec4ec4ec4ec5,%rbx
  3b:   ec c4 4e
  3e:   4c 01 d1                add    %r10,%rcx
  41:   45 31 db                xor    %r11d,%r11d
  44:   48 01 c1                add    %rax,%rcx
  47:   48 89 c8                mov    %rcx,%rax
  4a:   48 f7 e3                mul    %rbx
  4d:   48 c1 ea 02             shr    $0x2,%rdx
  51:   48 8d 04 52             lea    (%rdx,%rdx,2),%rax
  55:   48 8d 04 82             lea    (%rdx,%rax,4),%rax
  59:   48 89 ca                mov    %rcx,%rdx
  5c:   48 b9 ec c4 4e ec c4    movabs $0xc4ec4ec4ec4ec4ec,%rcx
  63:   4e ec c4
  66:   48 29 c2                sub    %rax,%rdx
  69:   48 29 d6                sub    %rdx,%rsi
  6c:   49 89 d6                mov    %rdx,%r14
  6f:   4c 19 df                sbb    %r11,%rdi
  72:   48 0f af ce             imul   %rsi,%rcx
  76:   48 89 f2                mov    %rsi,%rdx
  79:   48 89 f8                mov    %rdi,%rax
  7c:   c4 e2 cb f6 fb          mulx   %rbx,%rsi,%rdi
  81:   48 0f af c3             imul   %rbx,%rax
  85:   49 89 31                mov    %rsi,(%r9)
  88:   48 01 c8                add    %rcx,%rax
  8b:   48 01 c7                add    %rax,%rdi
  8e:   49 89 79 08             mov    %rdi,0x8(%r9)
  92:   45 89 30                mov    %r14d,(%r8)
  95:   5b                      pop    %rbx
  96:   41 5e                   pop    %r14
  98:   c3                      retq
[Bug tree-optimization/97757] New: [11 Regression] fortran save_6.f90 fails with a segv for -flto -O >= 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97757

            Bug ID: 97757
           Summary: [11 Regression] fortran save_6.f90 fails with a segv for -flto -O >= 2
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: iains at gcc dot gnu.org
  Target Milestone: ---

most likely in the range r11-4777 and r11-4781.

It doesn't seem to reproduce on Linux - but it shows on a stage#1 built with debug - so probably will show on a darwin cross.

gcc/f951 /src-local/gcc-master/gcc/testsuite/gfortran.dg/save_6.f90 -fPIC -quiet -dumpdir a- -dumpbase save_6.f90 -dumpbase-ext .f90 -mmacosx-version-min=10.12.0 -mtune=core2 -O2 -version -fno-automatic -flto -fintrinsic-modules-path finclude -o a-save_6.s

looks like a GGC issue

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
during IPA pass: modref
/src-local/gcc-master/gcc/testsuite/gfortran.dg/save_6.f90:54:3: internal compiler error: Segmentation fault: 11
   54 | end
      | ^
0x1017d33c5 crash_signal
        /src-local/gcc-master/gcc/toplev.c:330
0x1012e52d6 modref_tree::merge(modref_tree*, vec*)
        /src-local/gcc-master/gcc/ipa-modref-tree.h:420
0x1012e35b9 modref_propagate_in_scc
        /src-local/gcc-master/gcc/ipa-modref.c:2440
0x1012e3ac9 execute
        /src-local/gcc-master/gcc/ipa-modref.c:2549

=

Process 49712 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x0001012e52d6 f951`modref_tree::merge(this=0xa5a5a5a5a5a5a5a5, other=0x0001469008c0, parm_map=0x7fff5fbff330) at ipa-modref-tree.h:420
   417       Return true if something has changed.  */
   418    bool merge (modref_tree *other, vec *parm_map)
   419    {
-> 420      if (!other || every_base)
   421        return false;
   422      if (other->every_base)
   423        {
Target 0: (f951) stopped.
[Bug tree-optimization/97757] [11 Regression] fortran save_6.f90 fails with a segv for -flto -O >= 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97757

Iain Sandoe changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-11-08
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org
             Target|                            |*-*-darwin*
           Keywords|                            |ice-on-valid-code
[Bug libstdc++/97758] New: bits/std_function.h: error: unknown type name 'type_info' when using -fno-exceptions -fno-rtti
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97758 Bug ID: 97758 Summary: bits/std_function.h: error: unknown type name 'type_info' when using -fno-exceptions -fno-rtti Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: romain.geissler at amadeus dot com Target Milestone: --- Hi, I am using the trunk from today (8th november, git revision b642fca1c31b2e2175e0860daf32b4ee0d918085). When trying to build clang with it I end up with this error (on Linux x86_64): FAILED: lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o /workdir/build/final-system/llvm-build/./bin/clang++ -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/CodeGen -I/workdir/src/llvm-12.0.0/llvm/lib/CodeGen -Iinclude -I/workdir/src/llvm-12.0.0/llvm/include -isystem /workdir/build/final-system/llvm-temporary-static-dependencies/install/include -O2 -I/workdir/build/final-system/llvm-temporary-static-dependencies/install/include -I/workdir/build/final-system/llvm-temporary-static-dependencies/install/include/ncursesw -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -fdiagnostics-color -ffunction-sections -fdata-sections -fprofile-instr-generate="/workdir/build/final-system/llvm-build/tools/clang/stage2-instrumented-bins/profiles/%4m.profraw" -flto -O3 -DNDEBUG-fno-exceptions -fno-rtti -std=c++14 -MD -MT lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o -MF lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o.d -o lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/ParallelCG.cpp.o -c /workdir/src/llvm-12.0.0/llvm/lib/CodeGen/ParallelCG.cpp In file 
included from /workdir/src/llvm-12.0.0/llvm/lib/CodeGen/ParallelCG.cpp:13: In file included from /workdir/src/llvm-12.0.0/llvm/include/llvm/CodeGen/ParallelCG.h:17: In file included from /opt/1A/toolchain/x86_64-v21.0.10/lib64/gcc/x86_64-1a-linux-gnu/11.0.0/../../../../include/c++/11.0.0/functional:59: /opt/1A/toolchain/x86_64-v21.0.10/lib64/gcc/x86_64-1a-linux-gnu/11.0.0/../../../../include/c++/11.0.0/bits/std_function.h:190:31: error: unknown type name 'type_info' __dest._M_access() = nullptr; ^ 1 error generated. Note that apparently these llvm files are compiled with -fno-exceptions -fno-rtti, so it seems triggered by the recent changes around std::function without rtti support. Cheers, Romain
[Bug libstdc++/97759] New: Could std::has_single_bit implementation be faster?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759

            Bug ID: 97759
           Summary: Could std::has_single_bit implementation be faster?
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc-bugs at marehr dot dialup.fu-berlin.de
  Target Milestone: ---

Hello gcc-team,

we are thrilled that C++20 offers some efficient bit implementations and that we could exchange some of our own implementations for the standardized ones, making the code more accessible.

I replaced our implementation and noticed that `std::has_single_bit` was slower than what we had before by around 30%. (The other functions matched our timings.)

Additionally, we have a (micro-)benchmark that compares the standard arithmetic bit trick (https://graphics.stanford.edu/~seander/bithacks.html#DetermineIfPowerOf2) with the implementation where popcount == 1. We decided to use the arithmetic version, because we measured that it was faster than popcount on our machines (mostly intel processors).

Interestingly, it seems that the popcount benchmark matches the std::has_single_bit time-wise, so I guess that std::has_single_bit is implemented via popcount.

Those timings could be reproduced at an unknown location https://quick-bench.com/q/Y28keu_mSh25WwhO05T4SKrbHpk

I don't know how to fix this, but I would expect that the optimizer would recognize popcount == 1 and know that there is a more efficient version. Or change the implementation to arithmetic, where again the optimizer could decide to replace that by a popcount if that is more efficient on some architecture?

Thank you!
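For reference, the two formulations being compared can be written side by side. This is a sketch in C with made-up function names; it is not the libstdc++ implementation (the report only guesses that std::has_single_bit uses popcount), and __builtin_popcountll is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic test: x & (x - 1) clears the lowest set bit, so the result
   is zero exactly when at most one bit is set; x != 0 excludes zero.  */
static int
single_bit_arith (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* Population-count test: exactly one bit is set.  */
static int
single_bit_popcount (uint64_t x)
{
  return __builtin_popcountll (x) == 1;
}

/* Check that both formulations agree on a few boundary values.  */
static void
single_bit_selfcheck (void)
{
  const uint64_t samples[] = { 0, 1, 2, 3, 4, 6, 8,
                               UINT64_C (1) << 63, UINT64_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (single_bit_arith (samples[i]) == single_bit_popcount (samples[i]));
}
```

Both functions compute the same predicate, so which one is faster depends only on how the target implements the population count; without a hardware popcount instruction enabled (for example, no -mpopcnt or suitable -march on x86_64), the builtin may expand to a longer sequence or a library call.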
[Bug tree-optimization/97760] New: GCC outputs wrong values when compiling the testcase with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97760

            Bug ID: 97760
           Summary: GCC outputs wrong values when compiling the testcase with -O3
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: yangyang305 at huawei dot com
  Target Milestone: ---

Hi, gcc-trunk outputs wrong values when compiling the attached testcase with -O3.

gcc -O0 test.c -w && ./a.out
159,150,150 150

gcc -O3 test.c -w && ./a.out
159,123,123 123

GCC version: 11.0.0 20201106 (experimental)
[Bug tree-optimization/97760] GCC outputs wrong values when compiling the testcase with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97760 --- Comment #1 from yangyang --- Created attachment 49521 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49521&action=edit testcase
[Bug rtl-optimization/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705

--- Comment #5 from CVS Commits ---
The master branch has been updated by Kewen Lin :

https://gcc.gnu.org/g:ce4ae1f4893e322495c5d24b2f0e807a7f7cf92f

commit r11-4827-gce4ae1f4893e322495c5d24b2f0e807a7f7cf92f
Author: Kewen Lin
Date:   Sun Nov 8 20:35:21 2020 -0600

    ira: Recompute regstat as max_regno changes [PR97705]

    As PR97705 shows, the commit r11-4637 caused some dumping comparison difference error on pass ira. It exposed one issue about the newly introduced function remove_scratches, which can increase the largest pseudo reg number if it succeeds, later some function will use the max_reg_num() to get the latest max_regno, when iterating the numbers we can access some data structures which are allocated as the previous max_regno, some out of array bound accesses can occur, the failure can be random since the values beyond the array could be random.

    This patch is to free/reinit/recompute the relevant data structures that is regstat_n_sets_and_refs and reg_info_p to ensure we won't access beyond some array bounds.

    Bootstrapped/regtested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8.

    gcc/ChangeLog:

            PR rtl-optimization/97705
            * ira.c (ira): Refactor some regstat free/init/compute invocation into lambda function regstat_recompute_for_max_regno, and call it when max_regno increases as remove_scratches succeeds.
[Bug libstdc++/97759] Could std::has_single_bit be faster?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759

Thomas Koenig changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement
                 CC|                            |tkoenig at gcc dot gnu.org

--- Comment #1 from Thomas Koenig ---
Could you post the benchmark and the exact architecture where the arithmetic version is faster?
[Bug rtl-optimization/97705] [11 regression] cc.c-torture/unsorted/dump-noaddr.c.*r.ira fails after r11-4637
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97705

Kewen Lin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Kewen Lin ---
Should be fixed with latest trunk r11-4827.
[Bug rtl-optimization/97756] Inefficient handling of 128-bit arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756 --- Comment #1 from Thomas Koenig --- Actually, it was on a Ryzen 1700 (for the -march=native). I'm at odds with architecture names...
[Bug c++/93008] Need a way to make inlining heuristics ignore whether a function is inline
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93008

--- Comment #6 from Jan Hubicka ---
I just noticed this PR and wonder if there is anything to do on the inliner side. It uses DECL_DECLARED_INLINE, which was invented to distinguish between implicit inlines and explicit ones. So even if it is a bit misnamed, it should mean "this is an inline hint for the inliner", so I guess the frontend needs to distinguish between constexpr and normal places where the inline hint still means "inline more"?

The inliner is really not at a level where it can completely ignore inline hints without regressing various code. I made inline weaker for -O2 in GCC 10, but for -O3 we still take it very seriously and I do not see a way out of that: in many cases it is very hard to predict how much optimization will happen after inlining, and a lot of code is carefully crafted under the assumption that some specific inlining happens (and a lot of such code is in C++).
[Bug tree-optimization/97761] New: [11 Regression] ICE in vectorizable_live_operation, at tree-vect-loop.c:8689
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97761 Bug ID: 97761 Summary: [11 Regression] ICE in vectorizable_live_operation, at tree-vect-loop.c:8689 Product: gcc Version: 11.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: asolokha at gmx dot com Target Milestone: --- Target: powerpc-*-linux-gnu Created attachment 49522 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49522&action=edit Testcase gfortran-11.0.0-alpha20201108 snapshot (g:b642fca1c31b2e2175e0860daf32b4ee0d918085) ICEs when compiling the attached testcase w/ -mvsx -O1 -ftree-slp-vectorize -fvect-cost-model=unlimited: % powerpc-e300c3-linux-gnu-gfortran-11.0.0 -mvsx -O1 -ftree-slp-vectorize -fvect-cost-model=unlimited -c ar6dubil.f90 during GIMPLE pass: slp ar6dubil.f90:11:15: 11 | subroutine ni (ps, bf) | ^ internal compiler error: in vectorizable_live_operation, at tree-vect-loop.c:8689 0x6f1f2c vectorizable_live_operation(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*, int, bool, vec*) /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-loop.c:8689 0x10b8087 can_vectorize_live_stmts /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-stmts.c:10510 0x10df928 vect_transform_stmt(vec_info*, _stmt_vec_info*, gimple_stmt_iterator*, _slp_tree*, _slp_instance*) /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-stmts.c:10894 0x726 vect_schedule_slp_node /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5437 0x111d0bc vect_schedule_scc /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5599 0x111ce2f vect_schedule_scc 
/var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5580 0x111ce2f vect_schedule_scc /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5580 0x111ce2f vect_schedule_scc /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5580 0x111d40c vect_schedule_slp(vec_info*, vec<_slp_instance*, va_heap, vl_ptr>) /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:5715 0x111ebba vect_slp_region /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:4264 0x111ebba vect_slp_bbs /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:4374 0x111fa9c vect_slp_function(function*) /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vect-slp.c:4460 0x112208b execute /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-11.0.0_alpha20201108/work/gcc-11-20201108/gcc/tree-vectorizer.c:1437
[Bug lto/80379] Redundant note: code may be misoptimized unless -fno-strict-aliasing is used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80379

--- Comment #3 from Jan Hubicka ---
The problem here is that the hint is output at decl merging, while -fno-strict-aliasing is a function-local flag. At that time we do not even know what functions there will be, since the units are not streamed in yet. This means that we do not know whether some unit has a function that is -fno-strict-aliasing. So suppressing the warning does not fit the implementation very easily :(