[Bug target/114130] [11/12/13/14 Regression] RISC-V: `__atomic_compare_exchange` does not use sign-extended value for RV64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114130 Richard Biener changed: What|Removed |Added Target Milestone|--- |11.5
[Bug c++/66487] sanitizer/warnings for lifetime DSE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66487 --- Comment #28 from Alexander Monakov --- The bug is about the issue of lacking diagnostics, it should be fine to make note of various approaches to remedy the problem in one bug report. (in any case, all discussion of the Valgrind-based approach happened on the gcc-patches mailing list, not here)
[Bug target/114134] [14 Regression] Extra mov instructions for simple function compared with GCC13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114134 Richard Biener changed: What|Removed |Added CC||sayle at gcc dot gnu.org Target|X86_64 |x86_64-*-* --- Comment #2 from Richard Biener --- I guess the testcase can be simplified to just show the return value handling issue.
[Bug tree-optimization/102435] gcc 9: aarch64 -ftree-loop-vectorize results in wrong code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102435 Andrew Pinski changed: What|Removed |Added Version|9.4.1 |9.3.0 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=97236 Status|UNCONFIRMED |RESOLVED Target Milestone|--- |9.4 Resolution|--- |FIXED --- Comment #2 from Andrew Pinski --- So this looks like another testcase for PR 97236 . duration = static_cast(first[1].dts_ - first->dts_); first->duration_ = duration; is getting incorrectly vectorized even though dts_ is only used here and not the rest. So closing as fixed.
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 102435, which changed state. Bug 102435 Summary: gcc 9: aarch64 -ftree-loop-vectorize results in wrong code https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102435 What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #40 from Lukas Grätz --- (In reply to Jakub Jelinek from comment #30) > (In reply to Lukas Grätz from comment #29) > > I believe this could and should somehow be fixed by adding DWARF info that > > certain callee-saved registers (= the function parameter values) were > > overwritten. The corrected backtrace could look something like this: > > That can be arranged by emitting those .cfi_undefined directives... > > > #2 0x004011d2 in baz (a=42, b=43, c=44, d=<optimized out>, > > e=<optimized out>, f=<optimized out>, g=48, h=49) at /tmp/1.c:38 > > ... but really will not help users to debug/fix their code. > It seems that the reason for <optimized out> is ultimately -Og, not this patch. See Bug 78685. When compiling and debugging your program with -O0 instead, there is not a single <optimized out>.
[Bug libquadmath/114140] different results for std::fmin/std::fmax and quadmath fminq/fmaxq if one argument=signaling_NaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114140 --- Comment #15 from Richard Biener --- It's the old argument on whether isnan(NaN) should return true or false with -ffinite-math-only. With what we currently do "constant folding" sNaN into NaN would be correct with -fno-signaling-nans, likewise constant folding Inf into 42.0 is "correct" for -ffinite-math-only. You are basically invoking undefined behavior when introducing sNaN into a program without using -fsignaling-nans.
[Bug c++/114128] ice with -fstrub=internal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114128 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-28 Status|UNCONFIRMED |WAITING Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- incomplete bugreport
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #41 from Jakub Jelinek --- (In reply to Lukas Grätz from comment #40) > It seems that the reason for <optimized out> is ultimately -Og, not this > patch. See Bug 78685. No. When PR78685 would be fixed by adding artificial hidden uses of variables at the end of their scopes, this bug would trigger far more often. The vars would be live across the calls, so if there would be callee-saved registers available, the compiler would use them to hold the variables across the calls. And this bug would break that. Anyway, I've posted https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html patch which will not revert the #c15/#c24 changes, but guard them with a non-default option. People who don't care about the harder debugging can use that option in their code, but widely used shared libraries with noreturn entrypoints will no longer screw up the debugging for all the packages that use them.
[Bug target/114143] Non-thumb arm32 code in thumb multilib for libgcc and in -mthumb build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114143 Christophe Lyon changed: What|Removed |Added CC||clyon at gcc dot gnu.org --- Comment #1 from Christophe Lyon --- What configure flags did you use? Only --target arm-eabi? What does gcc --print-multi-lib say?
[Bug tree-optimization/114041] wrong code with _BitInt() and -O -fgraphite-identity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 --- Comment #10 from Richard Biener --- (In reply to Jakub Jelinek from comment #9) > Created attachment 57554 [details] > gcc14-pr114041.patch > > stmt_simple_for_scop_p tests for INTEGRAL_TYPE_P (it used to test > INTEGER_TYPE some years ago), so I think we should do the same here too. Yes, I think the test in add_conditions_to_domain should be an assert, we can, at that point, not simply "ignore" any constraint (and while we technically can fail this function isn't set up for that).
[Bug middle-end/112938] ice with -fstrub=internal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112938 David Binderman changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #9 from David Binderman --- The reduced source code of comment 1 seems to compile ok, but the original attached source code doesn't.
[Bug c++/114128] ice with -fstrub=internal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114128 --- Comment #2 from David Binderman --- (In reply to Richard Biener from comment #1) > incomplete bugreport Sorry, my mistake. I created a new one, when an old one is a better place. See # 112938 for more details.
[Bug tree-optimization/112325] Missed vectorization of reduction after unrolling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325 --- Comment #15 from rguenther at suse dot de --- On Wed, 28 Feb 2024, liuhongt at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325 > > --- Comment #14 from Hongtao Liu --- > (In reply to rguent...@suse.de from comment #13) > > On Tue, 27 Feb 2024, liuhongt at gcc dot gnu.org wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325 > > > > > > --- Comment #11 from Hongtao Liu --- > > > > > > >Loop body is likely going to simplify further, this is difficult > > > >to guess, we just decrease the result by 1/3. */ > > > > > > > > > > This is introduced by r0-68074-g91a01f21abfe19 > > > > > > /* Estimate number of insns of completely unrolled loop. We assume > > > + that the size of the unrolled loop is decreased in the > > > + following way (the numbers of insns are based on what > > > + estimate_num_insns returns for appropriate statements): > > > + > > > + 1) exit condition gets removed (2 insns) > > > + 2) increment of the control variable gets removed (2 insns) > > > + 3) All remaining statements are likely to get simplified > > > + due to constant propagation. Hard to estimate; just > > > + as a heuristics we decrease the rest by 1/3. > > > + > > > + NINSNS is the number of insns in the loop before unrolling. > > > + NUNROLL is the number of times the loop is unrolled. */ > > > + > > > +static unsigned HOST_WIDE_INT > > > +estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns, > > > +unsigned HOST_WIDE_INT nunroll) > > > +{ > > > + HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3; > > > + if (unr_insns <= 0) > > > +unr_insns = 1; > > > + unr_insns *= (nunroll + 1); > > > + > > > + return unr_insns; > > > +} > > > > > > And r0-93444-g08f1af2ed022e0 try do it more accurately by marking > > > likely_eliminated stmt and minus that from total insns, But 2 / 3 is still > > > keeped. > > > > > > +/* Estimate number of insns of completely unrolled loop. 
> > > + It is (NUNROLL + 1) * size of loop body with taking into account > > > + the fact that in last copy everything after exit conditional > > > + is dead and that some instructions will be eliminated after > > > + peeling. > > > > > > - NINSNS is the number of insns in the loop before unrolling. > > > - NUNROLL is the number of times the loop is unrolled. */ > > > + Loop body is likely going to simplify futher, this is difficult > > > + to guess, we just decrease the result by 1/3. */ > > > > > > static unsigned HOST_WIDE_INT > > > -estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns, > > > +estimated_unrolled_size (struct loop_size *size, > > > unsigned HOST_WIDE_INT nunroll) > > > { > > > - HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3; > > > + HOST_WIDE_INT unr_insns = ((nunroll) > > > +* (HOST_WIDE_INT) (size->overall > > > + - > > > size->eliminated_by_peeling)); > > > + if (!nunroll) > > > +unr_insns = 0; > > > + unr_insns += size->last_iteration - > > > size->last_iteration_eliminated_by_peeling; > > > + > > > + unr_insns = unr_insns * 2 / 3; > > >if (unr_insns <= 0) > > > unr_insns = 1; > > > - unr_insns *= (nunroll + 1); > > > > > > It looks to me 1 / 3 overestimates the instructions that can be optimised > > > away, > > > especially if we've subtracted eliminated_by_peeling > > > > Yes, that 1/3 reduction is a bit odd - you could have the same effect > > by increasing the instruction limit by 1/3, but that means it doesn't > > really matter, does it? It would be interesting to see if increasing > > the limit by 1/3 and removing the above is neutral on SPEC? > > Remove 1/3 reduction get ~2% improvement for 525.x264_r on SPR with > -march=native -O3, no big impact on other integer benchmark. 454.calculix was always the benchmark to cross check as that benefits from much unrolling. I'm all for removing the 1/3 for innermost loop handling (in cunroll the unrolled loop is then innermost). 
I'm more concerned about unrolling more than one level which is exactly what's required for 454.calculix.
[Bug tree-optimization/112325] Missed vectorization of reduction after unrolling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325 --- Comment #16 from Hongtao Liu --- > I'm all for removing the 1/3 for innermost loop handling (in cunroll > the unrolled loop is then innermost). I'm more concerned about > unrolling more than one level which is exactly what's required for > 454.calculix. Removing 1/3 for the innermost loop would be sufficient to solve both the issue in the PR and x264_pixel_var_8x8 from 525.x264_r. I'll try to benchmark that.
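To make the effect of the 2/3 factor discussed in this thread concrete, here is a minimal standalone sketch of the arithmetic in the quoted estimated_unrolled_size. It is not GCC code: LoopSize is a hypothetical stand-in for the four struct loop_size fields the quoted diff reads, the scale_by_two_thirds switch is invented for comparison, and the instruction counts are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the `struct loop_size` fields used in the quoted diff.
struct LoopSize {
  std::int64_t overall;
  std::int64_t eliminated_by_peeling;
  std::int64_t last_iteration;
  std::int64_t last_iteration_eliminated_by_peeling;
};

// Same arithmetic as the quoted function. `scale_by_two_thirds` is an
// illustrative switch (not a GCC parameter) that turns the 1/3 reduction on or off.
constexpr std::int64_t estimated_unrolled_size(const LoopSize& size,
                                               std::uint64_t nunroll,
                                               bool scale_by_two_thirds) {
  std::int64_t unr_insns = static_cast<std::int64_t>(nunroll) *
                           (size.overall - size.eliminated_by_peeling);
  if (nunroll == 0)
    unr_insns = 0;
  unr_insns += size.last_iteration - size.last_iteration_eliminated_by_peeling;
  if (scale_by_two_thirds)
    unr_insns = unr_insns * 2 / 3;  // the reduction under discussion
  if (unr_insns <= 0)
    unr_insns = 1;
  return unr_insns;
}

// Made-up example: 10 insns per iteration, 4 of them removed by peeling,
// and a last iteration of 10 insns with 6 removed; the loop is unrolled 8 times.
constexpr LoopSize example_loop{10, 4, 10, 6};
static_assert(estimated_unrolled_size(example_loop, 8, true) == 34,
              "8 * 6 + 4 = 52, then 52 * 2 / 3 = 34");
static_assert(estimated_unrolled_size(example_loop, 8, false) == 52,
              "without the reduction the estimate stays at 52");
```

With these numbers the reduction lowers the estimate from 52 to 34 even though the instructions known to be eliminated by peeling were already subtracted, which is the double counting Comment #11 points at.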
[Bug middle-end/113988] during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988 --- Comment #28 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:cc383e9702897dd783657ea3dce4aecf48318441 commit r14-9203-gcc383e9702897dd783657ea3dce4aecf48318441 Author: Jakub Jelinek Date: Wed Feb 28 09:40:15 2024 +0100 gimple-fold: Use bitwise vector types rather than barely supported huge integral types in memcpy etc. folding [PR113988] The following patch changes the memcpy etc. folding to use bitwise vector types rather than huge INTEGER_TYPEs for copying of > MAX_FIXED_MODE_SIZE lengths. The problem with the huge INTEGER_TYPEs is that they aren't supported very much, usually there are just optabs to handle moves of them, perhaps misaligned moves and that is it, so they pose problems e.g. to BITINT_TYPE lowering. 2024-02-28 Jakub Jelinek PR tree-optimization/113988 * stor-layout.h (bitwise_mode_for_size): Declare. * stor-layout.cc (bitwise_mode_for_size): New function. * gimple-fold.cc (gimple_fold_builtin_memory_op): Use it. Use bitwise_type_for_mode instead of build_nonstandard_integer_type. Use BITS_PER_UNIT instead of 8. * gcc.dg/bitint-91.c: New test.
[Bug tree-optimization/114041] wrong code with _BitInt() and -O -fgraphite-identity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 --- Comment #11 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:d6479050ecef10fd5e67b4da989229e4cfac53ee commit r14-9204-gd6479050ecef10fd5e67b4da989229e4cfac53ee Author: Jakub Jelinek Date: Wed Feb 28 09:59:45 2024 +0100 graphite: Fix non-INTEGER_TYPE integral comparison handling [PR114041] The following testcases are miscompiled, because graphite ignores boolean, enumerated or _BitInt comparisons, rewrites the code as if the comparisons were always true or always false. The INTEGER_TYPE checks were initially added in r6-2239 but at that point it was both in add_conditions_to_domain and in parameter_index_in_region. Later on the check was also added to stmt_simple_for_scop_p, and finally r8-3931 changed the stmt_simple_for_scop_p check to INTEGRAL_TYPE_P and turned the parameter_index_in_region -> assign_parameter_index_in_region into INTEGRAL_TYPE_P assertion, but the add_conditions_to_domain check for INTEGER_TYPE remained. The following patch uses INTEGRAL_TYPE_P to complete the change. 2024-02-28 Jakub Jelinek PR tree-optimization/114041 * graphite-sese-to-poly.cc (add_conditions_to_domain): Check for INTEGRAL_TYPE_P check rather than INTEGER_TYPE. * gcc.dg/graphite/run-id-pr114041-1.c: New test. * gcc.dg/graphite/run-id-pr114041-2.c: New test.
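The difference between the two checks named in the commit message can be pictured with a toy model. The enum and helper names below are invented for illustration and are not GCC's tree API; the only fact taken from GCC is that INTEGRAL_TYPE_P also accepts boolean, enumerated and _BitInt types, which is what the commit message relies on.

```cpp
#include <cassert>

// Toy stand-ins for a few GCC type codes (hypothetical, not the real `tree` API).
enum class TypeCode { Integer, Boolean, Enumeral, BitInt, Real };

// Models the old `TREE_CODE (type) == INTEGER_TYPE` test.
constexpr bool is_integer_type(TypeCode code) {
  return code == TypeCode::Integer;
}

// Models INTEGRAL_TYPE_P, which covers the other integral kinds as well.
constexpr bool is_integral_type(TypeCode code) {
  return code == TypeCode::Integer || code == TypeCode::Boolean ||
         code == TypeCode::Enumeral || code == TypeCode::BitInt;
}

// Hypothetical helper: would a comparison on this operand type be turned
// into a constraint on the iteration domain, or silently dropped?
constexpr bool condition_is_recorded(TypeCode code, bool use_integral_check) {
  return use_integral_check ? is_integral_type(code) : is_integer_type(code);
}

static_assert(!condition_is_recorded(TypeCode::BitInt, false),
              "old check: a _BitInt comparison is ignored");
static_assert(condition_is_recorded(TypeCode::BitInt, true),
              "new check: a _BitInt comparison is kept");
```

In this model a dropped condition corresponds to the miscompilation described above: the code is rewritten as if the comparison did not constrain anything.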
[Bug middle-end/113988] during GIMPLE pass: bitintlower: internal compiler error: in lower_stmt, at gimple-lower-bitint.cc:5470
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|NEW |RESOLVED --- Comment #29 from Jakub Jelinek --- Fixed.
[Bug tree-optimization/114041] wrong code with _BitInt() and -O -fgraphite-identity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 --- Comment #12 from Jakub Jelinek --- I can change the comparison into assert, or defer that for stage1?
[Bug tree-optimization/59859] [meta-bug] GRAPHITE issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59859 Bug 59859 depends on bug 114041, which changed state. Bug 114041 Summary: wrong code with _BitInt() and -O -fgraphite-identity https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/114041] wrong code with _BitInt() and -O -fgraphite-identity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #13 from Jakub Jelinek --- Anyway, miscompilation now fixed.
[Bug target/98877] [AArch64] Inefficient code generated for tbl NEON intrinsics
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877 --- Comment #9 from Tamar Christina --- While RA should be able to deal with this, shouldn't we also just lower TBLs in gimple? There is no reason why this can't be a VEC_PERM_EXPR, which would also get the copies removed at the gimple level and allow us to optimize this to something else depending on the index.
[Bug tree-optimization/114041] wrong code with _BitInt() and -O -fgraphite-identity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 --- Comment #14 from rguenther at suse dot de --- On Wed, 28 Feb 2024, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114041 > > --- Comment #12 from Jakub Jelinek --- > I can change the comparison into assert, or defer that for stage1? Defer I think, if you want to bother ...
[Bug tree-optimization/114145] New: Missed optimization of loop deletion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114145 Bug ID: 114145 Summary: Missed optimization of loop deletion Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: 652023330028 at smail dot nju.edu.cn Target Milestone: --- Hello, we noticed that in the code below, looping is not necessary, but gcc seems to have missed this optimization. https://godbolt.org/z/sYqzh8M3c int a, b; void func(int c){ for(int i=0;i<700;i++){ b=c; c=a; } } GCC -O3: func(int): mov edx, DWORD PTR a[rip] mov eax, 700 jmp .L2 .L3: sub eax, 3 mov edi, edx .L2: cmp eax, 1 jne .L3 mov DWORD PTR b[rip], edi ret Expected code (Clang): func(int): # @func(int) mov eax, dword ptr [rip + a] mov dword ptr [rip + b], eax ret Thank you very much for your time and effort! We look forward to hearing from you.
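For readers who want to check the claim, the sketch below places the reported loop next to the single store it reduces to. func is the reporter's function laid out on separate lines; func_reduced is a name invented here. The equivalence relies on the trip count being at least 2 and on nothing else writing a during the loop, both of which hold in this testcase.

```cpp
#include <cassert>

int a, b;

// The loop from the report.
void func(int c) {
  for (int i = 0; i < 700; i++) {
    b = c;
    c = a;
  }
}

// Hypothetical reduced form: from the second iteration on, c already holds a,
// so the last store to b writes a and the incoming c no longer matters.
void func_reduced(int /*c*/) {
  b = a;
}
```

Calling both with the same a and different c arguments leaves b equal to a in either case, which matches the Clang output quoted in the report.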
[Bug fortran/114146] New: REPEATABLE argument of RANDOM_INIT and repeated execution of the program
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114146 Bug ID: 114146 Summary: REPEATABLE argument of RANDOM_INIT and repeated execution of the program Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: wxcvbn789456123-nw6wda at yahoo dot com Target Milestone: --- Reading the documentation for the random_init subroutine at this address: https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gfortran/RANDOM_005fINIT.html This documentation states the following when this subroutine is called with an argument REPEATABLE set to .TRUE. : "If it [REPEATABLE] is .true., the seed is set to a processor-dependent value that is the same each time RANDOM_INIT is called from the same image. The term “same image” means a single instance of program execution. The sequence of random numbers is different for repeated execution of the program." (The same text appears in version 11.4.0 of the documentation). However, in the example below, repeated executions of the program "a.exe" generate the same sequence of random numbers. bash 1 : uname -smo CYGWIN_NT-10.0-19045 x86_64 Cygwin bash 2 : gfortran --version | head -2 GNU Fortran (GCC) 11.4.0 Copyright (C) 2021 Free Software Foundation, Inc. bash 3 : cat a.f90 PROGRAM random_init_test REAL :: x(2) CALL random_init(REPEATABLE=.TRUE., IMAGE_DISTINCT=.TRUE.) CALL random_number(x) PRINT *, x END PROGRAM random_init_test bash 4 : gfortran a.f90 -o a.exe bash 5 : ./a.exe 0.825262189 0.191325366 bash 6 : ./a.exe 0.825262189 0.191325366 bash 7 : sleep 10 bash 8 : ./a.exe 0.825262189 0.191325366
[Bug libstdc++/114103] FAIL: 29_atomics/atomic/lock_free_aliases.cc -std=gnu++20 (test for excess errors)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114103 Jonathan Wakely changed: What|Removed |Added Target Milestone|--- |13.3 Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #42 from Lukas Grätz --- (In reply to Jakub Jelinek from comment #41) > (In reply to Lukas Grätz from comment #40) > > It seems that the reason for <optimized out> is ultimately -Og, not this > > patch. See Bug 78685. > > No. When PR78685 would be fixed by adding artificial hidden uses of > variables at the end of their scopes, this bug would trigger far more often. > The vars would be live across the calls, so if there would be callee-saved > registers available, the compiler > would use them to hold the variables across the calls. And this bug would > break that. It could be done that way. But I think a better fix for PR78685 would be to save the function parameter values to the stack (and then this problem will not trigger that often). For the following reasons: (1) Timing for push and mov instructions are similar, so the execution speed wouldn't be much affected. (2) A callee needs to somehow restore callee-saved registers, but only if it returns. So the calling conventions cannot guarantee that callee-saved registers are saved somewhere for noreturn functions. But of course, if you disregard this optimization, this would not trigger that often. (3) Potential register pressure when saving additional variables to callee-saved registers: If the execution itself no longer needs the value of a function parameter, there is no need to hold it in a (callee-saved) register across calls for a quick access. The stack is sufficient for accessing the values with the debugger. (4) The entry values of function parameters should be more helpful, not some later values. E.g., for int foo(int i) { if (i == 42) { h(); } i = 7; bar(); } we would be more interested in the original value of "i" and not the later value "i = 7" as saved by "artificial hidden uses of variables at the end of their scopes". By saving original values to the stack before they are modified, we can keep inspecting the original values. 
The helpful backtrace from within bar() could be: #1 bar() #2 foo(i@entry=42) The other version would be a bit counter-intuitive, when the argument to foo really was i=42: #1 bar() #2 foo(i=7) Btw., function parameters are not normally part of the backtrace (this is just a nice gdb feature), see Wikipedia: https://en.wikipedia.org/wiki/Stack_trace > Anyway, I've posted > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html > patch which will not revert the #c15/#c24 changes, but guard them with a > non-default option. People who don't care about the harder debugging can > use that option in their code, but widely used shared libraries with > noreturn entrypoints will no longer screw up the debugging for all the > packages that use them. Yes, it took me long, but I agree, it would be better to not worsen debugging experience.
[Bug target/114134] [14 Regression] Extra mov instructions for simple function compared with GCC13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114134 --- Comment #3 from Pilar Latiesa --- (In reply to Richard Biener from comment #2) > I guess the testcase can be simplified to just show the return value > handling issue. I think this suffices: struct TVec3D { double x, y, z; }; struct TKey { int i, j, k; }; TKey Key(TVec3D const &r) { return {int(r.x), int(r.y), int(r.z)}; }
[Bug middle-end/94083] inefficient soft-float x!=Inf code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94083 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org, ||jsm28 at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Shall __builtin_isinf (x) or __builtin_isinf_sign (x) raise exception if x is a sNaN? Or never? Or it can but doesn't have to? glibc's int64_t hx,lx; GET_LDOUBLE_WORDS64(hx,lx,x); lx |= (hx & 0x7fffLL) ^ 0x7fffLL; lx |= -lx; return ~(lx >> 63) & (hx >> 62); doesn't, but I think when we lower __builtin_isinf to fabs (x) (which should just clear the sign bit, not raise exception) u<= , it would. If we wouldn't need to raise exception, I think fastest would be to pattern recognize the fabs (x) <= and emit there the (lx | ((hx & 0x7fffLL) ^ 0x7fffLL)) != 0. But __builtin_islessequal (__builtin_fabsf128 (x), __builtin_nextafterf128 (__builtin_inff128 (), 0.0f128)) I think should raise exception and those 2 will be indistinguishable, so maybe just recognize that case during expansion if 2 libcalls would be needed and emit the equality comparison instead. Or do both depending on if -fsignaling-nans is specified or not?
[Bug c++/114013] [14 Regression] Specializations of var templates no longer emitted since r14-8987
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114013 Jakub Jelinek changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #5 from Jakub Jelinek --- Fixed.
[Bug tree-optimization/114075] [14 Regression] s390x miscompilation since r14-322
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114075 --- Comment #6 from GCC Commits --- The master branch has been updated by Juergen Christ : https://gcc.gnu.org/g:82ebfd35da49e5df87da132a7b8c41baeebc57b4 commit r14-9205-g82ebfd35da49e5df87da132a7b8c41baeebc57b4 Author: Juergen Christ Date: Mon Feb 19 10:10:35 2024 +0100 Only emulate integral vectors. The emulation via word mode tries to perform integer arithmetic on floating point values instead of floating point arithmetic. This leads to mis-compilations. Failure occurred on s390x on these existing test cases: gcc.dg/vect/tsvc/vect-tsvc-s112.c gcc.dg/vect/tsvc/vect-tsvc-s113.c gcc.dg/vect/tsvc/vect-tsvc-s119.c gcc.dg/vect/tsvc/vect-tsvc-s121.c gcc.dg/vect/tsvc/vect-tsvc-s131.c gcc.dg/vect/tsvc/vect-tsvc-s132.c gcc.dg/vect/tsvc/vect-tsvc-s2233.c gcc.dg/vect/tsvc/vect-tsvc-s421.c gcc.dg/vect/vect-alias-check-14.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c gcc/ChangeLog: PR tree-optimization/114075 * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating point vectors Signed-off-by: Juergen Christ
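A scalar illustration of why integer arithmetic on floating-point bit patterns gives wrong answers may help here. This is not the vectorizer's emulation code; it only shows the underlying mismatch. It assumes a 32-bit IEEE-754 float, and both function names are invented for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Adds two floats the wrong way: reinterpret both as 32-bit integers,
// add the integers, and reinterpret the sum as a float again.
inline float add_as_integer_bits(float x, float y) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t x_bits, y_bits;
  std::memcpy(&x_bits, &x, sizeof x_bits);
  std::memcpy(&y_bits, &y, sizeof y_bits);
  const std::uint32_t sum_bits = x_bits + y_bits;  // integer add, not a float add
  float result;
  std::memcpy(&result, &sum_bits, sizeof result);
  return result;
}

// The operation that was actually requested.
inline float add_as_float(float x, float y) {
  return x + y;
}
```

For 1.0f + 1.0f the real addition gives 2.0f, while adding the bit patterns 0x3f800000 + 0x3f800000 yields 0x7f000000, which decodes to 2^127.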
[Bug middle-end/94083] inefficient soft-float x!=Inf code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94083 Harald van Dijk changed: What|Removed |Added CC||harald at gigawatt dot nl --- Comment #4 from Harald van Dijk --- (In reply to Jakub Jelinek from comment #3) > Shall __builtin_isinf (x) or __builtin_isinf_sign (x) raise exception if x > is a sNaN? > Or never? Or it can but doesn't have to? Never. See also bug #66462 which also has a not-quite-right patch that was committed and reverted, and the fixed patch posted but never committed and then forgotten. I'm not 100% sure of the impact of that patch on soft-float but at a quick glance it seems to use bitwise integer arithmetic which should avoid libcalls entirely.
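As a rough picture of what an infinity test built from bitwise integer arithmetic looks like, the sketch below handles a 64-bit IEEE-754 double rather than the 128-bit type discussed in this bug, and it is neither the glibc code quoted in comment #3 nor the PR66462 patch. isinf_bitwise is a name invented here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Infinity test using only integer operations on the representation,
// so no floating-point comparison is performed on a possibly signaling NaN.
inline bool isinf_bitwise(double x) {
  static_assert(sizeof(double) == sizeof(std::uint64_t) &&
                    std::numeric_limits<double>::is_iec559,
                "sketch assumes IEEE-754 binary64");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  const std::uint64_t magnitude = bits & 0x7fffffffffffffffULL;  // clear the sign bit
  // Infinity: exponent field all ones, fraction field all zeros.
  return magnitude == 0x7ff0000000000000ULL;
}
```

Because the value is only inspected as an integer, a soft-float target would need no comparison libcall for this form, which is the property comment #4 is after.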
[Bug tree-optimization/114145] Missed optimization of loop deletion
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114145 Richard Biener changed: What|Removed |Added Last reconfirmed||2024-02-28 Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- Store motion turns this into [local count: 10737416]: b_lsm.3_3 = _15(D); c_7 = a; [local count: 1063004408]: # c_12 = PHI # i_13 = PHI b_lsm.3_2 = c_12; i_8 = i_13 + 1; if (i_8 != 700) goto ; [98.99%] else goto ; [1.01%] [local count: 1052266995]: goto ; [100.00%] [local count: 10737416]: # b_lsm.3_11 = PHI b = b_lsm.3_11; return; where ultimatively final value replacement fails because SCEV fails here: (analyze_scalar_evolution (loop_nb = 1) (scalar = c_7) (get_scalar_evolution (scalar = c_7) (scalar_evolution = )) ) (instantiate_scev (instantiate_below = 2 -> 3) (evolution_loop = 1) (chrec = c_7) (res = c_7)) (evolution_function = scev_not_known)) indeed there's no way to express the evolution of this induction variable which has just two values. Might be a simple thing to special case in final value replacement though. There we see just [local count: 1063004408]: # c_12 = PHI ... [local count: 10737416]: # c_9 = PHI so the final value is niter == 0 ? c_4(D) : c_7.
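The closed form at the end of this comment can be checked with a small model. Both function names are invented for the sketch, and c_initial and a_value stand for c_4(D) and c_7 from the dump.

```cpp
#include <cassert>

// Runs `niter` iterations of the reported loop body on local copies and
// returns the value c holds afterwards.
constexpr int c_after_loop(int c_initial, int a_value, unsigned niter) {
  int c = c_initial;
  for (unsigned i = 0; i < niter; ++i)
    c = a_value;  // the `c = a` statement of the loop body
  return c;
}

// Closed form from the comment: niter == 0 ? c_4(D) : c_7.
constexpr int c_final_value(int c_initial, int a_value, unsigned niter) {
  return niter == 0 ? c_initial : a_value;
}

static_assert(c_after_loop(9, 5, 0) == c_final_value(9, 5, 0), "zero trips");
static_assert(c_after_loop(9, 5, 1) == c_final_value(9, 5, 1), "one trip");
static_assert(c_after_loop(9, 5, 700) == c_final_value(9, 5, 700), "700 trips");
```

The variable takes only two values over the whole loop, so a conditional on the iteration count describes it even though it has no affine evolution for SCEV to express.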
[Bug middle-end/94083] inefficient soft-float x!=Inf code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94083 --- Comment #5 from Jakub Jelinek --- Ah, ok. So then expansion should just concentrate on the fabs (x) <= nextafter (inf, 0) case for soft-float case and defer the rest to PR66462 which would handle that much earlier.
[Bug libstdc++/114147] New: tuple allocator-extended constructor requires non-explicit default constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114147 Bug ID: 114147 Summary: tuple allocator-extended constructor requires non-explicit default constructor Product: gcc Version: 10.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: victor.dyachenko at protonmail dot com Target Milestone: --- GCC 10.1 fails to compile this code (GCC 9.1 is OK): #include #include struct C { explicit C() = default; }; int main() { std::tuple t(std::allocator_arg, std::allocator{}); }
[Bug libstdc++/101203] Remove unnecessary empty check in std::function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101203 Toni Neubert changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WONTFIX --- Comment #5 from Toni Neubert --- Improvement not possible for all environments.
[Bug libstdc++/114147] tuple allocator-extended constructor requires non-explicit default constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114147 --- Comment #1 from __vic --- Why is _ImplicitDefaultCtor required here? template::value, _T1, _T2> = true> _GLIBCXX20_CONSTEXPR tuple(allocator_arg_t __tag, const _Alloc& __a) : _Inherited(__tag, __a) { } Missing overload for explicit tuple(allocator_arg_t, ...)?
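The distinction that the name _ImplicitDefaultCtor points at can be sketched with a small trait. The code below is an illustration written for this digest, not the libstdc++ implementation; accepts, is_implicitly_default_constructible, ExplicitDefault and ImplicitDefault are all invented names, and the sketch assumes C++17.

```cpp
#include <cassert>
#include <type_traits>

struct ExplicitDefault { explicit ExplicitDefault() = default; };  // like C in the report
struct ImplicitDefault { ImplicitDefault() = default; };

// Declared only: used to ask whether `{}` can copy-list-initialize a T.
template <class T>
void accepts(const T&);

// Primary template: T cannot be default-constructed implicitly.
template <class T, class = void>
struct is_implicitly_default_constructible : std::false_type {};

// Chosen when `accepts<T>({})` is well-formed, that is, when `T t = {};` would compile.
template <class T>
struct is_implicitly_default_constructible<
    T, std::void_t<decltype(accepts<T>({}))>> : std::true_type {};

static_assert(std::is_default_constructible<ExplicitDefault>::value,
              "ExplicitDefault{} is fine");
static_assert(!is_implicitly_default_constructible<ExplicitDefault>::value,
              "but `ExplicitDefault x = {};` is not");
static_assert(is_implicitly_default_constructible<ImplicitDefault>::value,
              "a non-explicit default constructor allows both forms");
```

A constructor constrained only on the implicit case drops out of overload resolution for a type like ExplicitDefault, so unless a matching explicit overload exists the call is rejected.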
[Bug tree-optimization/114075] [14 Regression] s390x miscompilation since r14-322
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114075 --- Comment #7 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:db465230cccf0844e803dd6701756054fe97244a commit r14-9206-gdb465230cccf0844e803dd6701756054fe97244a Author: Jakub Jelinek Date: Wed Feb 28 11:49:29 2024 +0100 testsuite: Add testcase for recently fixed PR [PR114075] This adds testcase from PR114075 which has been fixed by the r14-9205 change on s390x-linux with -march=z13. 2024-02-28 Jakub Jelinek PR tree-optimization/114075 * gcc.dg/gomp/pr114075.c: New test.
[Bug tree-optimization/91567] [10 Regression] Spurious -Wformat-overflow warnings building glibc (32-bit only)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91567 --- Comment #4 from GCC Commits --- The master branch has been updated by Rainer Orth : https://gcc.gnu.org/g:6864a2aa78a893afea26eb8fc1aa4b7ade3e940f commit r14-9207-g6864a2aa78a893afea26eb8fc1aa4b7ade3e940f Author: Rainer Orth Date: Wed Feb 28 11:55:47 2024 +0100 testsuite: Fix gcc.dg/tree-ssa/builtin-snprintf-6.c XPASS on i?86 -m64 [PR91567] gcc.dg/tree-ssa/builtin-snprintf-6.c currently XPASSes on i?86-*-* configurations with -m64: XPASS: gcc.dg/tree-ssa/builtin-snprintf-6.c scan-tree-dump-times optimized "Function test_assign_aggregate" 1 (seen e.g. on i386-pc-solaris2.11, i686-pc-linux-gnu, or i386-apple-darwin*). The problem is that the xfail only handles x86_64, ignoring that i?86 configurations can also be multilibbed. This patch fixes that by handling both forms alike. Tested on i386-pc-solaris2.11, amd64-pc-solaris2.11, sparc-sun-solaris2.11, and sparcv9-sun-solaris2.11. 2024-02-28 Rainer Orth gcc/testsuite: PR tree-optimization/91567 * gcc.dg/tree-ssa/builtin-snprintf-6.c (scan-tree-dump-times): Treat i?86-*-* like x86_64-*-*.
[Bug tree-optimization/114075] [14 Regression] s390x miscompilation since r14-322
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114075 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jchrist at linux dot ibm.com Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #8 from Jakub Jelinek --- Fixed, thanks.
[Bug c++/106851] [modules] Name conflict for exported using-declaration
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106851 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-28 Ever confirmed|0 |1 Keywords|rejects-valid |diagnostic --- Comment #3 from Jonathan Wakely --- (In reply to Nathaniel Shead from comment #2) > This behaviour should be as expected right? Quite possibly, I still don't know how to use modules. I was just trying to figure out how to define a 'std' module that includes all the library headers and then exports everything. Let's reclassify this as a diagnostic bug then.
[Bug testsuite/111462] [14 regression] gcc.dg/tree-ssa/ssa-sink-18.c fails after r14-4089-gd45ddc2c04e471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111462 --- Comment #12 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:92f07eb406612fa341dc33d9d6e4f3781dc09452 commit r14-9208-g92f07eb406612fa341dc33d9d6e4f3781dc09452 Author: Jakub Jelinek Date: Wed Feb 28 12:09:04 2024 +0100 testsuite: XFAIL ssa-sink-18.c also on powerpc64 [PR111462] powerpc64-linux apparently (not very surprisingly) behaves the same way as powerpc64le-linux and has 4 sunk statements rather than 5, so we should xfail it on powerpc64*-*-* rather than just powerpc64le-*-*. powerpc-linux has 3 sunk statements, but the scan pattern is done for lp64 only as the comment explains. 2024-02-28 Jakub Jelinek PR testsuite/111462 * gcc.dg/tree-ssa/ssa-sink-18.c: XFAIL also on powerpc64.
[Bug testsuite/111462] [14 regression] gcc.dg/tree-ssa/ssa-sink-18.c fails after r14-4089-gd45ddc2c04e471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111462 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Status|REOPENED|RESOLVED Resolution|--- |FIXED --- Comment #13 from Jakub Jelinek --- Fixed.
[Bug libstdc++/114147] tuple allocator-extended constructor requires non-explicit default constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114147 --- Comment #2 from __vic --- Shouldn't this be added? template::value, _T1, _T2> = true> explicit _GLIBCXX20_CONSTEXPR tuple(allocator_arg_t __tag, const _Alloc& __a) : _Inherited(__tag, __a) { }
[Bug libstdc++/114147] [10/11/12/13 Regression] tuple allocator-extended constructor requires non-explicit default constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114147 Jonathan Wakely changed: What|Removed |Added Last reconfirmed||2024-02-28 Known to work||14.0 Keywords||rejects-valid Target Milestone|--- |11.5 Summary|tuple allocator-extended|[10/11/12/13 Regression] |constructor requires|tuple allocator-extended |non-explicit default|constructor requires |constructor |non-explicit default ||constructor Status|UNCONFIRMED |NEW Ever confirmed|0 |1 --- Comment #3 from Jonathan Wakely --- N.B. GCC 10 is no longer supported. This was fixed on trunk by r14-7225
[Bug target/113960] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 --- Comment #13 from mfarca --- Would you please backport this to 12 when the patch lands?
[Bug target/113960] [11/12/13/14 Regression] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 Jonathan Wakely changed: What|Removed |Added Summary|std::map with std::vector |[11/12/13/14 Regression] |as input overwrites itself |std::map with std::vector |with c++20, on s390x|as input overwrites itself |platform|with c++20, on s390x ||platform Target Milestone|--- |11.5 --- Comment #14 from Jonathan Wakely --- Yes, definitely.
[Bug target/113960] [11/12/13/14 Regression] std::map with std::vector as input overwrites itself with c++20, on s390x platform
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113960 Jonathan Wakely changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
[Bug libquadmath/114140] different results for std::fmin/std::fmax and quadmath fminq/fmaxq if one argument=signaling_NaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114140 Xi Ruoyao changed: What|Removed |Added Keywords||documentation CC||xry111 at gcc dot gnu.org --- Comment #16 from Xi Ruoyao --- (In reply to Richard Biener from comment #15) > It's the old argument on whether isnan(NaN) should return true or false with > -ffinite-math-only. With what we currently do "constant folding" sNaN into > NaN would be correct with -fno-signalling-nans, likewise constant folding > Inf into 42.0 is "correct" for -ffinite-math-only. > > You are basically invoking undefined beavior when introducing sNaN into a > program without using -fsignalling-nans. Then we should make it more clear in invoke.texi. Currently the doc is implying the worst consequence using sNaN with -fno-signalling-nans is "changing the number of raised exceptions."
[Bug target/114134] [14 Regression] Extra mov instructions for simple function compared with GCC13 since r14-2386
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114134 Jakub Jelinek changed: What|Removed |Added Summary|[14 Regression] Extra mov |[14 Regression] Extra mov |instructions for simple |instructions for simple |function compared with |function compared with |GCC13 |GCC13 since r14-2386 CC||jakub at gcc dot gnu.org --- Comment #4 from Jakub Jelinek --- Started with r14-2386-gbdf2737cda53a83332db1a1a021653447b05a7e7
[Bug middle-end/94083] inefficient soft-float x!=Inf code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94083 --- Comment #6 from Joseph S. Myers --- Contrary to what was claimed in bug 66462, I don't think there ever was a fixed patch. Note that in bug 66462 comment 19, "June" is June 2017 but "November" is November 2016 - the "November" one is the *older* one.
[Bug target/114143] Non-thumb arm32 code in thumb multilib for libgcc and in -mthumb build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114143 Richard Earnshaw changed: What|Removed |Added Last reconfirmed||2024-02-28 Status|UNCONFIRMED |WAITING Ever confirmed|0 |1 --- Comment #2 from Richard Earnshaw --- You probably haven't built the correct multilibs. See Christophe's comments
[Bug c++/92687] decltype of a structured binding to a tuple component is a reference type inside a template function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92687 Christopher Nerz changed: What|Removed |Added CC||Christopher.Nerz at de dot bosch.c ||om --- Comment #2 from Christopher Nerz --- Same error happens for all other gcc versions I checked, ranging from 8.3 to 13.2. Note that the problem does not arise if you replace the tuple with a struct { int a; int b;}. Seems strongly related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102116
[Bug modula2/102344] gm2/pim/fail/TestLong4.mod FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102344 Rainer Orth changed: What|Removed |Added Status|RESOLVED|REOPENED Ever confirmed|0 |1 Resolution|FIXED |--- Last reconfirmed||2024-02-28 --- Comment #6 from Rainer Orth --- Unfortunately, the test still FAILs for 32-bit configurations like i386-pc-solaris2.11 or sparc-sun-solaris2.11 with -m32 or x86_64-pc-linux-gnu with -m32: FAIL: gm2/pim/pass/TestLong4.mod, -O FAIL: gm2/pim/pass/TestLong4.mod, -O -g FAIL: gm2/pim/pass/TestLong4.mod, -O3 -fomit-frame-pointer FAIL: gm2/pim/pass/TestLong4.mod, -O3 -fomit-frame-pointer -finline-functions FAIL: gm2/pim/pass/TestLong4.mod, -Os FAIL: gm2/pim/pass/TestLong4.mod, -g /vol/gcc/src/hg/master/local/gcc/testsuite/gm2/pim/pass/TestLong4.mod:26:6: warning: attempting to assign a value '9223372036854775808' to a designator 'l' which will exceed the range of type 'LONGCARD'
[Bug target/114134] [14 Regression] Extra mov instructions for simple function compared with GCC13 since r14-2386
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114134 --- Comment #5 from Pilar Latiesa --- Another testcase: struct TKey { int i, j, k, w; }; TKey Key(int x) { return {x, 0, x, 0}; }
[Bug c++/92687] decltype of a structured binding to a tuple component is a reference type inside a template function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92687 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Testcase without using library: namespace std { template struct tuple_size; template struct tuple_element; } struct A { int i; template int& get() { return i; } }; template<> struct std::tuple_size { static const int value = 2; }; template struct std::tuple_element { using type = int; }; template struct is_reference { static const bool value = false; }; template struct is_reference { static const bool value = true; }; template struct is_reference { static const bool value = true; }; template void foo () { auto [x, y] = A {}; static_assert (!is_reference::value, ""); } void bar () { auto [x, y] = A {}; static_assert (!is_reference::value, ""); } template void baz () { auto [x, y] = T {}; static_assert (!is_reference::value, ""); } void qux () { foo<0> (); baz (); } which shows it is only for the non-dependent structured binding in a template case.
[Bug libquadmath/114140] different results for std::fmin/std::fmax and quadmath fminq/fmaxq if one argument=signaling_NaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114140 --- Comment #17 from Richard Biener --- (In reply to Xi Ruoyao from comment #16) > (In reply to Richard Biener from comment #15) > > It's the old argument on whether isnan(NaN) should return true or false with > > -ffinite-math-only. With what we currently do "constant folding" sNaN into > > NaN would be correct with -fno-signalling-nans, likewise constant folding > > Inf into 42.0 is "correct" for -ffinite-math-only. > > > > You are basically invoking undefined beavior when introducing sNaN into a > > program without using -fsignalling-nans. > > Then we should make it more clear in invoke.texi. Currently the doc is > implying the worst consequence using sNaN with -fno-signalling-nans is > "changing the number of raised exceptions." Yeah, the options that are not enabled by default (-fno-signalling-nans, -fno-trapping-math) could see improvement here. Usually -fno-X enables optimizations that would be invalid when X happens/is present. But that is nothing else than giving a free ticket to undefined behavior, maybe "constrained undefined behavior", but I'm not 100% sure we'd live up to that. Aka isnan (NaN) is optimized to false with -ffinite-math-only, something that's not valid when non-finite numbers are present. Whether doing so for literal NaN is "nice" remains questionable, but it's at least consistent with the behavior of x = NaN; isnan (x); So similarly -fno-signaling-nans enables optimizations that are not valid when sNaNs are present. Indeed "optimizations that may change the number of exceptions visible with signaling NaNs" is over-promising even if effectively this is all that happens.
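As an aside to the discussion above, a program that needs to recognise NaNs or signaling NaNs can inspect the bit pattern instead of calling isnan. The following is only a sketch under stated assumptions: it assumes IEEE 754 binary64 doubles and the common convention that the most significant mantissa bit marks a quiet NaN (some legacy targets use the opposite convention). The helper names to_bits, is_nan_bits and is_signaling_nan_bits are hypothetical, and nothing here is a guarantee about what GCC does or does not fold under -ffinite-math-only.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helpers for illustration; assumes IEEE 754 binary64 doubles.
constexpr std::uint64_t kExponentMask = 0x7ff0000000000000ULL;
constexpr std::uint64_t kMantissaMask = 0x000fffffffffffffULL;
// Assumption: the top mantissa bit set means "quiet" (true on most targets).
constexpr std::uint64_t kQuietBit = 0x0008000000000000ULL;

// Copy the object representation of a double into an integer.
inline std::uint64_t to_bits(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "binary64 assumed");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// NaN: exponent field all ones and a non-zero mantissa.
constexpr bool is_nan_bits(std::uint64_t bits) {
    return (bits & kExponentMask) == kExponentMask
        && (bits & kMantissaMask) != 0;
}

// Signaling NaN: a NaN whose quiet bit is clear (under the assumption above).
constexpr bool is_signaling_nan_bits(std::uint64_t bits) {
    return is_nan_bits(bits) && (bits & kQuietBit) == 0;
}
```

The signaling check takes raw bits rather than a double because merely passing an sNaN through a floating-point register can quiet it on some targets.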
[Bug target/114148] New: gcc.target/i386/pr106010-7b.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114148 Bug ID: 114148 Summary: gcc.target/i386/pr106010-7b.c FAILs Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ro at gcc dot gnu.org CC: liuhongt at gcc dot gnu.org Target Milestone: --- Target: i?86-pc-solaris2.11, amd64-pc-solaris2.11 The gcc.target/i386/pr106010-7b.c test FAILs on Solaris/x86 (32 and 64-bit) since 20230426: FAIL: gcc.target/i386/pr106010-7b.c execution test The test aborts, which only happens at -g3 -O0: Thread 2 received signal SIGABRT, Aborted. [Switching to Thread 1 (LWP 1)] 0xfe02ede5 in __lwp_sigqueue () from /lib/libc.so.1 (gdb) bt #0 0xfe02ede5 in __lwp_sigqueue () from /lib/libc.so.1 #1 0xfe026aef in thr_kill () from /lib/libc.so.1 #2 0xfdf5e142 in raise () from /lib/libc.so.1 #3 0xfdf2b474 in abort () from /lib/libc.so.1 #4 0x080526b8 in avx_test () at /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr106010-7b.c:52 #5 0x08052107 in do_test () at /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/avx-check.h:12 #6 0x0805215d in main () at /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/avx-check.h:40 I've found that removing all but the ps_* and epi8_* tests still lets the test abort; once you also remove the epi8_* tests, the abort is gone.
[Bug target/114148] gcc.target/i386/pr106010-7b.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114148 --- Comment #1 from Rainer Orth --- Created attachment 57557 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57557&action=edit 32- bit i386-pc-solaris2.11 assembler output
[Bug target/114148] gcc.target/i386/pr106010-7b.c FAILs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114148 --- Comment #2 from Rainer Orth --- Created attachment 57558 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57558&action=edit 32-bit i686-pc-linux-gnu assembler output I'm attaching the assembler output for the reduced (all but ps_* and epi8_* removed) tests for both Solaris and Linux.
[Bug libstdc++/114149] New: lexicographical_compare should use memcmp for C++20 contiguous iterators as well as pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114149 Bug ID: 114149 Summary: lexicographical_compare should use memcmp for C++20 contiguous iterators as well as pointers Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: redi at gcc dot gnu.org Target Milestone: --- __lexicographical_compare_aux1 has: const bool __simple = (__is_memcmp_ordered_with<_ValueType1, _ValueType2>::__value && __is_pointer<_II1>::__value && __is_pointer<_II2>::__value #if __cplusplus > 201703L && __glibcxx_concepts // For C++20 iterator_traits::value_type is non-volatile // so __is_byte could be true, but we can't use memcmp with // volatile data. && !is_volatile_v>> && !is_volatile_v>> #endif We should use memcmp for contiguous iterators, not only pointers. There's a similar condition in ranges::lexicographical_compare.
[Bug tree-optimization/114108] [14 regression] ICE when building opencv-4.8.1 (error: type mismatch in binary expression) since r14-1833
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114108 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org Keywords|needs-bisection | Summary|[14 regression] ICE when|[14 regression] ICE when |building opencv-4.8.1 |building opencv-4.8.1 |(error: type mismatch in|(error: type mismatch in |binary expression) |binary expression) since ||r14-1833 --- Comment #5 from Jakub Jelinek --- Close. Started with r14-1833-gea616f687dccbe42012f786c0ebade5b05850206
[Bug c++/114129] Inaccurate error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114129 --- Comment #2 from Theodore.Papadopoulo at inria dot fr --- OK thank you... I did not realize that. C/C++ sometimes has a weird syntax. Sorry for the noise
[Bug c++/114129] Inaccurate error message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114129 Theodore.Papadopoulo at inria dot fr changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #3 from Theodore.Papadopoulo at inria dot fr --- Wrong report Sorry.
[Bug target/114150] New: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114150 Bug ID: 114150 Summary: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c etc. FAIL Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ro at gcc dot gnu.org CC: ubizjak at gmail dot com Target Milestone: --- Target: i?86-pc-solaris2.11, amd64-pc-solaris2.11 Two tests FAIL on 32 and 64-bit Solaris/x86 with the native assembler in use: FAIL: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c (test for excess errors) UNRESOLVED: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c compilation failed to produce executable FAIL: gcc.target/i386/avx512cd-vpbroadcastmw2d-2.c (test for excess errors) UNRESOLVED: gcc.target/i386/avx512cd-vpbroadcastmw2d-2.c compilation failed to produce executable Excess errors: Assembler: avx512cd-vpbroadcastmb2q-2.c "/var/tmp//ccs_9lod.s", line 42 : Invalid instruction argument Near line: "vpbroadcastmb2q %k0, %zmm0" Assembler: avx512cd-vpbroadcastmw2d-2.c "/var/tmp//ccevT6Rd.s", line 35 : Invalid instruction argument Near line: "vpbroadcastmw2d %k0, %zmm0" I suspect this is just an as bug. While I thought about adding tests for the two vpbroadcastm* insns to check_effective_target_avx512cd to guard against this, it's probably best to just xfail the tests on Solaris/x86 with as, especially since the native assembler isn't seeing any more fixes these days.
[Bug rtl-optimization/38534] gcc 4.2.1 and above: No need to save called-saved registers in 'noreturn' function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38534 --- Comment #43 from Lukas Grätz --- (In reply to Lukas Grätz from comment #42) > (In reply to Jakub Jelinek from comment #41) > > > > No. When PR78685 would be fixed by adding artificial hidden uses of > > variables at the end of their scopes, this bug would trigger far more often. > > The vars would be live across the calls, so if there would be callee-saved > > registers available, the compiler > > would use them to hold the variables across the calls. And this bug would > > break that. > > It could be done that way. But I think a better fix for PR78685 would be to > save the function parameter values to the stack (and than this problem will > not trigger that often). For the following reasons: > Just to be complete with the arguments: (5) Artificial hidden uses of variables at the end of their scopes would not always help when variables are overwritten. For example: int main (int argc, char **argv) { if (argc == 42) { h(); } might_not_return(0); argc = bar(); // here would be the hidden use of argc and argv } The "artificial hidden use" approach would only save the last value of argc, here the result of bar() in line 4 and not the argument argc. The argument value of argc is not used from line 3 on. So that approach would still produce a backtrace with argc=, something like: #1 might_not_return(i=0) #2 main (argc=, argv=0x7fffe0) (6) When the goal is just to have a more helpful gdb bt output, then we don't need to save any variables other than function parameters. In the original example in Bug 78685 and Comment 28 here, this seemed to be the main goal, to get gdb bt more conclusive. If interested in other variable values, too, -O0 might be better than trying hard to patch -Og to save all variable values. (7) Bug 78685 is for x86-64 with -Og. For 32 bit x86 with -Og, we don't run into that problem: there are no function parameters, since they are already on the stack by the 32 bit calling conventions.
So saving parameters on the stack for -Og on x86-64 and similar targets without stack-parameters would just be consistent.
[Bug c++/92687] decltype of a structured binding to a tuple component is a reference type inside a template function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92687 --- Comment #4 from Jakub Jelinek --- finish_decltype_type does: /* decltype of a decomposition name drops references in the tuple case (unlike decltype of a normal variable) and keeps cv-qualifiers from the containing object in the other cases (unlike decltype of a member access expression). */ if (DECL_DECOMPOSITION_P (expr)) { if (DECL_HAS_VALUE_EXPR_P (expr)) /* Expr is an array or struct subobject proxy, handle bit-fields properly. */ return unlowered_expr_type (expr); else /* Expr is a reference variable for the tuple case. */ return lookup_decomp_type (expr); } The problem is that if processing_template_decl (though, finish_decltype_type has processing_template_decl temporarily cleared here) and expr is not dependent (otherwise finish_decltype_type would defer handling it) DECL_HAS_VALUE_EXPR_P is actually set on all the structured binding decls, not just when it is array/struct/vector/complex etc. subobject proxy.
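The rule quoted from finish_decltype_type above can be checked outside a template, where the bug reportedly does not trigger. This is a small sketch rather than a testcase from the report: the function names and the plain_pair struct are hypothetical, and it only illustrates the expected language behaviour (decltype drops the reference in the tuple case and keeps cv-qualifiers of the containing object in the struct case).

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Hypothetical aggregate used for the non-tuple (struct subobject) case.
struct plain_pair { int a; int b; };

// Tuple case: even though the bindings are reference variables internally,
// decltype names std::tuple_element<I, E>::type, which is int here.
inline bool tuple_binding_drops_reference() {
    std::tuple<int, int> t{1, 2};
    auto& [x, y] = t;
    return std::is_same_v<decltype(x), int>
        && std::is_same_v<decltype(y), int>;
}

// Struct case: decltype keeps the cv-qualifiers of the containing object,
// so binding to a const plain_pair yields const int.
inline bool struct_binding_keeps_cv() {
    const plain_pair p{1, 2};
    auto& [a, b] = p;
    return std::is_same_v<decltype(a), const int>
        && std::is_same_v<decltype(b), const int>;
}
```

Both functions are deliberately non-templates, so they exercise the path that is described as working in the comments above.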
[Bug libstdc++/114147] [11/12/13 Regression] tuple allocator-extended constructor requires non-explicit default constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114147 --- Comment #4 from __vic --- The latest gcc-14-20240225 snapshot doesn't include this fix. Is there any chance to have this fixed in 14.1 release?
[Bug tree-optimization/114151] New: [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 Bug ID: 114151 Summary: [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tnfchris at gcc dot gnu.org CC: rguenth at gcc dot gnu.org Target Milestone: --- Target: aarch64* Created attachment 57559 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57559&action=edit testcase The attached C++ testcase compiled with: -O3 -mcpu=neoverse-n2 used to compile to a nice and simple loop. But after g:a0b1798042d033fd2cc2c806afbb77875dd2909b the codegen is weird and it uses horrible addressing modes. The first odd part is that it's decided to split the loop: the "main" loop has a guard after it to branch to the exit if the iteration count is 1. If not, instead of just looping again it falls through to a copy of the main loop, but one with destroyed addressing modes. The copy of the loop seems to have unshared the address calculations. Before we had: _128 = (void *) ivtmp.11_20; _54 = MEM <__SVFloat16_t> [(__fp16 *)_128]; _10 = MEM <__SVFloat16_t> [(__fp16 *)_128 + POLY_INT_CST [16B, 16B]]; _75 = MEM <__SVFloat16_t> [(__fp16 *)_128 + POLY_INT_CST [32B, 32B]]; etc, so all as an offset from _128. Now we have: col_i_61 = (int) ivtmp.11_100; _60 = (long unsigned int) col_i_61; _59 = _60 * 2; _58 = a_j_69 + _59; _54 = MEM <__SVFloat16_t> [(__fp16 *)_58]; _53 = _59 + POLY_INT_CST [16, 16]; _13 = a_j_69 + _53; _10 = MEM <__SVFloat16_t> [(__fp16 *)_13]; _74 = _59 + POLY_INT_CST [32, 32]; _19 = a_j_69 + _74; _75 = MEM <__SVFloat16_t> [(__fp16 *)_19]; and similarly for the stores as well. It also weirdly creates some very complicated addressing computations.
Before we had: _144 = p_mat_16(D) + 6; _64 = MEM <__SVFloat16_t> [(__fp16 *)_144 + ivtmp.10_100 * 2]; _143 = p_mat_16(D) + 4; _84 = MEM <__SVFloat16_t> [(__fp16 *)_143 + ivtmp.10_100 * 2]; and after: ivtmp.23_130 = (unsigned long) p_mat_16(D); _123 = 2 - ivtmp.23_130; _124 = &MEM <__SVFloat16_t> [(__fp16 *)0B + _123 + ivtmp.12_109 * 2]; _64 = MEM <__SVFloat16_t> [(__fp16 *)_124]; _122 = -ivtmp.23_130; _120 = &MEM <__SVFloat16_t> [(__fp16 *)0B + _122 + ivtmp.12_109 * 2]; _84 = MEM <__SVFloat16_t> [(__fp16 *)_120]; This results in quite the codesize increase, and a 7-10% performance loss.
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 --- Comment #15 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:c841144a94363ff26e40ab3f26b14702c32987a8 commit r14-9215-gc841144a94363ff26e40ab3f26b14702c32987a8 Author: Richard Biener Date: Wed Feb 28 12:37:07 2024 +0100 tree-optimization/114121 - wrong VN with context sensitive range info When VN ends up exploiting range-info specifying the ao_ref offset and max_size we have to make sure to reflect this in the hashtable entry for the recorded expression. The PR113831 fix handled the case where we can encode this in the operands themselves but this bug shows the issue is more widespread. So instead of altering the operands the following instead records this extra info that's possibly used, only throwing it away when the value-numbering didn't come up with a non-VARYING value which is an important detail to preserve CSE as opposed to constant folding which is where all cases currently known popped up. With this the original PR113831 fix can be reverted. PR tree-optimization/114121 * tree-ssa-sccvn.h (vn_reference_s::offset, vn_reference_s::max_size): New fields. (vn_reference_insert_pieces): Adjust prototype. * tree-ssa-pre.cc (phi_translate_1): Preserve offset/max_size. * tree-ssa-sccvn.cc (vn_reference_eq): Compare offset and size, allow using "don't know" state. (vn_walk_cb_data::finish): Pass along offset/max_size. (vn_reference_lookup_or_insert_for_pieces): Take offset and max_size as argument and use it. (vn_reference_lookup_3): Properly adjust offset and max_size according to the adjusted ao_ref. (vn_reference_lookup_pieces): Initialize offset and max_size. (vn_reference_lookup): Likewise. (vn_reference_lookup_call): Likewise. (vn_reference_insert): Likewise. (visit_reference_op_call): Likewise. (vn_reference_insert_pieces): Take offset and max_size as argument and use it. * gcc.dg/torture/pr114121.c: New testcase.
[Bug tree-optimization/113831] [11/12/13 Regression] Wrong VN with structurally identical ref since r9-398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113831 --- Comment #9 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:5c01ede02a1f9ba1a58ab8d96a73e46e0484d820 commit r14-9216-g5c01ede02a1f9ba1a58ab8d96a73e46e0484d820 Author: Richard Biener Date: Wed Feb 28 13:45:57 2024 +0100 tree-optimization/113831 - revert original fix This reverts the original fix for PR113831 which is better fixed by the PR114121 fix. I've XFAILed instead of removing the PR108355 testcase again. PR tree-optimization/113831 PR tree-optimization/108355 * tree-ssa-sccvn.cc (copy_reference_ops_from_ref): Revert PR113831 fix. * gcc.dg/tree-ssa/ssa-fre-104.c: XFAIL.
[Bug tree-optimization/108355] [13 Regression] Dead Code Elimination Regression at -O2 since r13-2772-g9baee6181b4e42
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108355 --- Comment #8 from GCC Commits --- The master branch has been updated by Richard Biener : https://gcc.gnu.org/g:5c01ede02a1f9ba1a58ab8d96a73e46e0484d820 commit r14-9216-g5c01ede02a1f9ba1a58ab8d96a73e46e0484d820 Author: Richard Biener Date: Wed Feb 28 13:45:57 2024 +0100 tree-optimization/113831 - revert original fix This reverts the original fix for PR113831 which is better fixed by the PR114121 fix. I've XFAILed instead of removing the PR108355 testcase again. PR tree-optimization/113831 PR tree-optimization/108355 * tree-ssa-sccvn.cc (copy_reference_ops_from_ref): Revert PR113831 fix. * gcc.dg/tree-ssa/ssa-fre-104.c: XFAIL.
[Bug tree-optimization/114121] wrong code with _BitInt() arithmetics at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114121 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #16 from Richard Biener --- Fixed.
[Bug tree-optimization/108355] [13/14 Regression] Dead Code Elimination Regression at -O2 since r13-2772-g9baee6181b4e42
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108355 Richard Biener changed: What|Removed |Added Keywords||xfail Summary|[13 Regression] Dead Code |[13/14 Regression] Dead |Elimination Regression at |Code Elimination Regression |-O2 since |at -O2 since |r13-2772-g9baee6181b4e42|r13-2772-g9baee6181b4e42 Known to work|14.0| --- Comment #9 from Richard Biener --- gcc.dg/tree-ssa/ssa-fre-104.c has been XFAILed.
[Bug libstdc++/114152] New: Wrong exception specifiers for LFTSv3 scope guard destructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152 Bug ID: 114152 Summary: Wrong exception specifiers for LFTSv3 scope guard destructors Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: victor at westerhu dot is Target Milestone: --- Created attachment 57560 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57560&action=edit Patch According to the (draft) specification of the C++ Extensions for Library Fundamentals, Version 3 (https://cplusplus.github.io/fundamentals-ts/v3.html#scopeguard.exit), the destructors of std::experimental::scope_{exit,failure} should be unconditionally noexcept. The destructor of std::experimental::scope_success should be noexcept if calling the exit function is noexcept. The current implementation has noexcept(noexcept(this->_M_exit_function)) for all three, which is wrong for all. It is even wrong for std::experimental::scope_success, because it's missing the needed `()' for actually testing the function call. This error is present since the first addition of the scope guards. I have attached the 3-line patch needed to fix this.
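The effect of the missing `()' described above can be sketched with two hypothetical guard types. These are not the libstdc++ scope guards: names_member_guard, calls_member_guard, may_throw and no_throw are made up for illustration, and the destructor bodies are left empty on purpose. The point is only that noexcept applied to an expression that merely names a member is always true, while noexcept applied to the call reflects whether the call can throw.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical exit functions: one may throw, one promises not to.
struct may_throw { void operator()() {} };
struct no_throw  { void operator()() noexcept {} };

// Mirrors the reported condition: the operand only names the member,
// which can never throw, so the destructor is always noexcept(true).
template <typename EF>
struct names_member_guard {
    EF exit_function;
    ~names_member_guard() noexcept(noexcept(this->exit_function)) {}
};

// With the call parentheses, the operand is the function call itself,
// so the specifier depends on whether calling EF can throw.
template <typename EF>
struct calls_member_guard {
    EF exit_function;
    ~calls_member_guard() noexcept(noexcept(this->exit_function())) {}
};

static_assert(std::is_nothrow_destructible_v<names_member_guard<may_throw>>,
              "naming a member is never a throwing expression");
static_assert(!std::is_nothrow_destructible_v<calls_member_guard<may_throw>>,
              "calling a potentially throwing function is detected");
static_assert(std::is_nothrow_destructible_v<calls_member_guard<no_throw>>,
              "calling a noexcept function is non-throwing");
```

Under this reading, only scope_success would want the second form; the report says scope_exit and scope_failure should be unconditionally noexcept.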
[Bug libstdc++/114153] New: std::less prefers operator const void*() over operator<=>() in C++20 mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114153 Bug ID: 114153 Summary: std::less prefers operator const void*() over operator<=>() in C++20 mode Product: gcc Version: 12.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: marc.mutz at hotmail dot com Target Milestone: --- std::less (and other related types like std::greater_equal, etc) is implemented in the following way: * if `operator<(T, U)` is defined for the argument types, it is called. * otherwise, if the argument types are convertible to `const volatile void *`, such conversion is performed, and it boils down to comparing the pointers. Now, assume a type which has an `operator const void *() const`, and provides `operator==()` and `operator<=>()` to generate all relational operators, the same way the std types do. So std::less will not use `operator<=>()`, but cast to `const void *` and compare pointers. This is wrong, because `operator<=>()` implies all relational operators, so it can be used to do the proper comparison. libc++ gets this right: // https://godbolt.org/z/E55eeosP9 // Courtesy of Ivan Solovev #include #include #include struct S { int val; S(int v) : val(v) {} operator const void *() const { std::cout << "cast\n"; return &val; } friend bool operator==(S lhs, S rhs) noexcept { std::cout << "op==\n"; return lhs.val == rhs.val; } friend std::strong_ordering operator<=>(S lhs, S rhs) noexcept { std::cout << "op<=>\n"; return lhs.val <=> rhs.val; } }; int main() { const S arr[] = {S{2}, S{1}}; // In C++20 mode it compares pointers, and so considers that arr[1] > arr[0], // which is wrong! return std::greater_equal<>{}(arr[0], arr[1]) ? 0 : 1; }
[Bug libstdc++/114153] std::less<> prefers operator const void*() over operator<=>() in C++20 mode
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114153 --- Comment #1 from Marc Mutz --- It's only the C++14 "diamond"/is_transparent version of std::less/greater_equal that is affected. If you replace the return from main with greater_equal{}, then it calls op<=>, too: // https://godbolt.org/z/cnjssh3ss return std::greater_equal{}(arr[0], arr[1]) ? 0 : 1; //^ added
[Bug libstdc++/114147] [11/12/13/14 Regression] tuple allocator-extended constructor requires non-explicit default constructor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114147 Jonathan Wakely changed: What|Removed |Added Known to work|14.0| Summary|[11/12/13 Regression] tuple |[11/12/13/14 Regression] |allocator-extended |tuple allocator-extended |constructor requires|constructor requires |non-explicit default|non-explicit default |constructor |constructor --- Comment #5 from Jonathan Wakely --- Ah it does include it, but it only affects C++20 and later. For older standards the original code is still used.
[Bug c++/92687] decltype of a structured binding to a tuple component is a reference type inside a template function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92687 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #5 from Jakub Jelinek --- Created attachment 57561 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57561&action=edit gcc14-pr92687.patch So far only very lightly tested fix. Another possibility would be instead of the cp/*.cc changes in the patch change lookup_decomp_type such that for NULL get it would return NULL_TREE, and either always or just if ptds.saved try to call lookup_decomp_type and return its result if it returned true, regardless of whether DECL_HAS_VALUE_EXPR_P or not. Guess that would be cleaner, but slower.
[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-28 Ever confirmed|0 |1 CC||amacleod at redhat dot com, ||rsandifo at gcc dot gnu.org Target Milestone|--- |14.0 --- Comment #1 from Richard Biener --- Do we have POLY_INT_CSTs in CHRECs? Huh, yeah - we do. So in IVOPTs the differences are like (get_scalar_evolution (scalar = col_i_61) - (scalar_evolution = {iftmp.0_11 * _105, +, iftmp.0_11}_2)) + (scalar_evolution = (int) {(unsigned int) col_stride_10 * (unsigned int) _105, +, (unsigned int) col_stride_10}_2)) but also (set_scalar_evolution instantiated_below = 22 (scalar = _58) - (scalar_evolution = {(__fp16 *) p_mat_16(D) + ((long unsigned int) _105 + (long unsigned int) (iftmp.0_11 * _105)) * 2, +, ((long unsigned int) iftmp.0_11 + 1) * 2}_2)) + (scalar_evolution = _58)) (that's completely missed) Likewise, with POLY_INT_CST: - (scalar_evolution = {(__fp16 *) p_mat_16(D) + (((long unsigned int) (iftmp.0_11 * _105) + (long unsigned int) _105) * 2 + POLY_INT_CST [16, 16]), +, ((long unsigned int) iftmp.0_11 + 1) * 2}_2)) + (scalar_evolution = _13)) The special-casing of CHREC * x we allow to be expressed works by looking at value-ranges and signs of INTEGER_CSTs: + if (!ANY_INTEGRAL_TYPE_P (type) + || TYPE_OVERFLOW_WRAPS (type) + || integer_zerop (CHREC_LEFT (op0)) + || (TREE_CODE (CHREC_LEFT (op0)) == INTEGER_CST + && TREE_CODE (CHREC_RIGHT (op0)) == INTEGER_CST + && (tree_int_cst_sgn (CHREC_LEFT (op0)) + == tree_int_cst_sgn (CHREC_RIGHT (op0 + || (get_range_query (cfun)->range_of_expr (rl, CHREC_LEFT (op0 ... possibly there might be a way to adapt the "same sign" check to also work for POLY_INT_CSTs which I think have known signs? Possibly rewriting that by using poly_int_tree_p () instead of checking for INTEGER_CST and then using known_lt (wi::to_poly_wide (), 0) && known_lt (..., 0) || known_gt (..., 0) && known_gt (..., 0) helps?
Nope, the following doesn't make a difference here. diff --git a/gcc/tree-chrec.cc b/gcc/tree-chrec.cc index 2e6c7356d3b..366ab914c8f 100644 --- a/gcc/tree-chrec.cc +++ b/gcc/tree-chrec.cc @@ -442,10 +442,12 @@ chrec_fold_multiply (tree type, if (!ANY_INTEGRAL_TYPE_P (type) || TYPE_OVERFLOW_WRAPS (type) || integer_zerop (CHREC_LEFT (op0)) - || (TREE_CODE (CHREC_LEFT (op0)) == INTEGER_CST - && TREE_CODE (CHREC_RIGHT (op0)) == INTEGER_CST - && (tree_int_cst_sgn (CHREC_LEFT (op0)) - == tree_int_cst_sgn (CHREC_RIGHT (op0 + || (poly_int_tree_p (CHREC_LEFT (op0)) + && poly_int_tree_p (CHREC_RIGHT (op0)) + && ((known_lt (wi::to_poly_widest (CHREC_LEFT (op0)), 0) + && known_lt (wi::to_poly_widest (CHREC_RIGHT (op0)), 0)) + || (known_ge (wi::to_poly_widest (CHREC_LEFT (op0)), 0) + && known_ge (wi::to_poly_widest (CHREC_RIGHT (op0)), 0 || (get_range_query (cfun)->range_of_expr (rl, CHREC_LEFT (op0)) && !rl.undefined_p () && (rl.nonpositive_p () || rl.nonnegative_p ()) This was a correctness fix btw, so I'm not sure we can easily recover - we could try using niter information for CHREC_VARIABLE but then there's variable niter here so I don't see a chance. This is mainly IVs like col_i_61 = col_stride_10 * j_73; _60 = (long unsigned int) col_i_61; _59 = _60 * 2; _58 = a_j_69 + _59; _54 = MEM <__SVFloat16_t> [(__fp16 *)_58]; where we compose for example the scalar evolution of col_i_61 by multiplying that of j_73 which is {_105, +, 1}_2 with col_stride_10. Possibly adding a ranger instance to IVOPTs could help, for this instance since [local count: 118111600]: # col_stride_10 = PHI if (size_15(D) > 0) goto ; [89.00%] else goto ; [11.00%] [local count: 118111600]: return; so col_stride_10 should be positive, and _105 as well: _12 = MAX_EXPR <_103, 0>; _3 = (unsigned int) _12; _4 = _3 + 1; _105 = (int) _4; OTOH the +1 could make it overflow for large size. Can you test the above? It should be an incremental improvement. 
Adding enable_ranger (cfun); / disable_ranger (cfun); around the IVOPTs pass doesn't seem to help (but see above - there might not be enough info, also the code added doesn't pass in a context stmt so ranger might not do much/anything here). Confirmed.
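For readers unfamiliar with the "same sign" test discussed in comment #1, here is a minimal, self-contained sketch of what a known-sign check on a POLY_INT_CST-like value could look like. It assumes a two-coefficient model in which `[c0, c1]` stands for `c0 + c1 * x` with an unknown runtime `x >= 0`. The names `Poly2`, `known_lt_zero`, `known_ge_zero` and `same_sign_known` are hypothetical and are not GCC's `poly_int` API; the sketch only mirrors the shape of the `known_lt` / `known_ge` condition in the patch above.

```cpp
#include <cassert>

// Hypothetical two-coefficient model of a POLY_INT_CST [c0, c1]:
// the value is c0 + c1 * x for some runtime x >= 0.
// This is an illustration only, NOT GCC's poly_int class.
struct Poly2 {
  long c0;  // constant part
  long c1;  // multiplier of the runtime indeterminate x
};

// True only if c0 + c1 * x < 0 holds for every x >= 0.
bool known_lt_zero(const Poly2 &p) { return p.c0 < 0 && p.c1 <= 0; }

// True only if c0 + c1 * x >= 0 holds for every x >= 0.
bool known_ge_zero(const Poly2 &p) { return p.c0 >= 0 && p.c1 >= 0; }

// Mirrors the shape of the patched condition: both operands are known
// negative, or both are known non-negative. An ordinary integer constant
// is simply the c1 == 0 case.
bool same_sign_known(const Poly2 &left, const Poly2 &right) {
  return (known_lt_zero(left) && known_lt_zero(right))
         || (known_ge_zero(left) && known_ge_zero(right));
}
```

Under this model `[16, 16]` is known non-negative, while a value such as `[-4, 8]` has no known sign because it changes sign as `x` grows, so the check has to answer "no" for it.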
[Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since g:a0b1798042d033fd2cc2c806afbb77875dd2909b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 --- Comment #2 from Richard Biener --- Yep, it seems to only pick up global ranges that way. diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc index 7cae5bdefea..626fc5bf5d7 100644 --- a/gcc/tree-ssa-loop-ivopts.cc +++ b/gcc/tree-ssa-loop-ivopts.cc @@ -132,6 +132,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-vectorizer.h" #include "dbgcnt.h" #include "cfganal.h" +#include "gimple-range.h" /* For lang_hooks.types.type_for_mode. */ #include "langhooks.h" @@ -8280,6 +8281,8 @@ tree_ssa_iv_optimize (void) tree_ssa_iv_optimize_init (&data); mark_ssa_maybe_undefs (); + enable_ranger (cfun); + /* Optimize the loops starting with the innermost ones. */ for (auto loop : loops_list (cfun, LI_FROM_INNERMOST)) { @@ -8292,6 +8295,8 @@ tree_ssa_iv_optimize (void) tree_ssa_iv_optimize_loop (&data, loop, toremove); } + disable_ranger (cfun); + /* Remove eliminated IV defs. */ release_defs_bitset (toremove);
[Bug libstdc++/114152] Wrong exception specifiers for LFTSv3 scope guard destructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152 Jonathan Wakely changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2024-02-28 --- Comment #1 from Jonathan Wakely --- Thanks - please send patches to the mailing list instead of attaching them here. https://gcc.gnu.org/contribute.html#patches See https://gcc.gnu.org/contribute.html#legal as well.
[Bug libstdc++/114152] Wrong exception specifiers for LFTSv3 scope guard destructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152 --- Comment #2 from Victor --- Will do!
[Bug libstdc++/114152] Wrong exception specifiers for LFTSv3 scope guard destructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152 Jonathan Wakely changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org Target Milestone|--- |13.3 Status|NEW |ASSIGNED
[Bug tree-optimization/96147] [11 regression] gcc.dg/vect/slp-43.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96147 Rainer Orth changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #10 from Rainer Orth --- Unfortunately, I missed that one of those tests still XPASSes: XPASS: gcc.dg/vect/bb-slp-32.c -flto -ffat-lto-objects scan-tree-dump slp2 "vectorization is not profitable" XPASS: gcc.dg/vect/bb-slp-32.c scan-tree-dump slp2 "vectorization is not profitable"
[Bug tree-optimization/96147] [11 regression] gcc.dg/vect/slp-43.c etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96147 --- Comment #11 from Rainer Orth --- Created attachment 57562 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57562&action=edit 32-bit sparc-sun-solaris2.11 bb-slp-32.c.191t.slp2
[Bug libstdc++/114152] Wrong exception specifiers for LFTSv3 scope guard destructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152 --- Comment #3 from Jonathan Wakely --- I can take care of it this time, since I think I can figure out how to fix it just from your detailed report :-) and I've already written a testcase:
  // { dg-do compile { target c++20 } }
  // PR libstdc++/114152
  // Wrong exception specifiers for LFTSv3 scope guard destructors
  #include <experimental/scope>
  using namespace std::experimental;
  struct F { void operator()() noexcept(false); };
  static_assert( noexcept(std::declval<scope_exit<F>&>().~scope_exit()) );
  static_assert( noexcept(std::declval<scope_fail<F>&>().~scope_fail()) );
  static_assert( ! noexcept(std::declval<scope_success<F>&>().~scope_success()) );
  struct G { void operator()() noexcept(true); };
  static_assert( noexcept(std::declval<scope_exit<G>&>().~scope_exit()) );
  static_assert( noexcept(std::declval<scope_fail<G>&>().~scope_fail()) );
  static_assert( noexcept(std::declval<scope_success<G>&>().~scope_success()) );
[Bug libstdc++/114152] Wrong exception specifiers for LFTSv3 scope guard destructors
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114152 --- Comment #4 from GCC Commits --- The master branch has been updated by Jonathan Wakely : https://gcc.gnu.org/g:80c386cb20d38ebc55f30a79418fabfbed904b87 commit r14-9219-g80c386cb20d38ebc55f30a79418fabfbed904b87 Author: Jonathan Wakely Date: Wed Feb 28 14:45:18 2024 + libstdc++: Fix noexcept on dtors in <experimental/scope> [PR114152] The PR points out that the destructors all have incorrect noexcept-specifiers. libstdc++-v3/ChangeLog: PR libstdc++/114152 * include/experimental/scope (scope_exit scope_fail): Make destructor unconditionally noexcept. (scope_sucess): Fix noexcept-specifier. * testsuite/experimental/scopeguard/114152.cc: New test.
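The behaviour that the testcase in comment #3 checks can be illustrated with a small standalone sketch. The classes `sketch_scope_exit` and `sketch_scope_success` and the helper `run_guards` below are hypothetical stand-ins written for this note, not the libstdc++ implementation; they assume C++17 and only model the destructor exception specifications named in the ChangeLog (unconditionally noexcept for the exit guard, dependent on the callable for the success guard).

```cpp
#include <cassert>
#include <exception>
#include <type_traits>
#include <utility>

// Hypothetical guard: the destructor is unconditionally noexcept,
// whatever the exception specification of the callable is.
template <typename EF>
class sketch_scope_exit {
  EF fn_;
public:
  explicit sketch_scope_exit(EF fn) : fn_(std::move(fn)) {}
  ~sketch_scope_exit() noexcept { fn_(); }
};

// Hypothetical guard: runs the callable only when no new exception is in
// flight, and the destructor is noexcept only if calling fn_ is noexcept.
template <typename EF>
class sketch_scope_success {
  EF fn_;
  int uncaught_ = std::uncaught_exceptions();
public:
  explicit sketch_scope_success(EF fn) : fn_(std::move(fn)) {}
  ~sketch_scope_success() noexcept(noexcept(std::declval<EF&>()())) {
    if (std::uncaught_exceptions() <= uncaught_)
      fn_();
  }
};

struct MayThrow { void operator()() noexcept(false) {} };
struct NoThrow  { void operator()() noexcept {} };

// Same expectations as the testcase, expressed through a type trait.
static_assert(std::is_nothrow_destructible_v<sketch_scope_exit<MayThrow>>);
static_assert(!std::is_nothrow_destructible_v<sketch_scope_success<MayThrow>>);
static_assert(std::is_nothrow_destructible_v<sketch_scope_exit<NoThrow>>);
static_assert(std::is_nothrow_destructible_v<sketch_scope_success<NoThrow>>);

// Runs both guards on a normal scope exit and reports how many fired.
int run_guards() {
  int calls = 0;
  auto bump = [&calls]() noexcept { ++calls; };
  {
    sketch_scope_exit<decltype(bump)> on_exit(bump);
    sketch_scope_success<decltype(bump)> on_success(bump);
  }
  return calls;
}
```

On a normal (non-exceptional) scope exit both guards fire, so `run_guards()` counts two calls; the static assertions carry the actual point about the noexcept-specifiers.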
[Bug tree-optimization/113431] [14 Regression] Wrong code at -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113431 Rainer Orth changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED --- Comment #19 from Rainer Orth --- The SPARC dump suggests /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr113431.c:12:15: missed: unsupported unaligned access /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.dg/vect/pr113431.c:12:15: missed: not vectorized: relevant stmt not supported: a[0][1] = _60; that the test needs vect_hw_misalign?
[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 62283, which changed state. Bug 62283 Summary: basic-block vectorization fails https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62283 What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |---
[Bug fortran/62283] basic-block vectorization fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62283 Rainer Orth changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED --- Comment #31 from Rainer Orth --- gcc.dg/vect/vect-33.c still FAILs on 32 and 64-bit SPARC: FAIL: gcc.dg/vect/vect-33.c -flto -ffat-lto-objects scan-tree-dump-not optimized "Invalid sum" FAIL: gcc.dg/vect/vect-33.c scan-tree-dump-not optimized "Invalid sum"
[Bug fortran/62283] basic-block vectorization fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62283 --- Comment #32 from Rainer Orth --- Created attachment 57563 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57563&action=edit 32-bit sparc-sun-solaris2.11 vect-33.c.265t.optimized
[Bug tree-optimization/114154] New: gcc.dg/vect/vect-alias-check-1.c XPASSes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114154 Bug ID: 114154 Summary: gcc.dg/vect/vect-alias-check-1.c XPASSes Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: ro at gcc dot gnu.org Target Milestone: --- Target: sparc*-sun-solaris2.11 vect-alias-check-1.c XPASSes on 32 and 64-bit SPARC since 20200614: XPASS: gcc.dg/vect/vect-alias-check-1.c -flto -ffat-lto-objects scan-tree-dump vect "using an address-based overlap test" XPASS: gcc.dg/vect/vect-alias-check-1.c scan-tree-dump vect "using an address-based overlap test"
[Bug tree-optimization/114154] gcc.dg/vect/vect-alias-check-1.c XPASSes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114154 --- Comment #1 from Rainer Orth --- Created attachment 57564 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57564&action=edit 32-bit sparc-sun-solaris2.11 vect-alias-check-1.c.179t.vect
[Bug testsuite/102954] [12/13/14 regression] gcc.dg/vect/pr33804.c XPASSes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102954 --- Comment #6 from Rainer Orth --- Created attachment 57565 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57565&action=edit 64-bit sparc-sun-solaris2.11 pr33804.c.179t.vect The issue persists as of 20240228.
[Bug testsuite/102954] [12/13/14 regression] gcc.dg/vect/pr33804.c XPASSes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102954 --- Comment #7 from Rainer Orth --- Created attachment 57566 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57566&action=edit 64-bit sparc-sun-solaris2.11 slp-multitypes-3.c.179t.vect
[Bug testsuite/113685] [14 regression] gcc.dg/vect/vect-117.c fails profile checking with Invalid sum after r14-4089-gd45ddc2c04e471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113685 Rainer Orth changed: What|Removed |Added CC||ro at gcc dot gnu.org Host|powerpc64le-linux-gnu |powerpc64le-linux-gnu, ||sparc*-sun-solaris2.11 Build|powerpc64le-linux-gnu |powerpc64le-linux-gnu, ||sparc*-sun-solaris2.11 Target|powerpc64le-linux-gnu |powerpc64le-linux-gnu, ||sparc*-sun-solaris2.11 --- Comment #2 from Rainer Orth --- The same issue exists on 64-bit Solaris/SPARC: +FAIL: gcc.dg/vect/vect-117.c -flto -ffat-lto-objects scan-tree-dump-not optimized "Invalid sum" +FAIL: gcc.dg/vect/vect-117.c scan-tree-dump-not optimized "Invalid sum"
[Bug testsuite/113685] [14 regression] gcc.dg/vect/vect-117.c fails profile checking with Invalid sum after r14-4089-gd45ddc2c04e471
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113685 --- Comment #3 from Rainer Orth --- Created attachment 57567 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57567&action=edit 64-bit sparc-sun-solaris2.11 vect-117.c.265t.optimized
[Bug target/112868] GCC passes -many to the assembler for --enable-checking=release builds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112868 --- Comment #7 from Sam James --- (In reply to Peter Bergner from comment #6) Thanks Peter. We're happy to help with that in Gentoo. If you remember, please CC me on the patch and we'll give it a spin.
[Bug fortran/114141] ASSOCIATE and complex part ref when associate target is a function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114141 kargl at gcc dot gnu.org changed: What|Removed |Added Priority|P3 |P4 --- Comment #3 from kargl at gcc dot gnu.org --- (In reply to Jerry DeLisle from comment #2) > It looks like the 'selector' in this case is an expr. > > The expr must be a pointer object or a 'designator' > > A designator must be: > > R901 > designator > > object-name > array-element > array-section > coindexed-named-object > complex-part-designator > structure-component > substring > > I am not seeing the expr in the example as one of these listed. ??? Yep, agreed. I went back and re-read the section about ASSOCIATE. Not sure how I convinced myself that a constant expression, which reduces to a constant, is okay. I suppose the question is "do we generate a better error message or simply close the PR?"