[Bug ipa/116068] [15 Regression] ICE: in bitmap_alloc, at bitmap.cc:785 with -Os -flto -ffat-lto-objects -floop-parallelize-all

2025-01-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116068 --- Comment #5 from Jan Hubicka --- > > ? All the other spots that execute some pass list in cgraphunit.cc wrap > > that > > with bitmap_obstack_initialize/release. > > That looks correct to me. Looks good to me too. Does double-initializin

[Bug tree-optimization/115825] [12/13/14 Regression] Loop unrolling increases code size with -Os

2025-01-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825 --- Comment #22 from Jan Hubicka --- > /* If there is pure/const call in the function, then we can > still optimize the unrolled loop body if it contains some > other interesting code than the calls and code s

[Bug ipa/117892] [15 Regression] ICE on valid code at -O1 and above on x86_64-linux-gnu: in single_succ_edge, at basic-block.h:332 since r15-5336-gcee7d080d5c2a5

2024-12-17 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117892 --- Comment #4 from Jan Hubicka --- > Deleted dead store: # .MEM_5 = VDEF <.MEM_3(D)> > That started in GCC 12. That is weird. I would expect CFG verification run between passes to catch this...

[Bug libstdc++/94960] extern template prevents inlining of standard library objects

2024-12-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960 --- Comment #14 from Jan Hubicka --- > Could we just add 'inline' to the functions that are 'constexpr' in later > standards? It would make sense to me - that would reduce differences between codegens with different -std= options. Also we may use

[Bug libstdc++/94960] extern template prevents inlining of standard library objects

2024-12-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960 --- Comment #15 from Jan Hubicka --- > Oh, sorry, that was linked earlier. But still, isn't the problem that "inline" > is too strong? Do we have some data on this? I plan to do some inliner benchmarking over christmas like I do every year. Wit

[Bug c++/103827] function which takes an argument via (hidden) reference should assume the argument does not escape or is only read from

2024-12-10 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827 --- Comment #16 from Jan Hubicka --- > > Note that this is the same for non-parameter local variables > > Just want to emphasize this point: this property is in no way specific to > parameters, it applies to any object created as const. If som

[Bug c++/103827] function which takes an argument via (hidden) reference should assume the argument does not escape or is only read from

2024-12-10 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827 --- Comment #13 from Jan Hubicka --- > Yes, that object is defined const so can't be changed. But is this something > we > care about? Is it important to apply this optimization to noinline functions? There are few things where this helps. Fir

[Bug rtl-optimization/117964] duplicate computed gotos will happily duplicate blocks with 9189 successors

2024-12-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117964 --- Comment #4 from Jan Hubicka --- > r5-1621-gfc56f9d2843266 last moved the pass earlier Doing kind of a fake "PHI" basic block to factor out the edges is possible and perhaps a good idea which I did not think of while working on original CFG c

[Bug c++/103827] function which takes an argument via (hidden) reference should assume the argument does not escape or is only read from

2024-12-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827 --- Comment #7 from Jan Hubicka --- > > > What about escape bits? Is it OK to save the address to global memory > > > and then check it in the destructor? > > > > Yes, but does that matter? After the function returns the pointer is invalid > >

[Bug c++/103827] function which takes an argument via (hidden) reference should assume the argument does not escape or is only read from

2024-12-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827 --- Comment #4 from Jan Hubicka --- > That would be undefined, because s is defined const and so doing const_cast > and > then modifying it is undefined behaviour. However, this would be fine: Cool, then I will look into getting modref and PTA

[Bug libstdc++/87502] Poor code generation for std::string("c-style string")

2024-12-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87502 --- Comment #14 from Jan Hubicka --- > > So I think all we can hope for is merging memcpy with the extra write of 0. > > That's not actually clear. > > It would be reasonable to assume that foo isn't likely to change the string > and have the i

[Bug libstdc++/87502] Poor code generation for std::string("c-style string")

2024-12-08 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87502 --- Comment #12 from Jan Hubicka --- > (In reply to Jakub Jelinek from comment #10) > > __builtin_memcpy (&D.35539.D.25336._M_local_buf, "abc", 3); > > MEM[(char_type &)&D.35539 + 11] = 0; > > change to > > __builtin_memcpy (&D.35539.D.2533

[Bug target/117957] [15 regression] vectorization pesimises std::vector push/pop test

2024-12-08 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117957 --- Comment #5 from Jan Hubicka --- > I suspect the issue is very similar (or the same) as PR 115777 . Yep, I think it is store-to-load forwarding. The stack is organized in pairs that are likely written independently and loaded together. Sadly I t

[Bug tree-optimization/117935] [14/15 Regression] [[likely]] attribute is lost by phiopt1 in some cases since r14-203-ga2339e0fe9dbef

2024-12-08 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117935 --- Comment #5 from Jan Hubicka --- Note that propagation of branch probabilities from callee to caller works only by kind of accident. I originally made branch prediction to be done after early inlining since it makes some patterns branch pred

[Bug tree-optimization/117924] unused std::vector are not optimized out fully at gimple level

2024-12-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117924 --- Comment #3 from Jan Hubicka --- Actually the main problem is that copying of bitvectors is done by loop copying every bit individually. This loop stays until loop optimizers and then we are quite late in optimization. Have patch for that.

[Bug tree-optimization/117875] [15 Regression] 28% regression for 456.hmmer on Zen4 with -Ofast -march=native

2024-12-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875 --- Comment #9 from Jan Hubicka --- > > But maybe I'm missing something? > > I guess the issue is that with > > # k_24 = PHI <1(13), k_29(16)> > > to easily see this we'd have to compute the range of > (unsigned int) M_9(D) - 1 and the range

[Bug target/117088] [15 regression] 548.exchange_r regressed by ~11% with -O2 -march=x86-64-v3 on EMR after r15-4225-g70c3db511ba14f

2024-12-03 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117088 --- Comment #6 from Jan Hubicka --- > void digits_2.isra (integer(kind=4) ISRA.6607) > { > integer(kind=4) ISRA.6607_927(D) = ISRA.6607; > ... > # RANGE [irange] integer(kind=4) [-2147483647, 8][10, +INF] > _494 = ISRA.6607_927(D) + 1; >

[Bug target/117088] [15 regression] 548.exchange_r regressed by ~11% with -O2 -march=x86-64-v3 on EMR after r15-4225-g70c3db511ba14f

2024-11-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117088 --- Comment #3 from Jan Hubicka --- > digits_2.isra (1); > > so we at least know row is [1, +INF] since the add is signed. > > We might be able to use a SCEV-like range computation for recursive cases like > this, then being able to compute

[Bug ipa/86590] Codegen is poor when passing std::string by value with _GLIBCXX_EXTERN_TEMPLATE undefined

2024-11-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86590 --- Comment #38 from Jan Hubicka --- > _M_create is at line 144 of basic_string.tcc It is not visible to middle-end though. If you check gimple dump, there are calls jan@padlo:/tmp> grep _M_create a-tt.C.*gimple _4 = std::__cxx11::basic_st

[Bug tree-optimization/117793] missed copy propagation across memcpy

2024-11-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117793 --- Comment #2 from Jan Hubicka --- > This is aggregate copy prop. What we could do is replace the last copy > by > > __builtin_memcpy (_108, "this text is longer than 15 characters", 38); > > but this might be a pessimization in case none of

[Bug tree-optimization/117764] [15 Regression] cddce should handle __builtin_unreachable guards

2024-11-25 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117764 --- Comment #6 from Jan Hubicka --- > But the inlining argument basically says CDDCE shouldn't handle > __builtin_unreachable control stmts optimistically given a use could appear > only after inlining ... doesn't this then imply WONTFIX? I am

[Bug tree-optimization/117764] [15 Regression] cddce should handle __builtin_unreachable guards

2024-11-24 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117764 --- Comment #3 from Jan Hubicka --- > I don't think IPA-SRA does that. Is this something that is happening in the > testcase from the bug summary? Do I need to use some inlining parameters to > reproduce it? Problem is that at ipa analysis we

[Bug libstdc++/109442] Dead local copy of std::vector not removed from function

2024-11-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442 --- Comment #28 from Jan Hubicka --- > vector::size() is called **very often** so needs to be as fast as possible. > Does this still inline identically? Last year I made patch for inliner to ignore conditions guarding __builtin_unreachable. Ric

[Bug libstdc++/109442] Dead local copy of std::vector not removed from function

2024-11-12 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442 --- Comment #25 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442 > > --- Comment #24 from Jason Merrill --- > (In reply to Jan Hubicka from comment #23) > > So I guess we are missing somewhere __builtin_assert that th

[Bug lto/116535] LTO partitioning vs. offloading compilation

2024-09-03 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116535 --- Comment #7 from Jan Hubicka --- > void > output_offload_tables (void) > { > ... > > /* In WHOPR mode during the WPA stage the joint offload tables need to be > streamed to one partition only. That's why we free offload_funcs and

[Bug ipa/116410] modref doesn't generate LTO summaries with -ffat-lto-objects (-ffat-lto-objects generates different and inefficient code compared with -fno-fat-lto-objects)

2024-09-03 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116410 --- Comment #11 from Jan Hubicka --- > We plan to adopt -ffat-lto-objects ourselves soon for at least a subset of > packages, so this was good timing. :) Note that -ffat-lto-objects has various issues, especially with library archives. The prob

[Bug tree-optimization/115679] inlining failed in call to 'foo': function not considered for inlining

2024-06-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115679 --- Comment #2 from Jan Hubicka --- > With -Og it's usually that the always-inline function is called indirectly - > that's an unsupported case. We can probably add CIF code for functions that were called indirectly but are no more, so this is r

[Bug ipa/114531] Feature proposal for an `-finline-functions-aggressive` compiler option

2024-06-25 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531 --- Comment #18 from Jan Hubicka --- > different issue from the one that is raised in the PR. (Unless we think that > -O2 and -O3 should always have the same inlining heuristics henceforward, but > that seems unlikely.) Yes, I think point of -

[Bug ipa/114531] Feature proposal for an `-finline-functions-aggressive` compiler option

2024-06-25 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531 --- Comment #14 from Jan Hubicka --- As for a bit of history on this: I introduced the split -O2 and -O3 limits in order to be able to enable -finline-small-functions at -O2, which we found to be really important for C++ codebases which no lo

[Bug ipa/114531] Feature proposal for an `-finline-functions-aggressive` compiler option

2024-06-25 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531 --- Comment #12 from Jan Hubicka --- If this is without LTO, can you also try the LTO numbers? Inliner behaves significantly differently with and without LTO, since LTO introduces many (and often too many) inlining opportunities, which sometimes ma

[Bug c++/110137] implement clang -fassume-sane-operator-new

2024-06-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110137 --- Comment #15 from Jan Hubicka --- > As pointed out in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035#c13 , > gcc > already assume operator new's returned pointer cannot alias any existing > pointer. So no change is needed there. Seems yo

[Bug c++/110137] implement clang -fassume-sane-operator-new

2024-06-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110137 --- Comment #13 from Jan Hubicka --- > Is the option supposed to be only about the standard global scope operator > new/delete (_Znam etc.) or also user operator new/delete class methods? If > the > former, then I agree it is a global property

[Bug c++/110137] implement clang -fassume-sane-operator-new

2024-06-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110137 --- Comment #9 from Jan Hubicka --- Doing global flag has a problem since with LTO or using optimize attribute user may mix code compiled with and without sane operator new. When function with insane operator new gets inlined to a function wit

[Bug middle-end/115277] [13/14/15 regression] ICF needs to match loop bound estimates

2024-05-30 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277 --- Comment #3 from Jan Hubicka --- > What about gcc 13? GCC 13 also misoptimizes. Honza

[Bug ipa/109914] --suggest-attribute=pure misdiagnoses static functions

2024-05-26 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109914 --- Comment #5 from Jan Hubicka --- > (In reply to Jan Hubicka from comment #2) > > The reason why gcc warns is that it is unable to prove that the function is > > always finite. > > I don't see why finiteness matters. If a pure function return

[Bug ipa/96059] ICE: in remove_unreachable_nodes, at ipa.c:575 with -fdevirtualize-at-ltrans

2024-05-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96059 --- Comment #7 from Jan Hubicka --- > Actually, let me drop the PR59859 blocker, as IIRC we've had reports of this > downstream w/o graphite. I think you edited wrong PR here.

[Bug ipa/115097] Strange suboptimal codegen specifically at -O2 when copying struct type

2024-05-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115097 --- Comment #7 from Jan Hubicka --- > and then we inline them back, introducing the extra copy. Why do we use > tail-calls here instead of aliases? Why do we lack cost modeling here? Because the function is exported and we must keep addresses

[Bug libstdc++/109442] Dead local copy of std::vector not removed from function

2024-05-14 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442 --- Comment #21 from Jan Hubicka --- This patch attempts to add __builtin_operator_new/delete. So far they are not optimized, which will need to be done by extra flag of BUILT_IN_ code. also the decl.cc code can be refactored to be less of cut&

[Bug tree-optimization/114959] incorrect TBAA for drived types involving function types

2024-05-07 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114959 --- Comment #4 from Jan Hubicka --- > > I think function types are somewhat special in that they do not denote > objects in the classical sense. They are also most complex and probably > target-dependent to handle. > > Note there's LTO where

[Bug tree-optimization/114774] Missed DSE in simple code due to interleaving sotres

2024-04-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774 --- Comment #5 from Jan Hubicka --- > > Looking into it, instead of having simple outer loop it needs to > > maintain worklist of defs to proceed each annotated with live bitmap, > > right? > > Yeah, I have some patch on some branch somewhere ..

[Bug tree-optimization/114774] Missed DSE in simple code due to interleaving sotres

2024-04-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774 --- Comment #3 from Jan Hubicka --- > Yes, DSE walking doesn't "branch" but goes to some length handling some > trivial > branches only. Mainly to avoid compile-time issues. It needs larger > re-structuring to fix that, but in principle it sh

[Bug ipa/114703] Missed devirtualization in rather simple case

2024-04-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114703 --- Comment #3 from Jan Hubicka --- > Yep, 'new' memory escapes. Yep, this is blocking a lot of propagation in common C++ code. Here it may help to do speculative devirtualization during IPA stage that will let the late optimization to get rid o

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-04-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #76 from Jan Hubicka --- There is still problem with loop bounds. I am testing patch on that and then we should be (finally) finally safe.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-03 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 --- Comment #15 from Jan Hubicka --- > Fixed for GCC 14 so far It is simple patch, so backporting is OK after a week in mainline.

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-04-02 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #70 from Jan Hubicka --- Hello, over easter I did some analysis of the cases where ICF is now disabled due to jump function miscompare. Most common case (seen also on GCC) is the situation where function is originally static inline

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303 --- Comment #14 from Jan Hubicka --- > This patch fixes the ICE for me. > Seems we already did something like that in other spots (e.g. in apply_scale). In general if the overflow happens, some pass must have misbehaved and do something crazy w

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-03-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #64 from Jan Hubicka --- > Are you going to apply this patch, even if it just helps partially with some > tests and not others? I think we should fix this completely, since it is a source of very surprising bugs. I discussed it with Ma

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-03-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #57 from Jan Hubicka --- > So, we can punt on differences there (that is desirable for backporting and > maybe GCC 14 too), or we could at that point populate an int vector, which > maps Yep, that is what I do. I had bug in that so

[Bug ipa/114317] Missing optimization for multiple condition statements

2024-03-12 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114317 --- Comment #2 from Jan Hubicka --- > (it would need to elide the stores of course). We do have a way to elide stores, since we can optimize out write-only values. What we do not have readily available is the value written to a reference (ipa-r

[Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function

2024-03-07 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262 --- Comment #6 from Jan Hubicka --- > Note GCC has not retuned its -Os heuristics for a long time because it has been > decent enough for most folks and corner cases like this is almost never come > up. There were quite few changes to -Os heurist

[Bug lto/114241] False-positive -Wodr warning when using -flto and -fno-semantic-interposition

2024-03-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114241 --- Comment #2 from Jan Hubicka --- This indeed looks like a bug caused by the fact that the class is keyed into one of the two units. Outputting translation unit names is unfortunately hard, since they are object files and often coming from .a arch

[Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232 --- Comment #26 from Jan Hubicka --- > I think optimize_function_for_size_p (cfun) isn't always true if > optimize_size is since it looks at the function-specific setting > of that flag, so you'd have to use opt_for_fn (cfun, optimize_size). Wh

[Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232 --- Comment #21 from Jan Hubicka --- Looking at the prototype patch, why need to change also the splitters? My original goal was to use splitters to expand to faster code sequences while having patterns necessary for both variants. This makes

[Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232 --- Comment #18 from Jan Hubicka --- optimize_function_for_size_p is not really affected by LTO or non-LTO. It does take into account node->count and node->frequency, which is updated during IPA, so it may change between early opts and late opt

[Bug tree-optimization/114052] [11/12/13/14 Regression] Wrong code at -O2 for well-defined infinite loop

2024-02-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114052 --- Comment #7 from Jan Hubicka --- > I see it doesn't do anything if mark_dfs_back_edges returns false, so it > will claim the function is finite even when it calls a non-finite function? > So I assume this is local analysis only and call edges

[Bug ipa/111960] [14 Regression] ICE: during GIMPLE pass: rebuild_frequencies: SIGSEGV (Invalid read of size 4) with -fdump-tree-rebuild_frequencies-all

2024-02-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111960 --- Comment #13 from Jan Hubicka --- > Should be fixed now. Thanks! I was testing with stage3 compiler, so that is the reason. Indeed dropping the buffer is a good idea.

[Bug middle-end/113907] [12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #45 from Jan Hubicka --- > > "Once legacy evrp is removed, this won't be an issue, as ranges in the IL > > will tell the truth. However, this will mean that we will no longer > > remove the first __builtin_unreachable combo. But

[Bug middle-end/113907] [12/13/14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #43 from Jan Hubicka --- > // See discussion here: > // https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571709.html Discussion says: "Once legacy evrp is removed, this won't be an issue, as ranges in the IL will tell the truth.

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #37 from Jan Hubicka --- > Also remember we like to have a fix that's easily backportable, and > that's probably going to be resetting the info. We can do something > more fancy for GCC 15 Rejecting to merge function with different

[Bug middle-end/113907] [14 regression] ICU miscompiled since on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #36 from Jan Hubicka --- > > Having a testcase is great. I was just playing with crafting one. > > I am still concerned about value ranges in ipa-prop's jump functions. > > Maybe my imagination is too limited, but if the ipa-prop's

[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-14 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 --- Comment #19 from Jan Hubicka --- > Note I didn't check if it helps the testcase .. I will check. > > > > > > > A "nicer" solution might be to add a informational operand > > > to TARGET_MEM_REF, representing the base pointer to be used fo

[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-14 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 --- Comment #17 from Jan Hubicka --- > > I guess PTA gets around by tracking points-to set also for non-pointer > > types and consequently it also gives up on any such addition. > > It does. But note it does _not_ for POINTER_PLUS where it tre

[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 --- Comment #15 from Jan Hubicka --- > > IVOPTs does the above but it does it (or should) as > > offset = (uintptr)&base2 - (uintptr)&base1; > val = *((T *)((uintptr)base1 + i + offset)) > > which is OK for points-to as no POINTER_PLUS_EX

[Bug gcov-profile/113646] PGO hurts run-time of 538.imagick_r as much as 68% at -Ofast -march=native

2024-02-01 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646 --- Comment #4 from Jan Hubicka --- > > With -fprofile-partial-training the znver4 LTO vs LTOPGO regression (on a > newer > master) goes down from 66% to 54%. > > So far I did not find a way to easily train with the reference run (when I ad

[Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a

2024-01-30 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665 --- Comment #8 from Jan Hubicka --- > Honza - ICF seems to fixup points-to sets when merging variables, so there > should be a way to kill off flow-sensitive info inside prevailing bodies > as well. But would that happen before inlining the bod

[Bug gcov-profile/113646] PGO hurts run-time of 538.imagick_r as much as 68% at -Ofast -march=native

2024-01-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646 --- Comment #2 from Jan Hubicka --- > Did you try with -fprofile-partial-training (is that default on? it probably > should ...). Can you please try training with the rate data instead of train It is not on by default - the problem of partial

[Bug ipa/113478] -Os does not inline single instruction function

2024-01-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113478 --- Comment #4 from Jan Hubicka --- > Possibly, at least when we know it doesn't expand to a libatomic call? OTOH > even then a function just wrapping such call should probably be inlined, > so the question is whether the problem that > is esti

[Bug ipa/113478] -Os does not inline single instruction function

2024-01-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113478 --- Comment #2 from Jan Hubicka --- Probably is_inexpensive_bulitin_p should return true here?

[Bug c++/109753] [13/14 Regression] pragma GCC target causes std::vector not to compile (always_inline on constructor)

2024-01-11 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109753 --- Comment #14 from Jan Hubicka --- > I think the issue might be that whoever is creating > __static_initialization_and_destruction_0 fails to honor the active > target pragma. Which means back to my suggestion to have multiple ones > when dif

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #14 from Jan Hubicka --- > I thought the goal was to handle what is in predict-18.c, i.e. > b * __builtin_expect (c, 0) > or similar. If it is about > __builtin_expect_with_probability (b, 42, 0.25) * > __builtin_expect_with_probabi

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #11 from Jan Hubicka --- > > + int p1 = get_predictor_value (*predictor, *probability); > > + int p2 = get_predictor_value (predictor2, probability2); > > + /* If both predictors agrees, it does not matter fro

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #9 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 > > --- Comment #7 from Jakub Jelinek --- > So, what about following patch (which also fixes the ICE, would of course need > to add the testcase) and doe

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #6 from Jan Hubicka --- > which fixes the ICE by preferring PRED_BUILTIN_EXPECT* over others. > At least in this case when one operand is a constant and another one is > __builtin_expect* result that seems like the right choice to me

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233 --- Comment #3 from Jan Hubicka --- > Confirm. But option save/restore has been always implemented: > > .section.gnu.lto_.opts,"",@progbits > .ascii "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection" > .ascii "=none'

[Bug middle-end/88345] -Os overrides -falign-functions=N on the command line

2023-12-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 --- Comment #20 from Jan Hubicka --- > > Live patching (user-space) doesn't depend on any particular alignment of > functions, on x86-64 at least. (The plan for other architectures wouldn't > need > any specific alignment either). Note that t

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-11-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 --- Comment #32 from Jan Hubicka --- > /tmp/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/stl_algobase.h:437: > warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' > writing between 2 and 9223372036854775806 bytes into

[Bug middle-end/112653] PTA should handle correctly escape information of values returned by a function

2023-11-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653 --- Comment #15 from Jan Hubicka --- Thanks a lot for working on this! I think it is quite an important part of the puzzle of making libstdc++ vector work reasonably well.

[Bug tree-optimization/112706] missed simplification in FRE

2023-11-24 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706 --- Comment #3 from Jan Hubicka --- Thanks, new pattern looks like a noticeable improvement :) Base+offset is effective for alias analysis and I suppose it happens often enough for compares as well. > _76 = _71 + 4; > # .MEM_154 = VDEF <.

[Bug tree-optimization/112678] [14 regression] Massive slowdown of compilation time with PGO

2023-11-23 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112678 --- Comment #2 from Jan Hubicka --- Seems we changed default to locking increments. jh@ryzen4:/tmp> cat t.C void test() { } jh@ryzen4:/tmp> ~/trunk-install/bin/g++ -O2 -fprofile-generate t.C -S ; grep lock t.s lock addl $1, __gcov

[Bug middle-end/112653] We should optimize memmove to memcpy using alias oracle

2023-11-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653 --- Comment #5 from Jan Hubicka --- > but the issue is that test2 escapes which makes this conflict: It is passed to memmove, which is noescape, and returned. Why does local PTA consider returned values to escape?

[Bug tree-optimization/111498] 951% profile quality regression between g:93996cfb308ffc63 (2023-09-18 03:40) and g:95d2ce05fb32e663 (2023-09-19 03:22)

2023-09-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111498 --- Comment #2 from Jan Hubicka --- > That just might cause a tad more early threading. That is, expose latent > profile updating issues elsewhere. Looking at the graph we're also still very > good compared to July. Early threading should not

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157 --- Comment #8 from Jan Hubicka --- > This is what I wanted to ask about. Looking at the dumps, ipa-modref > knows it is "killed." Is that enough or does it need to be also not > read to be known to be useless? The killed info means that the d

[Bug tree-optimization/110628] [14 regression] gcc.dg/tree-ssa/update-threading.c fails after r14-2383-g768f00e3e84123

2023-08-24 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628 --- Comment #8 from Jan Hubicka --- patch posted https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628231.html

[Bug ipa/111088] useless 'xor eax,eax' inserted when a value is not returned and icf

2023-08-21 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111088 --- Comment #3 from Jan Hubicka --- > But adds a return with a value. And then the inliner inlines foo into foo2 but > we still have the return with a value around ... I guess ICF can special case unused return value, but why this is not taken c

[Bug tree-optimization/110628] [14 regression] gcc.dg/tree-ssa/update-threading.c fails after r14-2383-g768f00e3e84123

2023-08-17 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628 --- Comment #6 from Jan Hubicka --- The mismatch happens on: void foo (unsigned int x) { if (x != 0x800 && x != 0x810) abort (); } It is a bug in reassoc turning: void foo (unsigned int x) { ;; basic block 2, loop depth 0, count 107374

[Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-07-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293 --- Comment #19 from Jan Hubicka --- > This heuristic wants to catch > > > if (foo) abort (); > > > and avoid sinking "too far" across a path with "similar enough" > execution count (I think the original motivation was to fix some > sp

[Bug middle-end/110832] 14% capacita -O2 regression between g:9fdbd7d6fa5e0a76 (2023-07-26 01:45) and g:ca912a39cccdd990 (2023-07-27 03:44) on zen3 and core

2023-07-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832 --- Comment #2 from Jan Hubicka --- I tested that the profile change makes no difference.

[Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c9bc5

2023-07-21 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758 --- Comment #2 from Jan Hubicka --- > I suspect this is most likely the profile updates changes ... Quite possibly. The goal of this exercise is to figure out if there are some bugs in profile estimate or whether passes somehow prefer broken p

[Bug tree-optimization/110628] [14 regression] gcc.dg/tree-ssa/update-threading.c fails after r14-2383-g768f00e3e84123

2023-07-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628 --- Comment #3 from Jan Hubicka --- > -fdump-tree-all-blocks-details produced more than 100 dump files. Which > one(s) > do you want? Can you just zip them and attach all? Thank you! Honza

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-07-11 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #23 from Jan Hubicka --- But it would be nice to see why the functions are not early inlined.

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-07-11 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #22 from Jan Hubicka --- I will cook up the patch to keep multiple variants of nodes pre-inline and we will see how much that affects compile time & how hard it will be to get unit size estimates right.

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #16 from Jan Hubicka --- > > We already have plenty of GF_CALL_ flags, so adding one should be easy? > > We have 3 bits left :/ I was hoping that cgraph_edge lives long > enough? But I suppose we're not keeping them across the ear

[Bug tree-optimization/109689] [14 Regression] ICE at -O1 with "-ftree-vectorize": in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:645 since r14-301-gf2d6beb7a4ddf1

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109689 --- Comment #10 from Jan Hubicka --- > > So perhaps simply: > > rewrite_into_loop_closed_ssa (NULL, 0); > > in case we unlooped in loop closed ssa form (which is not that common). > > Would that be acceptable? > > Yes, we do that in other pla

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #14 from Jan Hubicka --- > > why disallow caller->indirect_calls? See testcase in comment #9 > > > + return false; > > + for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee) > > I don't think this flies - it

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-26 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #11 from Jan Hubicka --- Hi, what about this. It should make at least quite basic inlining to happen to always_inline. I do not think many critical always_inlines have indirect calls in them. The test for lto is quite bad and I can

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-23 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #9 from Jan Hubicka --- Just so it is somewhere, here is a testcase that we can't inline leaf functions to always_inlines unless we do some tracking of what calls were formerly indirect calls. We really overloaded always_inline from

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-23 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #8 from Jan Hubicka --- > > I was playing with the idea of warning when at lto time when comdats have > > different command line options, but this triggers way too often in practice. > > Really? :/ Yep, for example firefox consist o

[Bug libstdc++/110287] _M_check_len is expensive

2023-06-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 --- Comment #7 from Jan Hubicka --- > > There is no guarantee that std::vector::max_size() is PTRDIFF_MAX. It > depends on the Allocator type, A. A user-defined allocator could have > max_size() == 100. If inliner we see path to the throw func

[Bug libstdc++/110287] _M_check_len is expensive

2023-06-18 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 --- Comment #5 from Jan Hubicka --- > Do you mean something like this? I sent my own version, but yours looks nicer. > > diff --git a/libstdc++-v3/include/bits/stl_vector.h > b/libstdc++-v3/include/bits/stl_vector.h > index 70ced3d101f..a4dbfeb

[Bug target/109812] GraphicsMagick resize is a lot slower in GCC 13.1 vs Clang 16 on Intel Raptor Lake

2023-05-31 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812 --- Comment #12 from Jan Hubicka --- > /home/sdp/jun/btl0/install/bin/ld: /tmp/ccnX75zI.ltrans0.ltrans.o: in > function `main': > :(.text.startup+0x1): undefined reference to `GMCommand' I wonder if your plugin is configured correctly. Can you
