https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116068
--- Comment #5 from Jan Hubicka ---
> > ? All the other spots that execute some pass list in cgraphunit.cc wrap
> > that with bitmap_obstack_initialize/release.
>
> That looks correct to me.
Looks good to me too. Does double-initializin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115825
--- Comment #22 from Jan Hubicka ---
> /* If there is pure/const call in the function, then we can
> still optimize the unrolled loop body if it contains some
> other interesting code than the calls and code s
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117892
--- Comment #4 from Jan Hubicka ---
> Deleted dead store: # .MEM_5 = VDEF <.MEM_3(D)>
> That started in GCC 12.
That is weird. I would expect CFG verification run between passes to
catch this...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960
--- Comment #14 from Jan Hubicka ---
> Could we just add 'inline' to the functions that are 'constexpr' in later
> standards?
It would make sense to me - that would reduce differences between
codegens with different -std= options.
Also we may use
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94960
--- Comment #15 from Jan Hubicka ---
> Oh, sorry, that was linked earlier. But still, isn't the problem that "inline"
> is too strong?
Do we have some data on this? I plan to do some inliner benchmarking
over christmas like I do every year. Wit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827
--- Comment #16 from Jan Hubicka ---
> > Note that this is the same for non-parameter local variables
>
> Just want to emphasize this point: this property is in no way specific to
> parameters, it applies to any object created as const. If som
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827
--- Comment #13 from Jan Hubicka ---
> Yes, that object is defined const so can't be changed. But is this something
> we care about? Is it important to apply this optimization to noinline functions?
There are few things where this helps. Fir
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117964
--- Comment #4 from Jan Hubicka ---
> r5-1621-gfc56f9d2843266 last moved the pass earlier
Doing kind of a fake "PHI" basic block to factor out the edges is
possible and perhaps a good idea which I did not think of while working
on original CFG c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827
--- Comment #7 from Jan Hubicka ---
> > > What about escape bits? Is it OK to save the address to global memory
> > > and then check it in the destructor?
> >
> > Yes, but does that matter? After the function returns the pointer is invalid
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103827
--- Comment #4 from Jan Hubicka ---
> That would be undefined, because s is defined const and so doing const_cast
> and then modifying it is undefined behaviour. However, this would be fine:
Cool, then I will look into getting modref and PTA
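The rule quoted above can be sketched in a few lines of C++. This is only an illustration of the language rule (writing through `const_cast` is allowed when the underlying object was not defined const), not the testcase from the PR; the names `S`, `bump` and `demo_mutable_object` are made up for this sketch.

```cpp
struct S {
  int value;
};

// Well-defined only when `s` refers to an object that was NOT defined const:
// casting away the const of the reference and writing through it is allowed
// in that case.
void bump(const S &s) {
  const_cast<S &>(s).value += 1;
}

int demo_mutable_object() {
  S s{41};  // not defined const, so the write inside bump() is fine
  bump(s);
  return s.value;
}

// By contrast, the following would be undefined behaviour, because `k` is
// defined const and bump() modifies it (deliberately left as a comment):
//   const S k{41};
//   bump(k);
```

An optimizer may therefore assume that an object defined const keeps its value across a call, but it cannot assume the same for every object that is merely reached through a const reference.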
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87502
--- Comment #14 from Jan Hubicka ---
> > So I think all we can hope for is merging memcpy with the extra write of 0.
>
> That's not actually clear.
>
> It would be reasonable to assume that foo isn't likely to change the string
> and have the i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87502
--- Comment #12 from Jan Hubicka ---
> (In reply to Jakub Jelinek from comment #10)
> > __builtin_memcpy (&D.35539.D.25336._M_local_buf, "abc", 3);
> > MEM[(char_type &)&D.35539 + 11] = 0;
> > change to
> > __builtin_memcpy (&D.35539.D.2533
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117957
--- Comment #5 from Jan Hubicka ---
> I suspect the issue is very similar (or the same) as PR 115777 .
Yep, I think it is store-to-load forwarding. The stack is organized in
pairs that are likely written independently and loaded together.
Sadly I t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117935
--- Comment #5 from Jan Hubicka ---
Note that propagation of branch probabilities from callee to caller
works only by kind of accident. I originally made branch prediction to
be done after early inlining since it makes some patterns branch
pred
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117924
--- Comment #3 from Jan Hubicka ---
Actually the main problem is that copying of bitvectors is done by a loop
copying every bit individually. This loop stays until the loop optimizers run,
and by then we are quite late in optimization. I have a patch for that.
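To make the cost difference concrete, here is a minimal sketch contrasting a per-bit copy loop with a whole-word copy. It assumes "bitvectors" means a packed bit container in the style of `std::vector<bool>`; the `BitVec` type and both copy functions are hypothetical and are not the libstdc++ code or the patch mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical packed bit vector: 64 bits per storage word.
struct BitVec {
  std::vector<std::uint64_t> words;
  std::size_t nbits = 0;

  explicit BitVec(std::size_t n) : words((n + 63) / 64, 0), nbits(n) {}

  bool get(std::size_t i) const { return (words[i / 64] >> (i % 64)) & 1u; }

  void set(std::size_t i, bool v) {
    const std::uint64_t mask = std::uint64_t{1} << (i % 64);
    if (v)
      words[i / 64] |= mask;
    else
      words[i / 64] &= ~mask;
  }
};

// Slow shape: one shift/mask/store sequence per bit.
void copy_bitwise(const BitVec &src, BitVec &dst) {
  assert(src.nbits == dst.nbits);
  for (std::size_t i = 0; i < src.nbits; ++i)
    dst.set(i, src.get(i));
}

// Fast shape: copy whole storage words at once.
void copy_wordwise(const BitVec &src, BitVec &dst) {
  assert(src.nbits == dst.nbits);
  if (!src.words.empty())
    std::memcpy(dst.words.data(), src.words.data(),
                src.words.size() * sizeof(std::uint64_t));
}
```

Both functions produce the same result for equally sized vectors; the second does 64 times fewer iterations and is the form a compiler can easily turn into a plain memory copy.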
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875
--- Comment #9 from Jan Hubicka ---
> > But maybe I'm missing something?
>
> I guess the issue is that with
>
> # k_24 = PHI <1(13), k_29(16)>
>
> to easily see this we'd have to compute the range of
> (unsigned int) M_9(D) - 1 and the range
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117088
--- Comment #6 from Jan Hubicka ---
> void digits_2.isra (integer(kind=4) ISRA.6607)
> {
> integer(kind=4) ISRA.6607_927(D) = ISRA.6607;
> ...
> # RANGE [irange] integer(kind=4) [-2147483647, 8][10, +INF]
> _494 = ISRA.6607_927(D) + 1;
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117088
--- Comment #3 from Jan Hubicka ---
> digits_2.isra (1);
>
> so we at least know row is [1, +INF] since the add is signed.
>
> We might be able to use a SCEV-like range computation for recursive cases like
> this, then being able to compute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86590
--- Comment #38 from Jan Hubicka ---
> _M_create is at line 144 of basic_string.tcc
It is not visible to middle-end though. If you check gimple dump, there
are calls
jan@padlo:/tmp> grep _M_create a-tt.C.*gimple
_4 = std::__cxx11::basic_st
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117793
--- Comment #2 from Jan Hubicka ---
> This is aggregate copy prop. What we could do is replace the last copy
> by
>
> __builtin_memcpy (_108, "this text is longer than 15 characters", 38);
>
> but this might be a pessimization in case none of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117764
--- Comment #6 from Jan Hubicka ---
> But the inlining argument basically says CDDCE shouldn't handle
> __builtin_unreachable control stmts optimistically given a use could appear
> only after inlining ... doesn't this then imply WONTFIX?
I am
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117764
--- Comment #3 from Jan Hubicka ---
> I don't think IPA-SRA does that. Is this something that is happening in the
> testcase from the bug summary? Do I need to use some inlining parameters to
> reproduce it?
Problem is that at ipa analysis we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442
--- Comment #28 from Jan Hubicka ---
> vector::size() is called **very often** so needs to be as fast as possible.
> Does this still inline identically?
Last year I made patch for inliner to ignore conditions guarding
__builtin_unreachable. Ric
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442
--- Comment #25 from Jan Hubicka ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442
>
> --- Comment #24 from Jason Merrill ---
> (In reply to Jan Hubicka from comment #23)
> > So I guess we are missing somewhere __builtin_assert that th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116535
--- Comment #7 from Jan Hubicka ---
> void
> output_offload_tables (void)
> {
> ...
>
> /* In WHOPR mode during the WPA stage the joint offload tables need to be
> streamed to one partition only. That's why we free offload_funcs and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116410
--- Comment #11 from Jan Hubicka ---
> We plan to adopt -ffat-lto-objects ourselves soon for at least a subset of
> packages, so this was good timing. :)
Note that -ffat-lto-objects has various issues, especially with library
archives. The prob
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115679
--- Comment #2 from Jan Hubicka ---
> With -Og it's usually that the always-inline function is called indirectly -
> that's an unsupported case.
We can probably add CIF code for functions that were called indirectly
but are no more, so this is r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #18 from Jan Hubicka ---
> different issue from the one that is raised in the PR. (Unless we think that
> -O2 and -O3 should always have the same inlining heuristics henceforward, but
> that seems unlikely.)
Yes, I think point of -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #14 from Jan Hubicka ---
As for bit of history on this. I have introduced the split -O2 and -O3
limits in order to be able to enable -finline-small-functions at -O2
which we found to be really important for C++ codebases which no lo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #12 from Jan Hubicka ---
If this is without LTO, can you also try the LTO numbers?
Inliner behaves significantly differently with and without LTO, since LTO
introduces many (and often too many) inlining opportunities, which
sometimes ma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110137
--- Comment #15 from Jan Hubicka ---
> As pointed out in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110035#c13 ,
> gcc already assumes operator new's returned pointer cannot alias any existing
> pointer. So no change is needed there.
Seems yo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110137
--- Comment #13 from Jan Hubicka ---
> Is the option supposed to be only about the standard global scope operator
> new/delete (_Znam etc.) or also user operator new/delete class methods? If
> the
> former, then I agree it is a global property
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110137
--- Comment #9 from Jan Hubicka ---
Doing global flag has a problem since with LTO or using optimize
attribute user may mix code compiled with and without sane operator new.
When function with insane operator new gets inlined to a function wit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115277
--- Comment #3 from Jan Hubicka ---
> What about gcc 13?
GCC 13 also misoptimizes.
Honza
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109914
--- Comment #5 from Jan Hubicka ---
> (In reply to Jan Hubicka from comment #2)
> > The reason why gcc warns is that it is unable to prove that the function is
> > always finite.
>
> I don't see why finiteness matters. If a pure function return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96059
--- Comment #7 from Jan Hubicka ---
> Actually, let me drop the PR59859 blocker, as IIRC we've had reports of this
> downstream w/o graphite.
I think you edited the wrong PR here.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115097
--- Comment #7 from Jan Hubicka ---
> and then we inline them back, introducing the extra copy. Why do we use
> tail-calls here instead of aliases? Why do we lack cost modeling here?
Because the function is exported and we must keep addresses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109442
--- Comment #21 from Jan Hubicka ---
This patch attempts to add __builtin_operator_new/delete. So far they
are not optimized, which will need to be done by extra flag of BUILT_IN_
code. also the decl.cc code can be refactored to be less of cut&
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114959
--- Comment #4 from Jan Hubicka ---
>
> I think function types are somewhat special in that they do not denote
> objects in the classical sense. They are also most complex and probably
> target-dependent to handle.
>
> Note there's LTO where
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774
--- Comment #5 from Jan Hubicka ---
> > Looking into it, instead of having simple outer loop it needs to
> > maintain worklist of defs to proceed each annotated with live bitmap,
> > right?
>
> Yeah, I have some patch on some branch somewhere ..
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774
--- Comment #3 from Jan Hubicka ---
> Yes, DSE walking doesn't "branch" but goes to some length handling some
> trivial
> branches only. Mainly to avoid compile-time issues. It needs larger
> re-structuring to fix that, but in principle it sh
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114703
--- Comment #3 from Jan Hubicka ---
> Yep, 'new' memory escapes.
Yep, this is blocking a lot of propagation in common C++ code.
Here it may help to do speculative devirtualization during IPA stage
that will let the late optimization to get rid o
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #76 from Jan Hubicka ---
There is still problem with loop bounds. I am testing patch on that and
then we should be (finally) finally safe.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115
--- Comment #15 from Jan Hubicka ---
> Fixed for GCC 14 so far
It is a simple patch, so backporting is OK after a week in mainline.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #70 from Jan Hubicka ---
Hello,
over easter I did some analysis of the cases where ICF is now disabled
due to jump function miscompare. Most common case (seen also on GCC) is
the situation where function is originally static inline
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303
--- Comment #14 from Jan Hubicka ---
> This patch fixes the ICE for me.
> Seems we already did something like that in other spots (e.g. in apply_scale).
In general if the overflow happens, some pass must have misbehaved and
do something crazy w
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #64 from Jan Hubicka ---
> Are you going to apply this patch, even if it just helps partially with some
> tests and not others?
I think we should fix this completely, since it is source of very
surprising bugs. I discussed it with Ma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #57 from Jan Hubicka ---
> So, we can punt on differences there (that is desirable for backporting and
> maybe GCC 14 too), or we could at that point populate an int vector, which
> maps
Yep, that is what I do.
I had bug in that so
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114317
--- Comment #2 from Jan Hubicka ---
> (it would need to elide the stores of course).
We do have way to elide stores, since we can optimize out write-only
values. What we do not have readily available is the value written to
a reference (ipa-r
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262
--- Comment #6 from Jan Hubicka ---
> Note GCC has not retuned its -Os heuristics for a long time because it has been
> decent enough for most folks and corner cases like this almost never come
> up.
There were quite few changes to -Os heurist
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114241
--- Comment #2 from Jan Hubicka ---
This indeed looks like a bug caused by the fact that the class is keyed into
one of the two units.
Outputting translation unit names is unfortunately hard, since they are
object files and often coming from .a arch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232
--- Comment #26 from Jan Hubicka ---
> I think optimize_function_for_size_p (cfun) isn't always true if
> optimize_size is since it looks at the function-specific setting
> of that flag, so you'd have to use opt_for_fn (cfun, optimize_size).
Wh
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232
--- Comment #21 from Jan Hubicka ---
Looking at the prototype patch, why do we also need to change the splitters?
My original goal was to use splitters to expand to faster code sequences
while having patterns necessary for both variants. This makes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232
--- Comment #18 from Jan Hubicka ---
optimize_function_for_size_p is not really affected by LTO or non-LTO.
It does take into account node->count and node->frequency, which is
updated during IPA, so it may change between early opts and late opt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114052
--- Comment #7 from Jan Hubicka ---
> I see it doesn't do anything if mark_dfs_back_edges returns false, so it
> will claim the function is finite even when it calls a non-finite function?
> So I assume this is local analysis only and call edges
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111960
--- Comment #13 from Jan Hubicka ---
> Should be fixed now.
Thanks! I was testing with stage3 compiler, so that is the reason.
Indeed dropping the buffer is a good idea.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #45 from Jan Hubicka ---
> > "Once legacy evrp is removed, this won't be an issue, as ranges in the IL
> > will tell the truth. However, this will mean that we will no longer
> > remove the first __builtin_unreachable combo. But
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #43 from Jan Hubicka ---
> // See discussion here:
> // https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571709.html
Discussion says:
"Once legacy evrp is removed, this won't be an issue, as ranges in the IL
will tell the truth.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #37 from Jan Hubicka ---
> Also remember we like to have a fix that's easily backportable, and
> that's probably going to be resetting the info. We can do something
> more fancy for GCC 15
Rejecting to merge function with different
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907
--- Comment #36 from Jan Hubicka ---
> > Having a testcase is great. I was just playing with crafting one.
> > I am still concerned about value ranges in ipa-prop's jump functions.
>
> Maybe my imagination is too limited, but if the ipa-prop's
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787
--- Comment #19 from Jan Hubicka ---
> Note I didn't check if it helps the testcase ..
I will check.
>
> > >
> > > A "nicer" solution might be to add a informational operand
> > > to TARGET_MEM_REF, representing the base pointer to be used fo
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787
--- Comment #17 from Jan Hubicka ---
> > I guess PTA gets around by tracking points-to set also for non-pointer
> > types and consequently it also gives up on any such addition.
>
> It does. But note it does _not_ for POINTER_PLUS where it tre
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787
--- Comment #15 from Jan Hubicka ---
>
> IVOPTs does the above but it does it (or should) as
>
> offset = (uintptr)&base2 - (uintptr)&base1;
> val = *((T *)((uintptr)base1 + i + offset))
>
> which is OK for points-to as no POINTER_PLUS_EX
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646
--- Comment #4 from Jan Hubicka ---
>
> With -fprofile-partial-training the znver4 LTO vs LTOPGO regression (on a
> newer
> master) goes down from 66% to 54%.
>
> So far I did not find a way to easily train with the reference run (when I ad
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665
--- Comment #8 from Jan Hubicka ---
> Honza - ICF seems to fixup points-to sets when merging variables, so there
> should be a way to kill off flow-sensitive info inside prevailing bodies
> as well. But would that happen before inlining the bod
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646
--- Comment #2 from Jan Hubicka ---
> Did you try with -fprofile-partial-training (is that default on? it probably
> should ...). Can you please try training with the rate data instead of train
It is not on by default - the problem of partial
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113478
--- Comment #4 from Jan Hubicka ---
> Possibly, at least when we know it doesn't expand to a libatomic call? OTOH
> even then a function just wrapping such call should probably be inlined,
> so the question is whether the problem that
> is esti
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113478
--- Comment #2 from Jan Hubicka ---
Probably is_inexpensive_builtin should return true here?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109753
--- Comment #14 from Jan Hubicka ---
> I think the issue might be that whoever is creating
> __static_initialization_and_destruction_0 fails to honor the active
> target pragma. Which means back to my suggestion to have multiple ones
> when dif
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852
--- Comment #14 from Jan Hubicka ---
> I thought the goal was to handle what is in predict-18.c, i.e.
> b * __builtin_expect (c, 0)
> or similar. If it is about
> __builtin_expect_with_probability (b, 42, 0.25) *
> __builtin_expect_with_probabi
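For readers unfamiliar with the two builtins in the quoted text, the following sketch shows the expression shapes being discussed. It assumes a GCC-compatible compiler; the wrapper names `scaled` and `scaled_with_probability` and the probability `0.9` are invented for illustration. Both builtins return their first argument unchanged and only attach a prediction hint.

```cpp
// b * __builtin_expect (c, 0): the hint says `c` is expected to be 0,
// but the multiplication still uses the real value of `c`.
long scaled(long b, long c) {
  return b * __builtin_expect(c, 0);
}

// Two hinted operands in one expression. The third argument is a
// compile-time constant probability in [0.0, 1.0] that the first argument
// equals the second one.
long scaled_with_probability(long b, long c) {
  return __builtin_expect_with_probability(b, 42, 0.25) *
         __builtin_expect_with_probability(c, 0, 0.9);
}
```

When both operands of one expression carry a hint, the predictor has to decide how to combine two predictions, which is the situation the comments in this PR are about.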
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852
--- Comment #11 from Jan Hubicka ---
> > + int p1 = get_predictor_value (*predictor, *probability);
> > + int p2 = get_predictor_value (predictor2, probability2);
> > + /* If both predictors agrees, it does not matter fro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852
--- Comment #9 from Jan Hubicka ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852
>
> --- Comment #7 from Jakub Jelinek ---
> So, what about following patch (which also fixes the ICE, would of course need
> to add the testcase) and doe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852
--- Comment #6 from Jan Hubicka ---
> which fixes the ICE by preferring PRED_BUILTIN_EXPECT* over others.
> At least in this case when one operand is a constant and another one is
> __builtin_expect* result that seems like the right choice to me
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233
--- Comment #3 from Jan Hubicka ---
> Confirm. But option save/restore has been always implemented:
>
> .section.gnu.lto_.opts,"",@progbits
> .ascii "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection"
> .ascii "=none'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345
--- Comment #20 from Jan Hubicka ---
>
> Live patching (user-space) doesn't depend on any particular alignment of
> functions, on x86-64 at least. (The plan for other architectures wouldn't
> need
> any specific alignment either). Note that t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849
--- Comment #32 from Jan Hubicka ---
> /tmp/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/stl_algobase.h:437:
> warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)'
> writing between 2 and 9223372036854775806 bytes into
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653
--- Comment #15 from Jan Hubicka ---
Thanks a lot for working on this! I think it is quite an important part of
the puzzle of making libstdc++ vector work reasonably well.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706
--- Comment #3 from Jan Hubicka ---
Thanks, new pattern looks like noticeable improvement :)
Base+offset is effective for alias analysis and I suppose it happens
reasonably enough for compares as well.
> _76 = _71 + 4;
> # .MEM_154 = VDEF <.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112678
--- Comment #2 from Jan Hubicka ---
Seems we changed default to locking increments.
jh@ryzen4:/tmp> cat t.C
void
test()
{
}
jh@ryzen4:/tmp> ~/trunk-install/bin/g++ -O2 -fprofile-generate t.C -S ; grep lock t.s
lock addl $1, __gcov
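As a rough illustration of what a "locking increment" means at the source level, the sketch below contrasts a plain counter update with an atomic one. It is not the code GCC emits for profile counters; the counter names are made up. On x86-64 the atomic version typically compiles to a `lock`-prefixed add, while the plain one is an ordinary add. GCC's `-fprofile-update=` option (`single`, `atomic`, `prefer-atomic`) selects between the two styles for instrumentation counters.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counters standing in for profile counters.
std::int64_t plain_counter = 0;
std::atomic<std::int64_t> atomic_counter{0};

// Non-atomic update: cheap, but concurrent increments can be lost.
void bump_plain() {
  ++plain_counter;
}

// Atomic update: safe with threads, at the cost of a locked
// read-modify-write instruction on every increment.
void bump_atomic() {
  atomic_counter.fetch_add(1, std::memory_order_relaxed);
}
```

The trade-off is the usual one: atomic counters stay exact in multithreaded training runs but make the instrumented binary slower.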
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653
--- Comment #5 from Jan Hubicka ---
> but the issue is that test2 escapes which makes this conflict:
It is passed to memmove, which is noescape, and returned. Why does local PTA
consider returned values to escape?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111498
--- Comment #2 from Jan Hubicka ---
> That just might cause a tid more early threading. That is, expose latent
> profile updating issues elsewhere. Looking at the graph we're also still very
> good compared to July.
Early threading should not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57
--- Comment #8 from Jan Hubicka ---
> This is what I wanted to ask about. Looking at the dumps, ipa-modref
> knows it is "killed." Is that enough or does it need to be also not
> read to be know to be useless?
The killed info means that the d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628
--- Comment #8 from Jan Hubicka ---
patch posted
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628231.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111088
--- Comment #3 from Jan Hubicka ---
> But adds a return with a value. And then the inliner inlines foo into foo2 but
> we still have the return with a value around ...
I guess ICF can special case unused return value, but why this is not
taken c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628
--- Comment #6 from Jan Hubicka ---
The mismatch happens on:
void foo (unsigned int x)
{
if (x != 0x800 && x != 0x810)
abort ();
}
It is bug in reassoc turning:
void foo (unsigned int x)
{
;; basic block 2, loop depth 0, count 107374
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293
--- Comment #19 from Jan Hubicka ---
> This heuristic wants to catch
>
>
> if (foo) abort ();
>
>
> and avoid sinking "too far" across a path with "similar enough"
> execution count (I think the original motivation was to fix some
> sp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832
--- Comment #2 from Jan Hubicka ---
I tested that the profile change makes no difference.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758
--- Comment #2 from Jan Hubicka ---
> I suspect this is most likely the profile updates changes ...
Quite possibly. The goal of this exercise is to figure out if there are
some bugs in profile estimate or whether passes somehow prefer broken
p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628
--- Comment #3 from Jan Hubicka ---
> -fdump-tree-all-blocks-details produced more than 100 dump files. Which
> one(s) do you want?
Can you just zip them and attach them all?
Thank you!
Honza
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
--- Comment #23 from Jan Hubicka ---
But it would be nice to see why the functions are not early inlined.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
--- Comment #22 from Jan Hubicka ---
I will cook up the patch to keep multiple variants of nodes pre-inline
and we will see how much that affects compile time & how hard it will be
to get unit size estimates right.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
--- Comment #16 from Jan Hubicka ---
> > We already have plenty of GF_CALL_ flags, so adding one should be easy?
>
> We have 3 bits left :/ I was hoping that cgraph_edge lives long
> enough? But I suppose we're not keeping them across the ear
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109689
--- Comment #10 from Jan Hubicka ---
> > So perhaps simply:
> > rewrite_into_loop_closed_ssa (NULL, 0);
> > in case we unlooped in loop closed ssa form (which is not that common).
> > Would that be acceptable?
>
> Yes, we do that in other pla
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
--- Comment #14 from Jan Hubicka ---
>
> why disallow caller->indirect_calls?
See testcase in comment #9
>
> > + return false;
> > + for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee)
>
> I don't think this flys - it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
--- Comment #11 from Jan Hubicka ---
Hi,
what about this. It should make at least quite basic inlining to happen
to always_inline. I do not think many critical always_inlines have
indirect calls in them. The test for lto is quite bad and I can
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
--- Comment #9 from Jan Hubicka ---
Just so it is recorded somewhere, here is a testcase showing that we can't inline leaf
functions to always_inlines unless we do some tracking of what calls
were formerly indirect calls.
We really overloaded always_inline from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334
--- Comment #8 from Jan Hubicka ---
> > I was playing with the idea of warning when at lto time when comdats have
> > different command line options, but this triggers way too often in practice.
>
> Really? :/
Yep, for example firefox consist o
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287
--- Comment #7 from Jan Hubicka ---
>
> There is no guarantee that std::vector::max_size() is PTRDIFF_MAX. It
> depends on the Allocator type, A. A user-defined allocator could have
> max_size() == 100.
If inliner we see path to the throw func
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287
--- Comment #5 from Jan Hubicka ---
> Do you mean something like this?
I sent my own version, but yours looks nicer.
>
> diff --git a/libstdc++-v3/include/bits/stl_vector.h
> b/libstdc++-v3/include/bits/stl_vector.h
> index 70ced3d101f..a4dbfeb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812
--- Comment #12 from Jan Hubicka ---
> /home/sdp/jun/btl0/install/bin/ld: /tmp/ccnX75zI.ltrans0.ltrans.o: in
> function `main':
> :(.text.startup+0x1): undefined reference to `GMCommand'
I wonder if your plugin is configured correctly. Can you