> Thanks both!
For modules the Makefile needs to be adjusted to run final LTO before
modpost etc. These were the respective hunks from the old patchkit
(may need some tweaks)
@@ -154,7 +154,7 @@ is-single-obj-m = $(and $(part-of-module),$(filter $@,
$(obj-m)),y)
# When a module consists of a si
Sam James writes:
> Michal Jires writes:
>
>> I did handle node->iterate_referring, but forgot cnode->callers.
>>
>> The only change is the contents of the newly separated
>> mark_symbol_referenced_from_asm
>
> Thanks, I'll try the new patch now.
>
> With the workaround I mentioned earlier, I managed t
Jan Hubicka writes:
> With -O2 we automatically enable several loop optimizations with
> -fprofile-use.
> The rationale is that those optimizations are enabled only at -O3, mainly since they may
> hurt performance or not pay back in code size when used blindly on all loops.
> Profile feedback gives us data o
> There are 27 unique toplevel assembly statements in the following files.
> That is when building only vmlinux with default settings.
> There are probably a few more.
Try allyesconfig. The default config is quite small.
-Andi
On Fri, Aug 29, 2025 at 09:01:06AM -0700, Andi Kleen wrote:
> From: Andi Kleen
>
> This makes them not fail during test suite runs with overridden arch or
> tunings.
Committed as obvious now.
-Andi
liuhongt writes:
> 1) Fix predicate of operands[3] in cond_ since only
> const_vec_dup_operand is accepted for masked operations, and pass real
> count to ix86_vgf2p8affine_shift_matrix.
>
> 2) Pass operands[2] instead of operands[1] to
> gen_vgf2p8affineqb__mask which expected the operand to shi
> > Can you point to that discussion?
>
> I'm not aware of a rejection of the new form in GCC 15, but in previous
> discussions, their responses were:
> * https://lore.kernel.org/all/87a64qo4th.ffs@tglx/
> *
> https://lore.kernel.org/all/y3jj67tz9ta2a...@hirez.programming.kicks-ass.net/
> *
> ht
From: Andi Kleen
This makes them not fail during test suite runs with overridden arch or
tunings.
gcc/testsuite/ChangeLog:
* gcc.target/i386/shift-gf2p8affine-1.c: Use -march=x86-64
-mtune=generic.
* gcc.target/i386/shift-gf2p8affine-2.c: Ditto.
* gcc.target
On Fri, Aug 29, 2025 at 05:19:18AM -0700, H.J. Lu wrote:
> On Thu, Aug 28, 2025 at 10:22 PM Andi Kleen wrote:
> >
> >
> > This patch should fix it. Please confirm.
> >
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/shift-gf2p8affine-1.c
> > b/gcc/
This patch should fix it. Please confirm.
diff --git a/gcc/testsuite/gcc.target/i386/shift-gf2p8affine-1.c
b/gcc/testsuite/gcc.target/i386/shift-gf2p8affine-1.c
index e5be3a35538..cb576eb4498 100644
--- a/gcc/testsuite/gcc.target/i386/shift-gf2p8affine-1.c
+++ b/gcc/testsuite/gcc.target/i386/s
Jakub Jelinek writes:
> On Wed, Aug 27, 2025 at 03:52:11PM +0200, Michal Jires wrote:
>> This new pass heuristically detects symbols referenced by toplevel
>> assembly to prevent their optimization.
>>
>> Heuristics is done by comparing identifiers in assembly to known
>> symbols.
>>
>> The pas
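For readers skimming the thread, the identifier-matching heuristic described above can be pictured roughly as follows. This is only a sketch under assumptions, not the code from the pass: `asm_mentions_symbol` and `ident_char` are hypothetical names, and the real pass may tokenize assembly differently (for example, it may treat `.` or `$` as identifier characters).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Characters treated as part of an identifier (an assumption). */
static bool
ident_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Return true if SYMBOL occurs in ASM_TEXT as a whole identifier,
   i.e. not as a substring of a longer name. */
static bool
asm_mentions_symbol (const char *asm_text, const char *symbol)
{
  size_t len = strlen (symbol);
  if (len == 0)
    return false;
  for (const char *p = strstr (asm_text, symbol); p;
       p = strstr (p + 1, symbol))
    {
      bool starts = p == asm_text || !ident_char (p[-1]);
      bool ends = !ident_char (p[len]);
      if (starts && ends)
        return true;
    }
  return false;
}
```

A symbol that matches this way would be kept conservatively, since the match is only a heuristic and can produce false positives.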
> >
> > with GCC configured with
> >
> > ../../gcc/configure
> > --prefix=/export/users3/haochenj/src/gcc-bisect/master/master/r16-3364/usr
> > --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
> > --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet
> > --without-isl
On Wed, Aug 27, 2025 at 02:11:44AM +, Jiang, Haochen wrote:
> On Linux/x86_64,
>
> 001cd39749f94ece8276b63f91eb864babb81a5d is the first bad commit
> commit 001cd39749f94ece8276b63f91eb864babb81a5d
> Author: Andi Kleen
> Date: Sun Aug 3 17:35:39 2025 -0700
>
>
Michal Jires writes:
> These patches allow us to handle toplevel assembly referencing symbols.
> Previous linux kernel patches needed to mark any such referenced symbols
> manually. Currently needed linux patches are here:
> https://gitlab.com/mixal_iirec/linux_gcc_lto_patches
>
>
Thanks for all
From: Andi Kleen
Make the expand pattern for operand 1 match the final instruction.
PR 121658
gcc/ChangeLog:
* config/i386/sse.md ("3"): Use
register_operand for rotate patterns.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr121658.c: New test.
---
From: Andi Kleen
[v4 version: Exclude for >> 7. Add test cases for 256/128bit
and improve tests. Remove some AVX512F checks. Fix mode iterator.]
[v3 version: Remove unnecessary _mask pattern.
Add extra FAIL case. Remove unnecessary AVX512F check.
Fix changelog.]
[v2 version: Split
>
> I think for a 512-bit vector, vgf2p8affineqb is better than the
> original codegen, but for a 128/256-bit vector, shouldn't vpcmpgtb be
> better than vgf2p8affineqb?
Yes it's better, but I don't see it in the loop bodies for
any of my test cases, only in prologues/epilogues.
Okay probably t
From: Andi Kleen
[v3 version: Remove unnecessary _mask pattern.
Add extra FAIL case. Remove unnecessary AVX512F check.
Fix changelog.]
[v2 version: Split rotate patterns in V16QI and V32/64QI.
Add various AVX512F checks. Remove some unnecessary
masks. Add untested cond_ pattern (untested
> > + else if (TARGET_GFNI && TARGET_AVX512F && CONST_INT_P (operands[2]))
> I don't think we need AVX512F here, and let's exclude >>7 cases here,
> so better be.
> else if (TARGET_GFNI
> && CONST_INT_P (operands[2])
> /* It's just vpcmpgtb against 0. */
> && !
From: Andi Kleen
[v2 version: Split rotate patterns in V16QI and V32/64QI.
Add various AVX512F checks. Remove some unnecessary
masks. Add untested cond_ pattern (untested, couldn't trigger it)
Clean up some control flow. Use narrower modes.
Avoid need for weakening predicate check in expand
> > It might be reasonable to tweak the costs per CPU however, I haven't
> > done that.
> >
> > BTW for rotate the wins are much higher because there are no native
> > instructions for it.
> For ashl/lshr, the original implementation only takes 2
> instructions(vpsllw/vpsrlw + vpand), and for ashr
>
> The latter takes 5 cycles, the former takes 3 cycles.
It's pipelined however.
>
> Do you have any microbenchmark or real workloads to show your
> optimization is better?
Keep in mind it only uses one port vs two.
Yes I ran it on Arrow lake and saw wins on both Pcore and Ecore
according to
Andi Kleen writes:
I wanted to ping
https://gcc.gnu.org/pipermail/gcc-patches/2025-August/691624.html
> From: Andi Kleen
>
> The GFNI AVX gf2p8affineqb instruction can be used to implement
> vectorized byte shifts or rotates. This patch uses them to implement
> shift and rot
From: Andi Kleen
The GFNI AVX gf2p8affineqb instruction can be used to implement
vectorized byte shifts or rotates. This patch uses them to implement
shift and rotate patterns to allow the vectorizer to use them.
Previously AVX couldn't do rotates (except with XOP) and had to handle
8 bit s
"H.J. Lu" writes:
> Don't hoist non all 0s/1s vector set outside of the loop to avoid extra
> spills.
It seems this could be a loss if there are actually enough registers.
So you need to make it depend on the register pressure?
-Andi
Dimitar Dimitrov writes:
> A few tests started failing recently on pru-unknown-elf because it uses
> SJLJ implementation for exceptions:
> FAIL: g++.dg/ext/musttail3.C -std=c++11 (test for excess errors)
> .../gcc/gcc/testsuite/g++.dg/ext/musttail3.C:12:34: error: cannot
> tail-call: caller
On Fri, Jul 11, 2025 at 12:14:46PM +0200, Jan Hubicka wrote:
> Hello,
> currently autoprofiled bootstrap produces auto-profiles for cc1 and
> cc1plus binaries. Those are used to build respective frontend files.
> For backend cc1plus.fda is used. This does not work well with LTO
> bootstrap where
On Fri, Jun 27, 2025 at 08:11:29AM +0200, Uros Bizjak wrote:
> On Fri, Jun 27, 2025 at 7:27 AM Andi Kleen wrote:
> >
> > Uros Bizjak writes:
> >
> > > Introduce crc_revsi4 expanders to generate CRC32 instruction when
> > > using
> > > __
Uros Bizjak writes:
> Introduce crc_revsi4 expanders to generate CRC32 instruction when using
> __builtin_rev_crc32_data* builtins with 0x1EDC6F41 polynomial and -mcrc32.
>
> PR target/120719
>
> gcc/ChangeLog:
>
> * config/i386/i386.md (crc_revsi4): New expander.
>
> gcc/testsuite/Change
On 2025-06-06 12:42, Jan Hubicka wrote:
Hi,
also after fixing this issue my bootstrap fails with:
Permission error mapping pages.
Consider increasing /proc/sys/kernel/perf_event_mlock_kb,
or try again with a smaller value of -m/--mmap_pages.
(current value: 4294967295,0)
Permission error mappin
On Wed, May 14, 2025 at 02:46:15AM +, Kugan Vivekanandarajah wrote:
> Adding Eugene and Andi to CC as Sam suggested.
>
> > On 13 May 2025, at 12:57 am, Richard Sandiford
> wrote:
> >
> > External email: Use caution opening links or attachments
> >
> >
> > Kugan Vivekanandarajah writes:
> >>
On 2025-05-06 09:48, H.J. Lu wrote:
On Mon, May 5, 2025 at 9:56 PM Andi Kleen wrote:
On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote:
> > If the branch edge destination is a basic block with only a direct
> > sibcall, change the jcc target to the sibcall target, d
On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote:
> > If the branch edge destination is a basic block with only a direct
> > sibcall, change the jcc target to the sibcall target, decrement the
> > destination basic block entry label use count and redirect the edge
>
> If the branch edge destination is a basic block with only a direct
> sibcall, change the jcc target to the sibcall target, decrement the
> destination basic block entry label use count and redirect the edge
> to the exit basic block. Call delete_unreachable_blocks to delete
> the unreachable bas
This adds an automatic downloader for the latest test results from
the mailing list archive and supports diffing test_summary to it.
Useful if you don't want to run your own baseline.
contrib/ChangeLog:
* diffsummary.py: New file.
---
contrib/diffsummary.py | 104
On 2025-04-23 10:18, Richard Biener wrote:
On Tue, Apr 22, 2025 at 5:43 PM Andi Kleen wrote:
On 2025-04-22 13:22, Richard Biener wrote:
> On Sat, Apr 12, 2025 at 5:09 PM Andi Kleen wrote:
>>
>> From: Andi Kleen
>>
>> ... that uses -march=native -mtune=native to bu
On Wed, Jan 29, 2025 at 10:33:14AM +0100, Christoph Müllner wrote:
> The avoid-store-forwarding pass is disabled by default and therefore
> at risk of bit-rotting. This patch addresses this by enabling
> the pass at O2 or higher.
>
> The assembly patterns in `bitfield-bitint-abi-align16.c` an
On Tue, Apr 22, 2025 at 01:27:34PM +0200, Richard Biener wrote:
> I assume this passed bootstrap & regtest?
Yes it did
>
> This is OK for trunk after we've released GCC 15.1.
Thanks.
Andi
From: Andi Kleen
... that uses -march=native -mtune=native to build a compiler optimized
for the host.
config/ChangeLog:
* bootstrap-native.mk: New file.
gcc/ChangeLog:
* doc/install.texi: Document bootstrap-native.
---
config/bootstrap-native.mk | 1 +
gcc/doc/install.texi
This adds an automatic downloader for the latest test results from
the mailing list archive and supports diffing test_summary to it.
Useful if you don't want to run your own baseline.
contrib/ChangeLog:
* diffsummary.py: New file.
---
contrib/diffsummary.py | 104
Right now ggc has a single free list for multiple sizes. In some cases
the list can get mixed by orders and then the allocator may spend a lot
of time walking the free list to find the right sizes.
This patch splits the free list into multiple free lists by order
which allows O(1) access in most c
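The split described above can be illustrated with a small sketch. It is not the ggc-page.cc code; `struct page`, `NUM_ORDERS`, `release_page` and `take_page` are hypothetical names chosen for illustration, and the real allocator tracks far more state per page.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ORDERS 8            /* hypothetical number of size classes */

struct page {
  struct page *next;
  unsigned order;               /* size class of this page */
};

/* One free list per order instead of a single mixed list. */
static struct page *free_lists[NUM_ORDERS];

/* Return a page to the free list that matches its own order. */
static void
release_page (struct page *p)
{
  assert (p->order < NUM_ORDERS);
  p->next = free_lists[p->order];
  free_lists[p->order] = p;
}

/* Take a free page of the requested order, or NULL if none is cached.
   No walk over pages of other orders is needed. */
static struct page *
take_page (unsigned order)
{
  assert (order < NUM_ORDERS);
  struct page *p = free_lists[order];
  if (p)
    {
      free_lists[order] = p->next;
      p->next = NULL;
    }
  return p;
}
```

With a single mixed list, the lookup would have to skip every page of the wrong order first; here both operations touch only the head of one list.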
When -fprofile-generate is used musttail often fails because the
compiler adds instrumentation after the tail calls.
This patch prevents adding extra exit edges after musttail because for a
tail call the execution leaves the function and can never come back
even on an unwind or exception.
This is
From: Andi Kleen
This isn't a regression, but it's a very simple patch with high
performance improvement, so perhaps suitable in the current stage.
---
bitmap_set_bit checks the original value of the bit to return it to the
caller and then only writes the new value back if it cha
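The behavior described in that sentence (test the old bit, store only on a change, report the change to the caller) looks roughly like the sketch below. It uses a hypothetical flat array of words and a hypothetical name, `set_bit_checked`; GCC's bitmap is a more elaborate structure, and the excerpt does not show what the patch itself changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set BIT in WORDS and return true if it was previously clear. */
static bool
set_bit_checked (uint64_t *words, unsigned bit)
{
  uint64_t mask = (uint64_t) 1 << (bit % 64);
  uint64_t *word = &words[bit / 64];
  bool changed = (*word & mask) == 0;   /* inspect the original value */
  if (changed)
    *word |= mask;                      /* write back only on a change */
  return changed;
}
```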
> I'd like to ping the
> https://gcc.gnu.org/pipermail/gcc-patches/2025-March/679182.html
> patch.
> I know it is quite controversial and if clang wouldn't be the first
> to implement this I'd certainly not go that way; I am willing to change
> the warning option names or move the maybe one from -W
> You're right (although I don't remember which targets are
> non-external_musttail).
Several flavors of ARM and Power at least.
Jakub Jelinek writes:
> --- gcc/testsuite/g++.dg/opt/musttail2.C.jj 2025-03-24 13:27:44.329204196
> +0100
> +++ gcc/testsuite/g++.dg/opt/musttail2.C 2025-03-24 13:28:08.975867389
> +0100
> @@ -0,0 +1,14 @@
> +// PR ipa/119376
> +// { dg-do compile { target musttail } }
I think this need
> This can be rewritten as
>
> void foo(int v)
> {
> {
> int a;
> capture(&a);
> if (condition)
> goto tail_position;
> // do something with a
> }
> tail_position:
> tailcall(v);
> }
>
> or with 'do { ... if (...) break; ...} while (0)' when one prefers that to
> goto
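For comparison, the `do { ... } while (0)` form mentioned above could look like this. It is a sketch only: `capture`, `condition` and `tailcall` are stand-ins defined here so the fragment compiles, not functions from the thread's test cases.

```c
#include <assert.h>

/* Hypothetical stand-ins for the names used in the quoted example. */
static int *captured;
static int condition;
static int last_arg;

static void capture (int *p) { captured = p; }
static void tailcall (int v) { last_arg = v; }

void
foo (int v)
{
  do
    {
      int a;
      capture (&a);
      if (condition)
        break;
      a = v;                    /* do something with a */
    }
  while (0);
  /* `a` is out of scope here, so its address can no longer be live
     when the call is made. */
  tailcall (v);
}
```

Either form makes the lifetime of `a` end before the call through ordinary scoping, which is the point of the suggestion.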
On Tue, Mar 25, 2025 at 07:43:28PM +0300, Alexander Monakov wrote:
> Hello,
>
> FWIW I think Clang made a mistake in bending semantics in a way that is
> clearly
> misaligned with the general design of C and C++, where a language-native, so
> to
> speak, solution was available: introduce a scope
> 2025-03-25 Jakub Jelinek
> Andi Kleen
>
> PR gcov-profile/118442
> * profile.cc (branch_prob): Ignore EDGE_FAKE edges from musttail calls
> to EXIT.
>
> * c-c++-common/pr118442.c: New test.
>
> --- gcc/profile.cc.jj 2025-
From: Andi Kleen
When -fprofile-generate is used musttail often fails because the
compiler adds instrumentation after the tail calls.
This patch prevents adding extra exit edges after musttail because for a
tail call the execution leaves the function and can never come back
even on an unwind or
On Thu, Mar 20, 2025 at 06:25:26PM +0100, Jakub Jelinek wrote:
> On Thu, Mar 20, 2025 at 10:01:02AM -0700, Andi Kleen wrote:
> > So it could be as simple as that patch? It solves your test case at least
> > for x86.
>
> Not sure I like this, but if others (e.g. Richi, Josep
On Thu, Mar 20, 2025 at 05:28:48PM +0100, Jakub Jelinek wrote:
> On Thu, Mar 20, 2025 at 09:19:02AM -0700, Andi Kleen wrote:
> > The inlining was just one of the issues, there are some related to
> > different semantics of escaped locals. gcc always errors out while
> > LLVM
On Thu, Mar 20, 2025 at 11:45:33AM -0400, Jason Merrill wrote:
> On 3/19/25 9:31 PM, Andi Kleen wrote:
> > From: Andi Kleen
> >
> > There are multiple reports (see PR 119376) now where semantic differences
> > in the gcc musttail implementation break existing programs
From: Andi Kleen
There are multiple reports (see PR 119376) now where semantic differences
in the gcc musttail implementation break existing programs written for the clang
variant.
Even though that can be all hopefully fixed eventually,
for the gcc 15 release it seems safer to disable clang
From: Andi Kleen
When -fprofile-generate is used musttail often fails because the
compiler adds instrumentation after the tail calls.
This patch prevents adding extra exit edges after musttail because for a
tail call the execution leaves the function and can never come back
even on an unwind or
> This looks wrong to me. Even tail calls can be terminated with exit,
> perform longjmp, do other things for which stmt_can_terminate_bb_p
> should return true. stmt_can_terminate_bb_p is used in many places, not
> just in the predict instrumentation.
Okay so the check should be only used for s
Andi Kleen writes:
> diff --git a/gcc/input.cc b/gcc/input.cc
> index fabfbfb6eaa..d3b12037ba8 100644
> --- a/gcc/input.cc
> +++ b/gcc/input.cc
> @@ -1325,6 +1325,8 @@ dump_line_table_statistics (void)
>if (s.num_expanded_macros != 0)
> fprintf (stderr, "Av
"James K. Lowden" writes:
>> Having a minimal harness in GCCs testsuite is critical - I'd expect a
>> gcc/testsuite/gcobol.dg/dg.exp supporting execution tests. I assume
>> Cobol has a way to exit OK or fatally and this should be
>> distinguished as testsuite PASS or FAIL.
>
> Yes, a COBOL pro
From: Andi Kleen
The file-cache-lines / file-cache-files tunables were documented in the
wrong section. Fix that.
Reported-by: Filip Kastl
Committed as obvious.
gcc/ChangeLog:
* doc/invoke.texi:
---
gcc/doc/invoke.texi | 20 ++--
1 file changed, 10 insertions(+), 10
From: Andi Kleen
Document new params in invoke.texi.
The auto tuning description was on the wrong tunable, move to lines.
Committed as obvious.
gcc/ChangeLog:
* doc/invoke.texi: Document file cache tunables.
* params.opt: Move auto tuning description to lines.
---
gcc/doc
On Tue, Jan 28, 2025 at 09:50:41AM +0100, Richard Biener wrote:
> On Mon, Jan 27, 2025 at 9:59 PM David Malcolm wrote:
> >
> > On Sat, 2025-01-25 at 23:31 -0800, Andi Kleen wrote:
> > > From: Andi Kleen
> > >
> > > This is the hot function in input.cc
>
>
> If I'm reading this right, calls to get_next_line lead to insertions into
> the ring buffer whilst the buffer is empty or the last line in the ring
> buffer cache is m_line_num - 1.
>
> There are a few places where we update m_line_num, but this caching
> code doesn't seem to touch those places
On Sun, Feb 02, 2025 at 09:35:52PM -0800, Andi Kleen wrote:
> > Patch 7 is OK otherwise, and I'm taking a look at the rest of the
> > patches now; thanks.
>
> Any comments on the other patches?
Never mind. I see you already commented; somehow I missed that.
-Andi
> Patch 7 is OK otherwise, and I'm taking a look at the rest of the
> patches now; thanks.
Any comments on the other patches?
Thanks,
-Andi
From: Andi Kleen
While the input line cache size is now tunable it's better if the compiler
auto tunes it. Otherwise large files needing random file access will
still have to search many lines to find the right lines.
Add support for allocating one line anchor per hundred input lines.
This
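The "one anchor per hundred lines" sizing can be pictured with a little arithmetic. This is a sketch under assumptions, not the input.cc code: `LINES_PER_ANCHOR`, `anchors_needed` and `anchor_for_line` are hypothetical names, and the real cache may round or cap the count differently.

```c
#include <assert.h>
#include <stddef.h>

#define LINES_PER_ANCHOR 100    /* spacing taken from the description above */

/* Number of anchors needed to cover NUM_LINES lines. */
static size_t
anchors_needed (size_t num_lines)
{
  return (num_lines + LINES_PER_ANCHOR - 1) / LINES_PER_ANCHOR;
}

/* Index of the anchor to start scanning from for 1-based LINE. */
static size_t
anchor_for_line (size_t line)
{
  assert (line >= 1);
  return (line - 1) / LINES_PER_ANCHOR;
}
```

With this spacing a random access never has to skip more than 99 lines after jumping to its anchor, regardless of file size.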
This is a fix for slowness accessing random lines in the source file
for diagnostics.
In this version I added a unit test as requested by David, and also
added an x86 vectorization hint for the hot line search function (with the
early break work the vectorizer is powerful enough to handle it now)
If
From: Andi Kleen
gcc/ChangeLog:
* input.cc (check_line): New.
(test_replacement): New function to test line caching.
(input_cc_tests): Call test_replacement
---
gcc/input.cc | 46 ++
1 file changed, 46 insertions(+)
diff
From: Andi Kleen
This is the hot function in input.cc
The vectorizer can vectorize it now, but in a generic cpu O2 x86 build it isn't.
Add an automatic target clone to handle it for x86 and build
that function with O3.
The ifdef here is ugly, perhaps gcc should have a more convenient
"
From: Andi Kleen
The input machinery to read the source code independent of the lexer
has a range of hard coded maximum array sizes that can impact performance.
Make them tunable.
input.cc is part of libcommon so it cannot directly access params
without a level of indirection.
gcc/ChangeLog
From: Andi Kleen
The input context file_cache maintains an array of anchors
to speed up accessing lines before the previous line.
The array has a fixed upper size and the algorithm relies
on the linemap reporting the maximum number of lines in the file
in advance to compute the position of each
From: Andi Kleen
With the new cache maintenance algorithm we don't need the
maximum number of lines anymore. Remove all the code for that.
gcc/ChangeLog:
PR preprocessor/118168
* input.cc (total_lines_num): Remove.
(file_cache_slot::evict):
From: Andi Kleen
For larger files the file_cache line index will be spread out to make
the index fit into the fixed buffer, so any access to the non latest line
will need some skipping of lines.
Most accesses for line are near the latest line because
a diagnostic is likely near where the
From: Andi Kleen
Correct the description of inline assembler to say that gcc does
limited assembler parsing to estimate the length of inline assembler
statements, and document that certain assembler primitives can confuse
it.
gcc/ChangeLog:
* doc/extend.texi: Document assembler parsing
From: Andi Kleen
Committed as obvious.
gcc/ChangeLog:
* config/i386/x86-tune-sched-core.cc: Fix incorrect comment.
---
gcc/config/i386/x86-tune-sched-core.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/i386/x86-tune-sched-core.cc
b/gcc/config/i386
On Wed, Jan 15, 2025 at 10:41:11PM +0100, Jakub Jelinek wrote:
> Hi!
>
> When writing the gcc-15/changes.html patch posted earlier, I've been
> wondering where significant part of the Basic asm chapter went and the
> problem was the insertion of a new @node in the middle of the Basic Asm
> @node,
On Wed, Jan 08, 2025 at 07:47:27PM -0500, David Malcolm wrote:
> On Wed, 2025-01-08 at 07:48 -0800, Andi Kleen wrote:
> >
> > I wanted to ping this patch series. Thanks.
> >
> > -Andi
> >
>
> Thanks for the patches, and sorry about not getting back
I wanted to ping this patch series. Thanks.
-Andi
On Tue, Jan 07, 2025 at 08:36:29PM +0100, Jakub Jelinek wrote:
> Hi!
>
> The following patch fixes ICEs when the new inline asm syntax
> to use C++26 static_assert-like constant expressions in place
> of string literals is used in templates.
> As finish_asm_stmt doesn't do any checking for
> proce
Mark Wielaard writes:
> commit 56946c801a7c ("gimple: Add limit after which slower switchlower
> algs are used [PR117091] [PR117352]") introduced a limit on the number
> of cases of a switch. It also bails out on finding jump tables if the
> switch is too large. This introduces a compile time reg
From: Andi Kleen
While the input line cache size is now tunable it's better if the compiler
auto tunes it. Otherwise large files needing random file access will
still have to search many lines to find the right lines.
Add support for allocating one line anchor per hundred input lines.
This
From: Andi Kleen
With the new cache maintenance algorithm we don't need the
maximum number of lines anymore. Remove all the code for that.
gcc/ChangeLog:
PR preprocessor/118168
* input.cc (total_lines_num): Remove.
(file_cache_slot::evict):
From: Andi Kleen
The input context file_cache maintains an array of anchors
to speed up accessing lines before the previous line.
The array has a fixed upper size and the algorithm relies
on the linemap reporting the maximum number of lines in the file
in advance to compute the position of each
From: Andi Kleen
For larger files the file_cache line index will be spread out to make
the index fit into the fixed buffer, so any access to the non latest line
will need some skipping of lines.
Most accesses for line are near the latest line because
a diagnostic is likely near where the
This patch kit fixes scaling issues for the input cache,
especially for C, motivated by PR118168.
Overall it is practically neutral in number of lines:
gcc/input.cc | 261
--
gcc/inp
From: Andi Kleen
glibc ferror is surprisingly expensive. Move it out of the hot loop
of finding lines by setting a flag after the actual IO operations.
gcc/ChangeLog:
PR preprocessor/118168
* input.cc (file_cache_slot::m_error): New field.
(file_cache_slot::create
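The idea of replacing a per-iteration `ferror` call with a sticky flag can be sketched as follows. This is not the input.cc change itself: `struct reader`, `reader_fill` and `reader_ok` are hypothetical names, and only the `m_error` field name comes from the ChangeLog above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct reader {
  FILE *fp;
  bool error;     /* sticky flag, playing the role of m_error */
};

/* Read up to LEN bytes; record an I/O error once, right after the read. */
static size_t
reader_fill (struct reader *r, char *buf, size_t len)
{
  size_t n = fread (buf, 1, len, r->fp);
  if (n < len && ferror (r->fp))
    r->error = true;
  return n;
}

/* Hot-path check: a plain field load instead of a call into libc. */
static bool
reader_ok (const struct reader *r)
{
  return !r->error;
}
```

The line-search loop can then test the flag as often as it likes, while `ferror` runs at most once per actual read.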
From: Andi Kleen
The input machinery to read the source code independent of the lexer
has a range of hard coded maximum array sizes that can impact performance.
Make them tunable.
input.cc is part of libcommon so it cannot directly access params
without a level of indirection.
gcc/ChangeLog
"James K. Lowden" writes:
> The following 8 patches constitute the 80 files needed to build and
> document the COBOL front end. They assume that following exist:
>
> gcc/cobol/ChangeLog
> libgcobol/ChangeLog
>
> The messages are grouped by files in a more or less logical order,
> but gro
> > What do you think, Andi and Richi? I myself slightly prefer keeping the DP
> > but
> > I would be fine with either option.
>
> I think we can keep both, though I have no strong opinion.
Keeping both is fine for me.
-Andi
> > But yeah, thinking about it some more, 1 seems like a lot. Maybe the
> > limit
> > could be 1000. That's also big enough. I could try to run the testcase
> > set to
> > 1000 on my not-so-powerful laptop this time and check that even on that
> > machine
> > it finishes "fast" (under a
On Tue, Nov 26, 2024 at 04:06:37PM -0800, Andrew Pinski wrote:
> On Thu, Oct 31, 2024 at 1:41 PM Andi Kleen wrote:
> >
> > From: Andi Kleen
> >
> > autofdo looks up inline stacks and tries to match them with the profile
> > data using their symbol name. Mak
On Fri, Nov 15, 2024 at 10:43:57AM +0100, Filip Kastl wrote:
> Hi,
>
> Andi's greedy bit test finding algorithm was reverted. I found a fix for the
> problem that caused the revert. I made this patch to reintroduce the greedy
> alg into GCC. However I think we should keep the old slow but more
On Tue, Jul 30, 2024 at 09:40:42AM -0700, Andi Kleen wrote:
> From: Andi Kleen
>
> ... that uses -march=native -mtune=native to build a compiler optimized
> for the host.
>
> config/ChangeLog:
>
> * bootstrap-native.mk: New file.
>
> gcc/ChangeLog:
On Fri, Nov 01, 2024 at 02:01:18PM -0400, John David Anglin wrote:
> This breaks build on hppa64-hp-hpux11.11. This target has clock_gettime
> but it doesn't have CLOCK_MONOTONIC. It has CLOCK_REALTIME. I modified
> timevar.cc as follows to restore build.
Alternative would be to check for CLOCK
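One way to express the fallback John describes (use CLOCK_REALTIME where CLOCK_MONOTONIC is not defined) is a preprocessor check. This is a sketch, not the timevar.cc change: `WALL_CLOCK_ID` and `wall_seconds` are hypothetical names, and a configure-time check would be an equally valid approach.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 199309L   /* expose clock_gettime under strict ISO modes */
#endif
#include <assert.h>
#include <time.h>

/* Prefer the monotonic clock; fall back to CLOCK_REALTIME on systems
   that do not define CLOCK_MONOTONIC. */
#ifdef CLOCK_MONOTONIC
# define WALL_CLOCK_ID CLOCK_MONOTONIC
#else
# define WALL_CLOCK_ID CLOCK_REALTIME
#endif

/* Current wall-clock reading in seconds, or 0.0 if the clock is unavailable. */
static double
wall_seconds (void)
{
  struct timespec ts;
  if (clock_gettime (WALL_CLOCK_ID, &ts) != 0)
    return 0.0;
  return (double) ts.tv_sec + (double) ts.tv_nsec * 1e-9;
}
```

Note that CLOCK_REALTIME can jump when the system time is adjusted, so intervals measured through the fallback are less reliable than on the monotonic clock.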
On Tue, Nov 05, 2024 at 09:47:17AM +0100, Richard Biener wrote:
> On Tue, Nov 5, 2024 at 2:02 AM Jason Merrill wrote:
> >
> > On 10/31/24 4:40 PM, Andi Kleen wrote:
> > > From: Andi Kleen
> > >
> > > autofdo looks up inline stacks and tries to match th
From: Andi Kleen
- Fix warnings with newer python versions about bad escapes by
making all the python strings raw.
- Add a fallback for using the builtin perf event list if the
CPU model number is unknown.
- Regenerate the shipped gcc-auto-profile with the changes.
contrib/ChangeLog
From: Andi Kleen
When autofdo bootstrap support was originally implemented there were
issues with the LTO bootstrap, that is why it wasn't enabled
for them. I retested this now and it works on x86_64-linux.
Fortran was also missing, not sure why. Also enabled now.
gcc/fortran/Chan
From: Andi Kleen
autofdo looks up inline stacks and tries to match them with the profile
data using their symbol name. Make sure all decls that can be in an inline stack
have a valid assembler name.
This fixes a bootstrap problem with autoprofiledbootstrap and LTO.
2024-10-30 Jason Merrill
> I'm getting a build failure:
>
> timevar.cc:163: undefined reference to `clock_gettime'
>
> Our frozen build tools are intended to produce binaries that work
> "everywhere", so they're a few years old, but apparently something didn't
> configure correctly.
>
> I see that libbacktrace configure
On Wed, Oct 23, 2024 at 02:56:51PM +0200, Richard Biener wrote:
> On Wed, Oct 9, 2024 at 6:18 PM Andi Kleen wrote:
> >
> > From: Andi Kleen
> >
> > Retrieving sys/user time in timevars is quite expensive because it
> > always needs a system call. Only getting
Qing Zhao writes:
> Control this with a new option -fdiagnostics-details.
It would be useful to be also able to print the inline call stack,
maybe with a separate option.
In some array bounds cases I looked at the problem was hidden in some inlines
and it wasn't trivial to figure it out.
I wro