Re: smtgcc end-of-year update

2025-01-02 Thread Andi Kleen via Gcc
Krister Walfridsson via Gcc writes: > But running smtgcc on the test suite is not the best use case for the > tool -- it only detects bugs where the test triggers an unrelated bug > compared to what the test is checking, which should be uncommon. I > therefore plan to start testing by compiling r

Re: How to debug/improve excessive compiler memory usage and compile times

2024-10-02 Thread Andi Kleen via Gcc
Ben Boeckel via Gcc writes: > On Tue, Oct 01, 2024 at 18:06:35 +0200, Richard Biener via Gcc wrote: >> Analyze where the compile time is spent and where memory is spent. >> Identify unfitting data structures and algorithms causing the issue. >> Replace with better ones. That’s what I do for thes

Re: LTO progress indicator

2024-09-15 Thread Andi Kleen via Gcc
"Ghorban M. Tavakoly via Gcc" writes: > > I need LTO. Is there a way to have LTO in GCC, without LTOing the GCC > itself? This way my builds will be many times faster. LTO can be used without LTOing gcc itself. It is normally built by default if the target supports it. -Andi

Re: What is the purpose of these two fixincludes?

2024-06-05 Thread Andi Kleen via Gcc
FX Coudert via Gcc writes: > Hi, > > I am trying to reduce the number of unneeded fixincludes that are used > on darwin (because fixincluded headers make it impossible to change > SDK once the compiler is built, which is common practice in the macOS > world, and quite useful). It's the same prob

Re: AutoFDO tools for GCC

2024-03-26 Thread Andi Kleen via Gcc
On Tue, Mar 26, 2024 at 08:45:22AM +0100, Richard Biener wrote: > On Mon, Mar 25, 2024 at 9:54 PM Eugene Rozenfeld via Gcc > wrote: > > > > Hello, > > > > I've been the AutoFDO maintainer for the last 1.5 years. I've resurrected > > autoprofiledbootstrap build and made a number of other fixes/imp

Re: New feature: -fdump-gimple-nodes (once more, with feeling)

2024-02-13 Thread Andi Kleen via Gcc
Robert Dubner writes: > There didn't seem to be any such functionality in GCC. I found a routine > in print-tree.cc which printed out a single node, but I needed to > understand the entire tree of nodes for a function. FWIW the standard way to do this is to run the compiler in gdb with the .gdb

Re: State of AutoFDO in GCC

2021-05-10 Thread Andi Kleen via Gcc
On Mon, May 10, 2021 at 04:55:50PM +0000, Joseph Myers wrote: > On Mon, 10 May 2021, Andi Kleen via Gcc wrote: > > > It's difficult to find now because it was a branch in the old SVN that > > wasn't > > converted. Sadly the great git conversion was quite

Re: State of AutoFDO in GCC

2021-05-10 Thread Andi Kleen via Gcc
On 5/9/2021 10:01 AM, Jan Hubicka wrote: With my tests, AutoFDO could achieve almost half of the effect of instrumentation FDO on real applications such as MySQL 8.0.20 . Likely this could be improved with some of the missing changes. Apparently discriminator support is worth quite a bit espec

Re: State of AutoFDO in GCC

2021-05-09 Thread Andi Kleen via Gcc
With my tests, AutoFDO could achieve almost half of the effect of instrumentation FDO on real applications such as MySQL 8.0.20 . Likely this could be improved with some of the missing changes. Apparently discriminator support is worth quite a bit especially on dense C++ code bases. Without

Re: RFC: attributes for marking security boundaries (system calls/ioctls, user vs kernel pointers etc)

2021-04-30 Thread Andi Kleen via Gcc
David Malcolm via Gcc writes: > I think I want a way for the user to be able to mark security > boundaries in their code: for example: > * in the Linux kernel the boundary between untrusted user-space data > and kernel data, or, > * for a user-space daemon, the boundary between data coming from t

Re: [EXTERNAL] Re: State of AutoFDO in GCC

2021-04-30 Thread Andi Kleen via Gcc
Eugene Rozenfeld via Gcc writes: > Is the format produced by create_gcov and expected by GCC under > -fauto-profile documented somewhere? How is it different from .gcda > used in FDO, e.g., as described here: > http://src.gnu-darwin.org/src/contrib/gcc/gcov-io.h.html? I believe it's very similar

Re: State of AutoFDO in GCC

2021-04-30 Thread Andi Kleen via Gcc
172060...@hdu.edu.cn writes: > Hi all, > > I'm using GCC 9.3 AutoFDO and the old version create_gcov on arm64 > and it works well. Actually it supports not only LBR like mode but > also inst_retired even cycles event, which is the early implementation > of AutoFDO[1]. There is no difference in o

Re: State of AutoFDO in GCC

2021-04-28 Thread Andi Kleen via Gcc
On Mon, Apr 26, 2021 at 06:40:56PM +0000, Hongtao Yu wrote: >Andi, thanks for pointing out the perf script issues. Can you please >elaborate a bit on the exact issue you have seen? We’ve been using >specific output of perf script such as mmap, LBR and callstack events >filtered by p

Re: State of AutoFDO in GCC

2021-04-26 Thread Andi Kleen via Gcc
On Mon, Apr 26, 2021 at 06:40:56PM +0000, Hongtao Yu wrote: >Andi, thanks for pointing out the perf script issues. Can you please >elaborate a bit on the exact issue you have seen? We’ve been using >specific output of perf script such as mmap, LBR and callstack events >filtered by p

Re: State of AutoFDO in GCC

2021-04-26 Thread Andi Kleen via Gcc
>There are multiple directional changes in this new tool: >1) it uses perf-script trace output (in text) as input profile data;  I suspect this will break regularly too (I personally did numerous changes to perf script output, and also wrote a lot of parsing scripts) The perf script outp

Re: State of AutoFDO in GCC

2021-04-26 Thread Andi Kleen via Gcc
Jan Hubicka writes: > > Is there a way to get this working w/o using older perf? It's usually rather simple to fix up autofdo for new perf. I did it before here https://github.com/andikleen/autofdo/commits/perf4-3 I think it would work always if it just ignored unknown records (which is quite p

Re: Is it very hard to implement Zero-overhead deterministic exceptions: Throwing values??

2020-06-14 Thread Andi Kleen via Gcc
sotrdg sotrdg via Gcc writes: > http://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p0709r0.pdf > > I really want this feature. How, it looks like this requires changes > on RTL, gimple and C++ front-end. Is that very hard to implement it? If you're asking about setjmp/longjmp exceptions, you ca

Re: Does gcc automatically lower optimization level for very large routines?

2019-12-31 Thread Andi Kleen
Qing Zhao writes: > (gdb) where > #0 0x00ddbcb3 in df_chain_create (src=0x631006480f08, > dst=0x63100f306288) at ../../gcc-8.2.1-20180905/gcc/df-problems.c:2267 > #1 0x001a in df_chain_create_bb_process_use ( > local_rd=0x7ffc109bfaf0, use=0x63100f306288, top_fla

Re: Can LTO minor version be updated in backward compatible way ?

2019-07-17 Thread Andi Kleen
Romain Geissler writes: > > I have no idea of the LTO format and if indeed it can easily be updated > in a backward compatible way. But I would say it would be nice if it > could, and would allow adoption for projects spread on many teams > depending on each others and unable to re-build everythin

Re: [GSoC 2019] [extending Csmith for fuzzing OpenMp extensions]

2019-03-26 Thread Andi Kleen
> That is a correct diagnostics. > > See Canonical loop form. > > test-expr One of the following: > var relational-op b > b relational-op var > > ( var relational-op b ) > is neither of those. Still seems strange to fail for some meaningle

Re: [GSoC 2019] [extending Csmith for fuzzing OpenMp extensions]

2019-03-25 Thread Andi Kleen
sameeran joshi writes: > On 3/24/19, Andi Kleen wrote: >> On Sat, Mar 23, 2019 at 11:49:11PM +0530, sameeran joshi wrote: >>> 1) check_structured_block_conditions() >>> checks for the conditions related to a structured block >>> 1.no returns in b

Re: [GSoC 2019] [extending Csmith for fuzzing OpenMp extensions]

2019-03-23 Thread Andi Kleen
On Sat, Mar 23, 2019 at 11:49:11PM +0530, sameeran joshi wrote: > 1) check_structured_block_conditions() > checks for the conditions related to a structured block > 1.no returns in block returns should be allowed inside statement expressions. > 2.no gotos > 3.no breaks > a

Re: Support for AVX512 ternary logic instruction

2019-01-20 Thread Andi Kleen
Wojciech Muła writes: > > The main concern is if it's a proper approach? Seems that to match > other logic functions, like "a & b | c", a separate pattern is required. > Since an argument can be either negated or not, and we can use three > logic ops (or, and, xor) there would be 72 patterns. So

Re: Testing compiler reliability using Csmith

2018-12-06 Thread Andi Kleen
Radu Ometita writes: > Hello everyone! > > We are working on writing a paper about testing the reliability of C > compilers by using Csmith (a random C99 program generator). > > A previous testing effort, using Csmith, found 79 GCC bugs, and 25 of > those have been marked by developers as P1 > (

Re: Source code coverage of gcc

2018-12-06 Thread Andi Kleen
sameeran joshi writes: > Hi, > I have a random C program as a test case, for which I need to do > source code coverage on gcc. > I have used the gcov tool and further the lcov tool. The percentage of > source code coverage which I get after using gcov, Is that the final % > which I need to do gcc

Re: Proposal to add FDO profile quality related diagnostics

2018-11-27 Thread Andi Kleen
> > Regarding the function level detail being too noisy : I sort of agree with > that > comment. But I am of the opinion that I would rather leave it to the user to > infer the profile quality as per the application characteristics. Makes sense I guess. But I would keep the drill down as opt-

Re: Proposal to add FDO profile quality related diagnostics

2018-11-20 Thread Andi Kleen
Indu Bhagat writes: > Proposal to add diagnostics to know which functions were not run in the > training run in FDO. Don't you think the warning will be very noisy? I assume most programs have a lot of cold error handling functions etc. that are never executed in a normal execution. Like how do

Re: Fuzzer extension for gcc

2018-06-10 Thread Andi Kleen
On Sun, Jun 10, 2018 at 12:49:44PM +0530, sameeran joshi wrote: >Hi all, I have been figuring out to work on some project, so while searching >I found fuzzer implementation project quite interesting, so please can I >get some information and links about the extension of fuzzer project fo

Re: GCC GSOC Participation

2018-03-07 Thread Andi Kleen
> I would suggest that you start with reading through Andi's email to > another student who expressed interest in that project which you can > find at: https://gcc.gnu.org/ml/gcc/2018-02/msg00216.html > > Andi, do you have any further suggestions what Prateek should check-out, > perhaps build, exa

Re: GCC GSOC Participation

2018-03-07 Thread Andi Kleen
On Wed, Mar 07, 2018 at 03:52:15AM +0530, Prathamesh Kulkarni wrote: > On 3 March 2018 at 16:22, Prateek Kalra wrote: > > Hello GCC Community, > > My name is Prateek Kalra. I am pursuing integrated dual > > degree (B.tech+M.tech) in Computer Science Software Engineering, from Gautam > > Buddha Univer

Re: GSoC

2018-02-26 Thread Andi Kleen
> If the scope of GCC still intimidates you (but we all struggle with it > sometimes, trust me), consider reaching out to Andi Kleen and discuss > his fuzzer project idea with him (he may tell you what to check out and > experiment with to get the feeling about the task at hand).

Re: gcc generated memcpy calls symbol version

2018-01-29 Thread Andi Kleen
Tom Mason writes: > Is there any way for me to force the version for these symbols as well? It seems pointless because the ABI for these symbols will never change. -Andi

Re: Google Summer of Code 2018: Call for mentors and ideas

2018-01-18 Thread Andi Kleen
Martin Jambor writes: > > Therefore I would like to ask all seasoned GCC contributors who would > like to mentor a GSoC student to send a reply to this thread with their > idea for a project. If you have an idea but you do not want to be a > mentor then I will consider it only if it is really int

Re: Byte swapping support

2017-09-13 Thread Andi Kleen
Jürg Billeter writes: > > I don't. The idea is to reverse scalar storage order for the whole > userspace process and then add byte swapping to the Linux kernel when > accessing userspace memory. This keeps userspace memory consistent > with regards to endianness, which should lead to high compatib

Re: Quantitative analysis of -Os vs -O3

2017-08-27 Thread Andi Kleen
Allan Sandfeld Jensen writes: > > Yeah. That is just more problematic in practice. Though I do believe we have > support for it. It is good to know it will automatically upgrade > optimizations > like that. I just wish there was a way to distribute pre-generated arch- > independent training dat

Re: comparing parallel test runs

2017-05-17 Thread Andi Kleen
Marek Polacek writes: > On Wed, May 17, 2017 at 09:13:40AM -0600, Jeff Law wrote: >> On 05/17/2017 04:23 AM, Aldy Hernandez wrote: >> > Hi folks. >> > >> > I've been having troubles comparing the results of different test runs >> > for quite some time, and have finally decided to whine about it.

Re: Thread-safety of a profiled binary (and GCOV runtime library)

2016-07-25 Thread Andi Kleen
Martin Liška writes: > > I'm also surprised about it :) Let's start without invention of a new flag, > I'll work on that. You definitely need a new flag: atomic or per thread instrumentation will almost certainly have significant overhead (either at run time or in memory). Just making an existin

Re: Moving to git

2015-08-26 Thread Andi Kleen
Jason Merrill writes: > > You don't even need to worry about the hash code, you can use the > timestamp by itself. Given the timestamp, > > git log -1 --until 1440153969 Consider tree merges. There's no guarantee a time stamp maps to monotonically increasing commit numbers. But I don't really

Re: [RFC] Kernel livepatching support in GCC

2015-06-09 Thread Andi Kleen
> > As I am bit concerned with performance why require nops there? Add a > > byte count number >= requested that's boundary of next instruction. When > > livepatching for return you need to copy this followed by jump back to next > > instruction. Then gcc could fill that with instructions that don't

Re: Builtin expansion versus headers optimization: Reductions

2015-06-05 Thread Andi Kleen
Ondřej Bílka writes: > > On ivy bridge I got that Using rep stosq for memset(x,0,4096) is 20% > slower than libcall for L1 cache resident data while 50% faster for data > outside cache. How do you teach compiler that? It would be in theory possible with autofdo. Profile with a cache miss event. C

Re: Builtin expansion versus headers optimization: Reductions

2015-06-04 Thread Andi Kleen
Ondřej Bílka writes: > As I commented on libc-alpha list that for string functions a header > expansion is better than builtins from maintenance perspective and also > that a header is lot easier to write and review than doing that in gcc > Jeff said that it belongs to gcc. When I asked about be

Re: [RFC] Kernel livepatching support in GCC

2015-06-04 Thread Andi Kleen
> Rather than just a sequence of NOP's, should the first NOP be a > unconditional branch to the beginning of the real function? I don't > know if this applies to AArch64 cpus, but I believe some cpus can handle > such branches already in the decode unit, thus avoiding any extra cycles > for skippi

Re: [RFC] Kernel livepatching support in GCC

2015-05-28 Thread Andi Kleen
> Our proposal is that instead of adding -mfentry/-mnop-count/-mrecord-mcount > options to other architectures, we should implement a target-independent > option -fprolog-pad=N, which will generate a pad of N nops at the beginning > of each function and add a section entry describing the pad sim

Re: AutoFDO profile toolchain is open-sourced

2015-04-21 Thread Andi Kleen
On Wed, Apr 22, 2015 at 05:15:47AM +0200, Andi Kleen wrote: > On Tue, Apr 21, 2015 at 01:52:18PM -0700, Dehao Chen wrote: > > Andi, > > > > Thanks for the patches. Turns out that the first 3 patches are already > > in, the correct upstream quipper re

Re: AutoFDO profile toolchain is open-sourced

2015-04-21 Thread Andi Kleen
On Tue, Apr 21, 2015 at 01:52:18PM -0700, Dehao Chen wrote: > Andi, > > Thanks for the patches. Turns out that the first 3 patches are already > in, the correct upstream quipper repository is: > > https://chromium.googlesource.com/chromiumos/platform2/+/master/chromiumos-wide-profiling/ > > The

Re: AutoFDO profile toolchain is open-sourced

2015-04-21 Thread Andi Kleen
On Tue, Apr 21, 2015 at 10:27:49AM -0700, Dehao Chen wrote: > In that case, we should get quipper fixed upstream to accommodate new > format. (Maybe they already fixed it, I will do a batch sync to make > quipper up-to-date). From a quick look at http://git.chromium.org/gitweb/?p=chromiumos/pla

Re: AutoFDO profile toolchain is open-sourced

2015-04-21 Thread Andi Kleen
> > BTW the biggest problem with autofdo currently is that it is > > quite bitrotten and supports only several years old perf. > > So all of this above will only work with old distributions, > > unless you compile an old perf utility first. > > Do you mean newer perf does not support LBR (-b) any

Re: AutoFDO profile toolchain is open-sourced

2015-04-21 Thread Andi Kleen
Ilya Palachev writes: > > But why create_gcov does not inform about that (no branch events were > found)? It creates empty gcov file and says nothing :( > > Moreover, in the mentioned README it is said that perf should also be > executed with option -e BR_INST_RETIRED:TAKEN. Standard perf doesn't

Re: Why not implementation of interrupt attribute on IA32/x86-64

2015-03-13 Thread Andi Kleen
Didier Garcin writes: > many OS hobbyist developers would be pleased GCC implements the > interrupt or interrupt_handler attribute for Intel architecture. > > Would it be so difficult to implement for this architecture ? There are lots of different ways to implement interrupts on x86 (e.g. what

Re: Missed optimization case

2014-12-23 Thread Andi Kleen
Matt Godbolt writes: > > I'll happily file a bug if necessary but I'm not clear in what phase > the optimization opportunity has been missed. Please file a bug with a test case. No need to worry about the phase too much initially, just fill in a reasonable component. -Andi

graphite in -O3

2014-11-16 Thread Andi Kleen
Is there any specific reason why none of the graphite loop optimizations (loop-block, loop-interchange, loop-strip-mine, loop-jam) are enabled with -O3 or -Ofast? I assume doing so would make them much more widely used. Perhaps that would be something to consider for 5.0? -Andi

Re: Autotuning parameters/heuristics within gcc - best place to start?

2014-09-26 Thread Andi Kleen
Robert Stevenson <9goli...@gmail.com> writes: > I am planning to begin a project to explore the space of tuning gcc > internals in an effort to increase performance. I am wondering if > anyone can point to me any parameterizations, heuristics, or > priorities functions within gcc that can be tuned

Re: Skipping assembler when producing slim LTO files

2014-09-24 Thread Andi Kleen
Jan Hubicka writes: Nice patch. > The implementation is pretty straightforward except for -fbypass-asm requiring > one existing OBJ file to fetch target's file attributes from. This is > definitely not optimal, but libiberty currently can't build output files from > scratch. As Ian suggested, I p

Re: GCC version bikeshedding

2014-07-20 Thread Andi Kleen
Paulo Matos writes: > > That's what I understood as well. Someone mentioned to leave the patch > level number to the distros to use which sounded like a good idea. Sounds like a bad idea, as then there would be non unique gcc versions. redhat gcc 5.0.2 potentially being completely different from

Re: question about -ffast-math implementation

2014-06-02 Thread Andi Kleen
Mike Izbicki writes: > Right, but I've never taken a look at the gcc codebase. Where would I > start looking for the relevant files? Is there a general introduction > to the codebase anywhere that I should start with? grep for all the flags set in the two functions below (from gcc/opts.c), wit

Re: Request for discussion: Rewrite of inline assembler docs

2014-02-27 Thread Andi Kleen
dw writes: > > What would you say to something like this: > > "Since GCC does not parse the asm, it has no visibility of any static > variables or functions it references. This may result in those > symbols getting discarded by GCC as unused. To avoid this problem, > list the symbols as inputs o

Re: Request for discussion: Rewrite of inline assembler docs

2014-02-27 Thread Andi Kleen
Andrew Haley writes: > Over the years there has been a great deal of traffic on these lists > caused by misunderstandings of GCC's inline assembler. That's partly > because it's inherently tricky, but the existing documentation needs > to be improved. > > dw has done a fairly thorough reworking

Re: [RFC] Replace Java with Go in default languages

2013-11-11 Thread Andi Kleen
Jeff Law writes: > Thoughts or comments? If no one tests java completely then it will quickly bitrot, won't it? So ideally some bot would still regularly build/test it. If you don't do that you could as well just remove the code. The underlying problem seems to be the requirement for each contri

Re: Git mirror: asan branch

2013-10-29 Thread Andi Kleen
On Tue, Oct 29, 2013 at 05:28:40PM +0100, Tom de Vries wrote: > On 24/10/13 07:05, Andi Kleen wrote: > > Tom de Vries writes: > >> ... > >> Can you translate the last sentence into shell/git command(s)? > > > > It would be far better to just centrally mirr

Re: Git mirror: asan branch

2013-10-23 Thread Andi Kleen
Tom de Vries writes: > ... > Can you translate the last sentence into shell/git command(s)? It would be far better to just centrally mirror all branches in SVN as standard git branches. Then all these problems wouldn't occur. As far as I can tell all the workarounds proposed so far have some n

Re: Adding all SVN branches to git tree

2013-10-23 Thread Andi Kleen
On Wed, Oct 23, 2013 at 11:06:39AM +0200, Thomas Schwinge wrote: > Hi! > > On Wed, 23 Oct 2013 10:59:57 +0200, Andreas Schwab wrote: > > Andi Kleen writes: > > > > > - Any chance to add all branches from SVN to the standard git mirror? > > > > Actual

Adding all SVN branches to git tree

2013-10-23 Thread Andi Kleen
Hi, I wanted to experiment with some of the google branch features (like AutoFDO). But the google branches are not in the standard git tree, only in SVN. My workflow is all git based, so it was difficult to fit a SVN checkout in. I tried to git svn clone myself, but it takes a very long time and w

Re: RFC: SIMD pragma independent of Cilk Plus / OpenMPv4

2013-09-10 Thread Andi Kleen
Tobias Burnus writes: > > Those require -fcilkplus and -fopenmp, respectively, and activate much > more. The question is whether it makes sense to provide a means to ask > the compiler for SIMD vectorization without enabling all the other things > of Cilk Plus/OpenMP. What's your opinion? If you

Re: Propose moving vectorization from -O3 to -O2.

2013-08-21 Thread Andi Kleen
One problem I have with the vectorizer on by default is that it enables tree loop unrolling, which sometimes generates quite bloated/weird code and it's unclear if it helps. Would it be possible to only do the unrolling when vectorizing? Also I suspect the trade off on vectorizing is different b

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-12 Thread Andi Kleen
"H. Peter Anvin" writes: > However, I would really like to > understand what the value is. Probably very little. When I last looked at it, the main overhead in perf currently seems to be backtraces and the ring buffer, not this code. -Andi -- a...@linux.intel.com -- Speaking for myself only

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Andi Kleen
Steven Rostedt writes: Can't you just use -freorder-blocks-and-partition? This should already partition unlikely blocks into a different section. Just a single one of course. FWIW the disadvantage is that multiple code sections tends to break various older dwarf unwinders, as it needs dwarf3 la

Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-10 Thread Andi Kleen
Xinliang David Li writes: > What about introducing a new blanket warning kind that excludes > anything with false positives? something like -WALL ? This still doesn't help if any new compiler version could ever add a new warning. -Andi -- a...@linux.intel.com -- Speaking for myself only

Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-10 Thread Andi Kleen
> No. People expect that -Werror turns warnings into errors. > That is what we have documented for years. > Starting to special case these things is a royal road to confusion, > and a slippery slope. Ok, I will keep removing -Werrors from Makefiles then. FWIW basically -Werror -Wall defines a co

Re: Should -Wmaybe-uninitialized be included in -Wall?

2013-07-10 Thread Andi Kleen
Andrew Haley writes: > On 07/09/2013 12:59 PM, Andreas Arnez wrote: >> With this situation at hand, I wonder whether it's a good idea to keep >> maybe-uninitialized included in -Wall. Projects which have been using >> "-Wall -Werror" successfully for many years are now forced to >> investigate n

Re: Libitm issues porting to POWER8 HTM

2013-06-19 Thread Andi Kleen
On Wed, Jun 19, 2013 at 11:04:25AM -0500, Peter Bergner wrote: > On Tue, 2013-06-18 at 21:48 +0200, Andi Kleen wrote: > > > Given Torvald's comment, can you verify whether your hw txn succeeds > > > (all the way to commit) or whether it is failing and somehow skips > &

Re: Libitm issues porting to POWER8 HTM

2013-06-18 Thread Andi Kleen
> Given Torvald's comment, can you verify whether your hw txn succeeds > (all the way to commit) or whether it is failing and somehow skips > the fall through code that is hanging for us (Power and S390)? All the 3 transactions in reentrant.c abort. That's not surprising, because there are usually

Re: Libitm issues porting to POWER8 HTM

2013-06-18 Thread Andi Kleen
Peter Bergner writes: > > I have yet to track down who has the write lock and why, but I am working > towards that. Talking with Andreas, he said he is seeing the same failure > on S390, so I'm wondering whether this might be a generic libitm issue > and it might hit Intel too. Does anyone know

Re: RFD: Should __builtin_constant_p approximate CONSTANT_P ?

2013-04-23 Thread Andi Kleen
Joern Rennecke writes: > > More importantly, addresses that become a SYMBOL_REF should be considered > constant. I.e. In particular, the addresses of variables with static storage. > I have a simple patch to recognize these as constants; > do people agree that this is the right thing to do? if

Re: History question: Thread-safe profiling instrumentation

2013-04-22 Thread Andi Kleen
Bill Schmidt writes: > > My reason for asking involves a large heavily-threaded application that > is improved by feedback-directed optimization on some platforms, but not > on others. One theory is that a defective profile is generated due to > counter dropouts from contention. I'm somewhat ske

Re: If you had a month to improve gcc build parallelization, where would you begin?

2013-04-03 Thread Andi Kleen
Simon Baldwin writes: > Suppose you had a month in which to reorganise gcc so that it builds > its 3-stage bootstrap and runtime libraries in some massively parallel > fashion, without hardware or resource constraints(*). How might you > approach this? Add support for truly caching configure in

Re: Compiler speed (vanilla vs. LTO, PGO and LTO+PGO)

2013-03-25 Thread Andi Kleen
Markus Trippelsdorf writes: > > So it appears, contrary to the advice given above, that it is not useful > to build gcc 4.8.0 with the lto option at the moment. Did you build firefox/kernel with debug info on/off? Often debug info on changes the compiler performance significantly, as it generate

Re: How To Add a Sequence Point?

2013-02-03 Thread Andi Kleen
Andrew Pinski writes: > On Sat, Feb 2, 2013 at 5:10 PM, Jeffrey Walton wrote: >> Thanks Andrew. >> >> So, it looks like I don't understand sequence points. Please forgive >> my ignorance. >> >> What does C/C++ and GCC offer to ensure writes are complete before >> reads are performed on a value i

Re: RFC: [ARM] Disable peeling

2012-12-12 Thread Andi Kleen
"H.J. Lu" writes: > > i386.c has > >{ > /* When not optimize for size, enable vzeroupper optimization for > TARGET_AVX with -fexpensive-optimizations and split 32-byte > AVX unaligned load/store. */ This is only for the load, not for deciding whether peeling is worthw

Re: RFC: [ARM] Disable peeling

2012-12-10 Thread Andi Kleen
Jan Hubicka writes: > Note that I think Core has similar characteristics - at least for string > operations > it fares well with unaligned accesses. Nehalem and later has very fast unaligned vector loads. There's still some penalty when they cross cache lines however. iirc the rule of thumb i

Re: What happened to the IRA interprocedural reg-alloc work? (function_used_regs and friends)

2012-10-16 Thread Andi Kleen
Steven Bosscher writes: > > I suppose it's theoretically possible to make a good initial guess of > what registers might be not-clobbered by a function even if the ABI > says so. For instance, perhaps it's possible to assume that a function > that doesn't touch any variables in a floating point mo

Re: What happened to the IRA interprocedural reg-alloc work? (function_used_regs and friends)

2012-10-16 Thread Andi Kleen
Vladimir Makarov writes: >> > As I remember, the performance improvement from this optimization was > very small. There were problems in reviewing IRA and I decided to > simplify this task. > > May be it is worth to return to this work. ... especially if you could make it work with LTO. -Andi

Re: 50% slowdown with LTO

2012-08-13 Thread Andi Kleen
Ian Lance Taylor writes: > > Figuring out what has gone wrong is like optimizing any program. Get > a profile for your program, e.g., using -pg. Build the program with > and without -flto, run it, and look at the resulting profiles. A 50% > slowdown should be fairly obvious. I would guess that

Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-08 Thread Andi Kleen
Elmar Krieger writes: > > The slowdown is not the same with other files, so I'm essentially sure > that this specific source file has some 'feature' that catches GCC at > the wrong leg. This raises my hopes that one of the GCC experts wants > to take a look at it. The code is confidential, You co

Re: The state of glibc libm

2012-03-15 Thread Andi Kleen
> SSE ABI entries for i?86 in glibc were rejected. I proposed them like > 4-5 years ago to make -mfpmath=sse not suck. With the new libm hopefully this can be revisited. But there's the ABI and there's the internal implementation. My point was just that relying on x87 fully again does not reall

Re: The state of glibc libm

2012-03-14 Thread Andi Kleen
On Wed, Mar 14, 2012 at 09:04:53PM +0000, Joseph S. Myers wrote: > On Wed, 14 Mar 2012, Andi Kleen wrote: > > > One big win alone on 32bit x86 would be to use a SSE ABI for libm > > by default. > > I haven't checked, but I'd hope x32 does that as a better 32-b

Re: The state of glibc libm

2012-03-14 Thread Andi Kleen
Jeff Law writes: > On 03/14/2012 10:30 AM, Joseph S. Myers wrote: >> >> I'd say that "better performance with the potential loss of accuracy" >> should be covered by -ffast-math - that GCC should generate direct use of >> fsin/fcos instructions for sin/cos for -O2 -funsafe-math-optimizations on >

Re: User directed Function Multiversioning (MV) via Function Overloading

2012-03-07 Thread Andi Kleen
Richard Guenther writes: > > I don't like specifying 'arch' at all. Instead you _always_ want architecture > feature tests, not architecture tests. Because, does amdfam10 also cover > bdver1? [it can't! bdver1 does no longer have 3dnow! but that's entirely > surprising for a user] There's stil

Re: lto pseudo-object files and fixed registers

2012-02-07 Thread Andi Kleen
Richard Guenther writes: > > You then can do > > gcc $OPTIONS -flto a.c -o a.o > gcc $OPTIONS -flto b.c -o b.o > gcc $OPTIONS -ffixed-r9 -ffixed-r10 -flto d.c -o d.o > gcc $OPTIONS -ffixed-r9 -ffixed-r10 -flto e.c -o e.o > gcc $OPTIONS -flto a.o b.o -o non-fixed-reg-part.o -r -nostdlib > gcc

Re: LTO multiple definition failures

2012-01-02 Thread Andi Kleen
> Anyway, the problem here isn't that I particularly care about coming up > with some workaround to make LTO work, but rather that tests from the > gcc testsuite are failing on this target because of what looks like > buggy LTO behavior instead of bugs in the target support, and I wanted > to b

Re: LTO multiple definition failures

2012-01-01 Thread Andi Kleen
Sandra Loosemore writes: > > I'm still finding my way around LTO; can anyone who's more familiar > with this help narrow down where to look for the cause of this? I > don't even know if this is a compiler or ld bug at this point. I'm I would look into the interaction between the LTO plugin and

Re: Transactional Memory documentation in extend.texi

2011-11-17 Thread Andi Kleen
On Thu, Nov 17, 2011 at 03:14:57PM +0100, Torvald Riegel wrote: > We are aware that the TM language constructs should be documented in > extend.texi. However, the most recent public version of the C++ TM > specification document is outdated, and a new version is supposed to be > released in a few

Re: asm in inline function invalidating function attributes?

2011-10-17 Thread Andi Kleen
> > At least the Linux kernel has a couple such cases ("nasty inline asm to > > hide register clobbering in calls") and it's always ugly and hard to > > maintain. > > It would simply be an alternate ABI that makes all registers callee-saved? Yes exactly that. -Andi -- a...@linux.intel.com -- Sp

Re: asm in inline function invalidating function attributes?

2011-10-16 Thread Andi Kleen
Ulrich Drepper writes: > > It's not guaranteed to work in general. The problem to solve is that > I know the function which is called is not clobbering any registers. > If I leave it with the normal function call gcc has to spill > registers. If I can hide the function call the generated code ca

Re: RFC: Add --plugin-gcc option to ar/nm

2011-10-15 Thread Andi Kleen
"H.J. Lu" writes: > Hi, > > ---plugin option for ar/nm is very long. I am proposing to add > a --plugin-gcc option. It can be implemented with > > 1. Move LTOPLUGINSONAME from gcc to config/plugins.m4. > 2. Define LTOPLUGINSONAME for ar/nm. > 3. For --plugin-gcc, ar/nm call popen using environ

Re: Vector alignment tracking

2011-10-13 Thread Andi Kleen
> Or I am missing something? I often see the x86 vectorizer with -mtune=generic generate a lot of complicated code just to adjust for potential misalignment. My thought was just if the alias oracle knows what the original declaration is, and it's available for changes (e.g. LTO), it would be like

Re: Vector alignment tracking

2011-10-13 Thread Andi Kleen
Artem Shinkarov writes: > > 1) Currently in C we cannot provide information that an array is > aligned to a certain number. The problem is hidden in the fact, that Have you considered doing it the other way round: when an optimization needs something to be aligned, make the declaration aligned?
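The suggestion above is about the compiler raising a declaration's alignment on its own. A hand-written equivalent, as a sketch only, assuming GCC's `aligned` attribute and `__builtin_assume_aligned`; the names `samples`, `is_aligned`, and `sum_aligned` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer: the declaration itself carries the 32-byte alignment. */
static float samples[1024] __attribute__((aligned(32)));

/* Returns 1 if p is aligned to `align` bytes; align must be a power of two. */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical kernel: promises the optimizer that p is 32-byte aligned,
 * so the vectorizer need not emit code to handle misalignment.
 * Passing a pointer that breaks this promise is undefined behavior. */
static float sum_aligned(const float *p, size_t n)
{
    const float *q = __builtin_assume_aligned(p, 32);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += q[i];
    return s;
}
```

Calling `sum_aligned(samples, 1024)` is safe because the declaration guarantees what the builtin asserts; whether the misalignment prologue actually disappears depends on the compiler version and flags.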

Re: Merging gdc (GNU D Compiler) into gcc

2011-10-05 Thread Andi Kleen
David Brown writes: > > Some toolchains are configured to have a series of "init" sections at > startup (technically, that's a matter of the default linker scripts > and libraries rather than the compiler). You can get code to run at > specific times during startup by placing the instructions dir
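The quoted message describes placing code in linker "init" sections directly. A related, higher-level mechanism in GCC is the `constructor` function attribute, sketched here under the assumption of a hosted target; `startup_ran` and `early_setup` are hypothetical names, and this is not necessarily the approach the thread settled on.

```c
/* Set by the constructor so later code can check that it ran. */
static int startup_ran = 0;

/* GCC arranges for this function to be called before main(). */
__attribute__((constructor))
static void early_setup(void)
{
    startup_ran = 1;
}
```

By the time `main` starts, `startup_ran` is 1. The order among several constructors can be influenced with an optional priority argument, but that is target-dependent.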

Re: Questions Regarding DWARF

2011-09-08 Thread Andi Kleen
Kevin Polulak writes: > > I've tried to gain some knowledge by digging through the GCC source > but haven't come up with much other than the values of the DW_* > constants which isn't that important. Are there any files in > particular I should be looking at? From the gcc internals manual: * D

Re: Fortran's DO CONCURRENT - make use of it middle-end-wise

2011-09-04 Thread Andi Kleen
Tobias Burnus writes: > > The plan is to translate it as normal loop; however, it would be > useful if this non-order-dependence could be used by the middle end > (general optimization or at least for -floop-parallelize-all / > -ftree-parallelize-loops). Is there a way to tell the middle-end about

Re: An unusual x86_64 code model

2011-08-11 Thread Andi Kleen
Jed Davis writes: > > But is that the right way to do that, do people think? Or should I > look into making this its own -mcmodel option? (Which would raise the I would make it a new -mcmodel=... option. > question of what to call it -- medsmall? smallhigh? altkernel?) Or is smallhigh sounds
