[Bug tree-optimization/19590] IVs with the same evolution not eliminated
--- Comment #10 from stevenb dot gcc at gmail dot com 2006-04-08 21:13 --- Subject: Re: IVs with the same evolution not eliminated > The new SCC value numberer for PRE i'm working on gets this case right (and > this is in fact, one of the advantages of SCC based value numbering). Is the SCC-VN patch I posted long ago still of some use to you, or are you writing something new from scratch? (See http://gcc.gnu.org/ml/gcc-patches/2004-01/msg00211.html) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19590
[Bug tree-optimization/31849] [4.3/4.4 Regression] Code size increased with PR 31360 (IV-opts not understanding autoincrement)
--- Comment #44 from stevenb dot gcc at gmail dot com 2008-12-10 22:30 --- Subject: Re: [4.3/4.4 Regression] Code size increased with PR 31360 (IV-opts not understanding autoincrement) Joern, can you attach the updated patch? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31849
[Bug middle-end/38584] [4.3/4.4 Regression] Inline heuristics run even at -O0
--- Comment #7 from stevenb dot gcc at gmail dot com 2009-01-01 13:42 --- Subject: Re: [4.3/4.4 Regression] Inline heuristics run even at -O0 Note that the compile time at, say, -O1 for 4.3 vs. 4.4 is also a huge difference for the test case (4.4 much slower, in part due to the expensive heuristic). Therefore, IMHO, this is still a 4.4 regression too. We should not be running such expensive algorithms just for inline heuristics. We need to figure out a cheaper heuristic. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38584
[Bug tree-optimization/35805] [ira] error in start_allocno_priorities, at ira-color.c:1806
--- Comment #13 from stevenb dot gcc at gmail dot com 2009-01-02 18:45 --- Subject: Re: [ira] error in start_allocno_priorities, at ira-color.c:1806 On Fri, Jan 2, 2009 at 7:37 PM, Paolo Bonzini wrote: >>> At this point, if your patch costs say 0.3%, and removing all traces >>> DF_LR_RUN_DCE (instead scheduling a dozen more pass_fast_rtl_dce in >>> passes.c) costs 0.5%, I'd rather see the latter, at least it's easier to >>> look for opportunities to remove some useless DCE. > > I'll try to do this for 4.5. It might be more worthwhile to just "fix" IRA to use DF_LIVE (which Vlad should have done in the first place). Then we wouldn't need Kenny's patch and DF_LR_RUN_DCE would still be essentially free. Gr. Steven -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35805
[Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
--- Comment #18 from stevenb dot gcc at gmail dot com 2009-07-23 22:23 --- Subject: Re: [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01 I had the patch ready but Matz' PRE patch means I have to rework things a bit. Since I only have time for this in weekends... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785
[Bug c/4076] -Wunused doesn't warn about static function only called by itself.
--- Comment #8 from stevenb dot gcc at gmail dot com 2007-01-27 19:58 --- Subject: Re: -Wunused doesn't warn about static function only called by itself. Just one for everything should suffice. Or, if you prefer, you can remove that one function with a separate patch first, which, I believe, you can commit as obviously correct (given that the author of that function and authority of its usage already ack'ed that the function is dead code). Thanks for working on this. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=4076
[Bug c/4076] -Wunused doesn't warn about static function only called by itself.
--- Comment #11 from stevenb dot gcc at gmail dot com 2007-01-29 18:22 --- Subject: Re: -Wunused doesn't warn about static function only called by itself. If it is unused, don't hesitate to remove it. :-) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=4076
[Bug target/27869] "-O -fregmove" handles SSE scalar instructions incorrectly
--- Comment #8 from stevenb dot gcc at gmail dot com 2007-04-06 16:43 --- Subject: Re: "-O -fregmove" handles SSE scalar instructions incorrectly > The attached patch to remove '%' seems correct to me. Merge operating > wrapping the (commutative) plus/mult/min/max is not commutative, so '%' > is wrong. Or am I missing something? The commutative alternative asm output should also be removed. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27869
[Bug fortran/29635] debug info of modules
--- Comment #4 from stevenb dot gcc at gmail dot com 2007-08-12 10:36 --- Subject: Re: debug info of modules This is still on my TODO-list, but not for GCC 4.3. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29635
[Bug middle-end/34884] [4.3 Regression] gfortran.dg/array_constructor_9.f90
--- Comment #22 from stevenb dot gcc at gmail dot com 2008-01-21 14:29 --- Subject: Re: [4.3 Regression] gfortran.dg/array_constructor_9.f90

On 21 Jan 2008 13:25:23 -, zadeck at naturalbridge dot com wrote:
> I understand that, that is why if the pass does not specify DF_EQ_NOTES,
> the lr (and the rest of the info) stays as it is now. But if you are
> building chains out of them, then you are treating them as live anyway.

This is not really true. You are treating the registers as "reaching", i.e. even though the register is not live at the point of the REG_EQ* note, it _could_ be live holding the value expected for the REG_EQ* note.

This difference between live registers and reaching definitions is also why I feel so uncomfortable about the RD patch. After trimming, DEFs that reach a certain point in the program don't show up in the reaching definitions anymore. You've already found one example of how this can be confusing (the loop IV problem). It also breaks my new const/copy prop pass, which expects definitions to reach points where the register of the reaching def is not live.

What is confusing about all this is that in GCC there is nothing to prevent you from inserting a new SET of a register between a previous set of the register and a use in a REG_EQ* note, if the register is dead at the point of the REG_EQ* note. E.g. it is perfectly OK to have a situation like this:

    (insn 1 (set (reg1) (...)))
    (insn 2 (...) REG_DEAD (reg1))
    (insn 3 (...) REG_EQUAL (... ((reg1) ...)))

and nothing in the compiler will object if you would insert a new insn like so:

    (insn 1 (set (reg1) (...)))
    (insn 2 (...) REG_DEAD (reg1))
    (insn 4 (set (reg1) (...)))
    (insn 3 (...) REG_EQUAL (... ((reg1) ...)))

even though this would probably result in wrong code. The way GCC avoids this kind of situation is by never re-using registers this way, unless the new definition sets reg1 to the same value (gcse PRE does this, for example).

The effect of including REG_* notes in the LR and LIVE problems would be to extend the live ranges of pseudos into regions where they are not actually live. What it comes down to is that liveness does not seem to be a good condition for trimming reaching definitions. Trimming LIVE with LR makes sense because the result of the problem does not change, and the nature of the problems is the same (i.e. compute two meanings of liveness for registers). Live registers and reaching definitions, on the other hand, are dissimilar problems. For LIVE, the effect of trimming is to not compute partial availability in regions where a register is not live. The similar condition for reaching definitions is not liveness, but absence of uses. To trim reaching definitions, one should really be looking at the last reachable use of a definition, and trim from there. I don't know what problem computes the last use of a register, but it may well be a problem that is equivalent to the LR problem, but considering *all* uses including REG_EQ* uses, instead of only real uses. But I haven't thought about this much yet, I must admit. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34884
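
The distinction drawn in this comment between "live" and "reaching" can be sketched for a single register in straight-line code. This is only an illustration under simplifying assumptions: the `struct insn`, `enum kind`, `live_before` and `reaching_def` names are hypothetical and are not GCC's DF data structures, and there is no control flow.

```c
#include <assert.h>

/* Hypothetical model of what one insn does to a single register.
   NOTE_USE models a mention inside a REG_EQUAL/REG_EQUIV-style note. */
enum kind { SET, REAL_USE, NOTE_USE };

struct insn {
  int uid;          /* insn number, as in the RTL listing */
  enum kind kind;
};

/* Liveness just before position POS, counting only real uses:
   the register is live if a real use is found before the next SET. */
static int live_before(const struct insn *code, int n, int pos)
{
  for (int i = pos; i < n; i++) {
    if (code[i].kind == REAL_USE)
      return 1;
    if (code[i].kind == SET)
      return 0;
  }
  return 0;
}

/* The uid of the SET that reaches position POS, or -1 if none.
   In straight-line code this is simply the closest preceding SET. */
static int reaching_def(const struct insn *code, int n, int pos)
{
  assert(pos >= 0 && pos <= n);
  for (int i = pos - 1; i >= 0; i--)
    if (code[i].kind == SET)
      return code[i].uid;
  return -1;
}
```

Modelling the first listing as SET (insn 1), REAL_USE (insn 2), NOTE_USE (insn 3), the register is not live at insn 3, yet the definition from insn 1 still reaches it. After inserting insn 4 as a SET before insn 3, liveness at the insertion point raises no objection (the register is dead there), but the definition reaching the note silently becomes insn 4.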
[Bug tree-optimization/17863] [4.0/4.1/4.2/4.3 Regression] performance loss (not inlining as much??)
--- Comment #37 from stevenb dot gcc at gmail dot com 2008-01-30 20:13 --- Subject: Re: [4.0/4.1/4.2/4.3 Regression] performance loss (not inlining as much??) > Those seems to be all just array manipulations. AFAICT, they are exactly in the form that some targets like it (e.g. auto-inc/dec and SMALL_REGISTER_CLASS targets). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17863
[Bug target/35045] [4.3 Regression] gcc-4.3 generates wrong code on i386 with -O3
--- Comment #8 from stevenb dot gcc at gmail dot com 2008-02-01 11:51 --- Subject: Re: [4.3 Regression] gcc-4.3 generates wrong code on i386 with -O3 I would say it is a target issue if the target return insn does not mention that %edx is used. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35045
[Bug target/35045] [4.3 Regression] gcc-4.3 generates wrong code on i386 with -O3
--- Comment #22 from stevenb dot gcc at gmail dot com 2008-02-01 14:55 --- Subject: Re: [4.3 Regression] gcc-4.3 generates wrong code on i386 with -O3 Could you retain the "gcc_assert (HARD_REGISTER_P (x));" please? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35045
[Bug target/35045] [4.3 Regression] gcc-4.3 generates wrong code on i386 with -O3
--- Comment #18 from stevenb dot gcc at gmail dot com 2008-02-01 14:14 --- Subject: Re: [4.3 Regression] gcc-4.3 generates wrong code on i386 with -O3 Why would we be calling expand_null_return to begin with, if there is a proper return statement? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35045
[Bug middle-end/30595] gcc3.4.6 generates incorrect ppc32 code for combination of bitfields and shifts
--- Comment #5 from stevenb dot gcc at gmail dot com 2009-02-06 22:40 --- Subject: Re: gcc3.4.6 generates incorrect ppc32 code for combination of bitfields and shifts > Whilst I am not complaining about 3.4 not being supported, I think it is > a pretty poor show that you are not able to reproduce it. Did anyone > even try? Yes, there actually was a duplicate bug report for this, iirc. We don't do a good communication job in our bug bashing efforts. It is, well, just hard, with so many bugs and so few people who are willing to wade through the long list of bug reports. I'm sorry about that... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30595
[Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
--- Comment #92 from stevenb dot gcc at gmail dot com 2009-02-14 14:42 --- Subject: Re: [4.3/4.4 Regression] Inordinate compile times on large routines Re: Comment #88 I think the patch is perfectly acceptable as a stop-gap solution. I don't think we have anything better for 4.4. Maybe you can add a FIXME, though... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
[Bug tree-optimization/26854] [4.3/4.4 Regression] Inordinate compile times on large routines
--- Comment #95 from stevenb dot gcc at gmail dot com 2009-02-15 11:26 --- Subject: Re: [4.3/4.4 Regression] Inordinate compile times on large routines Re: Comment #94 The trouble with LCM in RTL (i.e. GCSE-PRE) is not that it is slow (or that it is disabled -- istr it is enabled at -O2), and also not that it is edge based. The problem is that it doesn't handle cascading expressions, because that just doesn't fit in the LCM framework. You have to iterate RTL GCSE-PRE to move the same invariants as what RTL LICM (i.e. loop-invariant.c) can achieve. (GCSE-PRE is old code from a time when GCC didn't really have a proper CFG. It is edge based because for block based you need critical edge splitting, which was prohibitively expensive in the Old Days. Nowadays, gcse.c+lcm.c works in cfglayout mode and pre-splitting critical edges would be cheap, so it would be a good idea to experiment with a block based GCSE-PRE rewrite...) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
[Bug middle-end/12392] very long optimized compile
--- Comment #25 from stevenb dot gcc at gmail dot com 2009-02-23 17:47 --- Subject: Re: very long optimized compile Re Comment #24: I can look into it... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12392
[Bug rtl-optimization/31849] [4.2/4.3 Regression] Code size regression caused by fix to PR 31360
--- Comment #6 from stevenb dot gcc at gmail dot com 2007-05-07 09:46 --- Subject: Re: [4.2/4.3 Regression] Code size regression caused by fix to PR 31360 Constant / copy simplifications should be done in at least CSE, fwprop, and the gcse CPROP passes (we run CPROP three times!). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31849
[Bug rtl-optimization/31987] [4.3 Regression] ICE in remove_insn, at emit-rtl.c:3579 at -O3
--- Comment #6 from stevenb dot gcc at gmail dot com 2007-06-13 05:22 --- Subject: Re: [4.3 Regression] ICE in remove_insn, at emit-rtl.c:3579 at -O3 I'll take a look this weekend. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31987
[Bug tree-optimization/38785] [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01
--- Comment #25 from stevenb dot gcc at gmail dot com 2010-02-19 23:32 --- Subject: Re: [4.3/4.4/4.5 Regression] huge performance regression on EEMBC bitmnp01 On 2/19/10, drow at gcc dot gnu dot org wrote: > --- Comment #24 from drow at gcc dot gnu dot org 2010-02-19 14:08 --- > If no one else has EEMBC available, ask me and we can verify any fix. We've > been using Steven's and Joern's patches; we tried other approaches, but in the > end we weren't able to come up with any other approach that worked as well. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785
[Bug target/43729] Mach-O LTO support needed for darwin
--- Comment #7 from stevenb dot gcc at gmail dot com 2010-04-15 14:03 --- Subject: Re: Mach-O LTO support needed for darwin > Can we just use the LTO COFF patch...as a template? That is certainly my plan, yes. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43729
[Bug target/43729] Mach-O LTO support needed for darwin
--- Comment #9 from stevenb dot gcc at gmail dot com 2010-04-26 16:06 --- Subject: Re: Mach-O LTO support needed for darwin Mach-O section names are too short, but I have solved this with a separate section with section names in a strings table. This is similar to the solution from lto-coff. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43729
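
The workaround described in this comment (keeping the long names in a strings table and referring to them from short section names) can be sketched roughly as follows. This is not the lto-macho or lto-coff code: `struct name_table`, `name_table_add`, `make_short_name`, `name_table_lookup` and the `__s<hex offset>` encoding are all hypothetical names chosen for illustration. The only format fact relied on is that the `sectname` field of a Mach-O section header holds 16 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MACHO_SECTNAME_MAX 16   /* size of sectname in a Mach-O section header */
#define STRTAB_CAPACITY 4096    /* arbitrary capacity for this sketch */

/* Hypothetical strings table holding the full (long) section names,
   stored back to back, each terminated by a NUL byte. */
struct name_table {
  char data[STRTAB_CAPACITY];
  size_t used;
};

/* Append NAME and return its offset, or (size_t)-1 if the table is full. */
static size_t name_table_add(struct name_table *t, const char *name)
{
  size_t len = strlen(name) + 1;
  if (len > STRTAB_CAPACITY - t->used)
    return (size_t)-1;
  size_t off = t->used;
  memcpy(t->data + off, name, len);
  t->used += len;
  return off;
}

/* Build a short stand-in name that encodes the table offset in hex.
   The "__s" prefix is an assumption of this sketch. */
static void make_short_name(char out[MACHO_SECTNAME_MAX + 1], size_t off)
{
  int n = snprintf(out, MACHO_SECTNAME_MAX + 1, "__s%zx", off);
  assert(n > 0 && n <= MACHO_SECTNAME_MAX);
}

/* Map a short stand-in name back to the long name, or NULL on failure. */
static const char *name_table_lookup(const struct name_table *t,
                                     const char *short_name)
{
  size_t off;
  if (sscanf(short_name, "__s%zx", &off) != 1 || off >= t->used)
    return NULL;
  return t->data + off;
}
```

A writer would call `name_table_add` for each long section name, emit the section under the name produced by `make_short_name`, and emit the table itself as one extra section; a reader would load that section first and resolve every other section through `name_table_lookup`.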
[Bug debug/42630] "-fcompare-debug failure (length)" with "-O1 -fvariable-expansion-in-unroller -funroll-loops"
--- Comment #3 from stevenb dot gcc at gmail dot com 2010-01-08 07:31 --- Subject: Re: "-fcompare-debug failure (length)" with "-O1 -fvariable-expansion-in-unroller -funroll-loops" > --- Comment #2 from aoliva at gcc dot gnu dot org 2010-01-08 07:10 > --- > Taking this over, I hope stevenb doesn't mind I got a patch. Not at all. You may also want to look at bug 42642. It's another case where -fno-web helps "fix" a -fcompare-debug issue. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42630
[Bug target/40730] redundant memory load
--- Comment #11 from stevenb dot gcc at gmail dot com 2010-01-11 08:22 --- Subject: Re: redundant memory load

On Mon, Jan 11, 2010 at 7:47 AM, carrot at google dot com wrote:
>> iterate:
>>         push {lr}
>>         ldr r3, [r1]
>> .L6:
>>         str r3, [r0]
>>         sub r2, r3, #0
>>         bne .L5
>>         b .L3
>> .L4:
>>         ldr r3, [r3, #8]
>>         b .L6
>> .L5:
>>         ldr r1, [r3, #4]
>>         cmp r1, #0
>>         beq .L4
>> .L3:
>>         str r2, [r0, #12]
>>         @ sp needed for prologue
>>         pop {pc}
>>
>> Carrot, could you please double-check that this is still correct code?
>
> Yes, it is correct.
> There are still 13 instructions, I think it is related to unoptimized basic
> block order.

Yes, I would have expected the block starting with .L4 to be *after* the block starting with .L5, something like so:

    iterate:
            push {lr}
            ldr r3, [r1]
    .L6:
            str r3, [r0]
            sub r2, r3, #0
            beq .L3
    .L5:
            ldr r1, [r3, #4]
            cmp r1, #0
            bne .L3
            ldr r3, [r3, #8]
            b .L6
    .L3:
            str r2, [r0, #12]
            @ sp needed for prologue
            pop {pc}

Does that look correct? And if so, could you see if there is an open bug report about this; or otherwise file a new PR and add me to the CC-list? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40730
[Bug target/43729] Mach-O LTO support needed for darwin
--- Comment #26 from stevenb dot gcc at gmail dot com 2010-05-01 22:30 --- Subject: Re: Mach-O LTO support needed for darwin > Do you mean the errors which have "symbol xxx can't be undefined in a > subtraction expression"? Yes, exactly those. > A google shows this to look like that discussed > here... > > http://gcc.gnu.org/ml/gcc-bugs/2003-11/msg01552.html > > which is apparently PR10901. On gpc seems to have been worked around with a > --longjmp-all-nonlocal-labels option? I don't think that's the same problem. With LTO it happens for regular variables. But perhaps the problems are related. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43729
[Bug tree-optimization/19097] [4.1/4.2/4.3 regression] Quadratic behavior with many sets for the same register in VRP
--- Comment #45 from stevenb dot gcc at gmail dot com 2007-11-11 09:23 --- Subject: Re: [4.1/4.2/4.3 regression] Quadratic behavior with many sets for the same register in VRP Because it costs more than it brings: compile time on average goes _up_ with that patch. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19097
[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions
--- Comment #37 from stevenb dot gcc at gmail dot com 2007-12-17 16:55 --- Subject: Re: [4.3 regression] bad interaction between DF and SJLJ exceptions Compiling with checking disabled might give a less unfair comparison. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400
[Bug middle-end/34400] [4.3 regression] bad interaction between DF and SJLJ exceptions
--- Comment #43 from stevenb dot gcc at gmail dot com 2007-12-20 06:15 --- Subject: Re: [4.3 regression] bad interaction between DF and SJLJ exceptions I did not mean more bitmaps but more elements per bitmap, obviously. I know the effect of the patch, or I wouldn't have written it ;-) I tried to add something like df_hack2 to the reaching defs problem, but I didn't succeed the first time. It is indeed harder. If you could work on it, that would be terrific. I will work on some tools to investigate DF memory usage. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34400
[Bug tree-optimization/26854] Inordinate compile times on large routines
--- Comment #44 from stevenb dot gcc at gmail dot com 2007-12-20 15:08 --- Subject: Re: Inordinate compile times on large routines

On 20 Dec 2007 14:49:12 -, zadeck at naturalbridge dot com wrote:
> --- Comment #43 from zadeck at naturalbridge dot com 2007-12-20 14:49 ---
> Subject: Re: Inordinate compile times on large routines
>
> lucier at math dot purdue dot edu wrote:
> > --- Comment #42 from lucier at math dot purdue dot edu 2007-12-20 03:52 ---
> > Created an attachment (id=14799)
> > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14799&action=view)
> > memory details for an unpatched mainline
> >
> > Here is the same information without Steven's two patches for mainline.
>
> Could you add the attached patch in and rerun your example?
>
> It will add 4 lines to indicate what kinds of def-use and use-def chains
> are being created.
> A lot of the space is being used by these chains and I want to find out
> how many of those chains are for artificial uses and defs.
>
> thanks
>
> kenny

> struct df_link *
> df_chain_create (struct df_ref *src, struct df_ref *dst)
> {
>    struct df_link *head = DF_REF_CHAIN (src);
> -  struct df_link *link = pool_alloc (df_chain->block_pool);;
> +  struct df_link *link = pool_alloc (df_chain->block_pool);
> +  int index = 0;
> +
> +  if (!src->insn)
> +    index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
> +  if (!dst->insn)
> +    index += (src->type == DF_REF_REG_DEF) ? 2 : 1;
> +
> +  df_chain_counters[index]++;

Watch for segfaults. Index will be 1, 2, 3, or 4. df_chain_counters[4] does not exist. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26854
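
The out-of-bounds concern raised here can be checked with a small standalone sketch. The types below are simplified, hypothetical stand-ins for GCC's `struct df_ref` (a `has_insn` flag replaces the `insn` pointer), and the sketch assumes, as the comment implies, that the counter array in the quoted patch has only four elements. The index computation is copied from the quoted patch as written, including its test of `src->type` in both branches.

```c
#include <assert.h>

/* Simplified, hypothetical stand-ins for GCC's df_ref. */
enum ref_type { REF_REG_DEF, REF_REG_USE };

struct ref {
  int has_insn;          /* 0 models an artificial ref (insn == NULL) */
  enum ref_type type;
};

/* Same arithmetic as the quoted patch: each artificial end of the
   chain adds 1 or 2 to the counter index. */
static int chain_counter_index(const struct ref *src, const struct ref *dst)
{
  int index = 0;
  if (!src->has_insn)
    index += (src->type == REF_REG_DEF) ? 2 : 1;
  if (!dst->has_insn)
    index += (src->type == REF_REG_DEF) ? 2 : 1;
  return index;
}

/* Enumerate every src/dst combination and return the largest index. */
static int max_chain_counter_index(void)
{
  int max = 0;
  for (int s_insn = 0; s_insn <= 1; s_insn++)
    for (int d_insn = 0; d_insn <= 1; d_insn++)
      for (int s_type = 0; s_type <= 1; s_type++)
        for (int d_type = 0; d_type <= 1; d_type++) {
          struct ref src = { s_insn, (enum ref_type) s_type };
          struct ref dst = { d_insn, (enum ref_type) d_type };
          int idx = chain_counter_index(&src, &dst);
          if (idx > max)
            max = idx;
        }
  return max;
}

/* Five slots cover indices 0..4; four would overflow at index 4. */
#define N_CHAIN_COUNTERS 5
static unsigned long chain_counters[N_CHAIN_COUNTERS];

/* Bounds-checked version of the counting statement in the patch. */
static void count_chain(const struct ref *src, const struct ref *dst)
{
  int idx = chain_counter_index(src, dst);
  assert(idx >= 0 && idx < N_CHAIN_COUNTERS);
  chain_counters[idx]++;
}
```

The enumeration confirms that the largest index is 4, reached when both ends of the chain are artificial and the source is a def, so a four-element array would be written one past its end in that case.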
[Bug middle-end/30905] [4.3 Regression] Fails to cross-jump
--- Comment #12 from stevenb dot gcc at gmail dot com 2008-01-11 13:48 --- Subject: Re: [4.3 Regression] Fails to cross-jump Richi, could you commit it for me? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30905
[Bug rtl-optimization/29840] [4.3 Regression] build/genconditions ../../gcc/gcc/config/pa/pa.md > tmp-condmd.c: /bin/sh: 13354 Memory fault(coredump)
--- Comment #18 from stevenb dot gcc at gmail dot com 2006-11-26 09:19 --- Subject: Re: [4.3 Regression] build/genconditions ../../gcc/gcc/config/pa/pa.md > tmp-condmd.c: /bin/sh: 13354 Memory fault(coredump)

Just adding DF_HARD_REGS is not enough. At least this bit:

    -  if (use)
    +  if (use && !HARD_REGISTER_P (use->reg))

is also necessary. You can reproduce the problem with a cross-compiler BTW. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29840
[Bug fortran/29635] debug info of modules
--- Comment #2 from stevenb dot gcc at gmail dot com 2007-01-02 15:27 --- Subject: Re: debug info of modules I'm waiting for my gdb assignment to be finished. This will probably be work for Q2 2007. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29635