Re: [PATCH] lower-subreg and IBM long double

2013-08-20 Thread Steven Bosscher
On Mon, Aug 19, 2013 at 8:47 AM, Alan Modra wrote: > gcc/ ... > * doc/tm.texi.in (TARGET_INIT_LOWER_SUBREG): Document. ChangeLog update needed here, the new hook is documented in target.def. > +static void > +rs6000_init_lower_subreg (void *data) > +{ Functions should start with a leadi

Re: [PATCH 1/2] Convert symtab, cgraph and varpool nodes into a real class hierarchy

2013-08-20 Thread Steven Bosscher
On Tue, Aug 20, 2013 at 11:06 PM, Jan Hubicka wrote: >> +/* GTY((user)) hooks for symtab_node_base (and its subclasses). >> + We could use virtual functions for this, but given the presence of the >> + "type" field and the trivial size of the class hierarchy, switches are >> + perhaps simple

Re: [PATCH] Rerun df_analyze after delete_unmarked_insns during DCE

2013-08-21 Thread Steven Bosscher
On Wed, Aug 21, 2013 at 5:10 PM, Jeff Law wrote: > On 08/21/2013 08:25 AM, David Edelsohn wrote: >> >> This patch has caused a bootstrap failure for powerpc-aix and probably >> powerpc64-linux. GCC segfaults and core dumps during stage2 >> configure. >> >> The motivation for this patch seems faul

Re: [PATCH]: Fix PR middle-end/56382 -- Only move MODE_COMPLEX_FLOAT by parts if we can create pseudos

2013-08-24 Thread Steven Bosscher
On Fri, Aug 23, 2013 at 2:47 AM, John David Anglin wrote: > Ping. > > > On 28-Jul-13, at 12:17 PM, John David Anglin wrote: > >> This patch fixes PR middle-end/56382 on hppa64-hp-hpux11.11. The patch >> prevents moving a complex float by parts if we can't >> create pseudos. On a big endian 64-bit

Re: [PATCH 6/6] Add manual GTY hooks

2013-08-29 Thread Steven Bosscher
On Thu, Aug 29, 2013 at 6:20 PM, David Malcolm wrote: > * gimple.c (gt_ggc_mx (gimple)): New, as required by GTY((user)). > (gt_pch_nx (gimple)): Likewise. GIMPLE isn't supposed to end up in a PCH. Can you please make this function simply call gcc_unreachable()? FWIW 1: I really

Re: [PATCH]: Fix PR middle-end/56382 -- Only move MODE_COMPLEX_FLOAT by parts if we can create pseudos

2013-08-29 Thread Steven Bosscher
On Thu, Aug 29, 2013 at 2:14 AM, John David Anglin wrote: > As expected, your patch doesn't fix the PR. Hmm, unfortunate. The reason why I proposed it is because it would revert to the way this code worked before http://gcc.gnu.org/r104371 The idea was to make "force" false, and let the normal ba

Re: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-09-04 Thread Steven Bosscher
On Wed, Sep 4, 2013 at 10:58 AM, Alexander Monakov wrote: > Hello, > > Could you use the existing facilities instead, such as adjust_priority hook, > or making the compare-branch insn sequence a SCHED_GROUP? Or a define_bypass? Ciao! Steven

Re: RFC - Next refactoring steps

2013-09-05 Thread Steven Bosscher
On Thu, Sep 5, 2013 at 5:47 PM, Andrew MacLeod wrote: > ok, so to dwell on header file cleanup. When creating new header files for > say, tree-ssa-ter.h, what other include files should we make assumptions > have already been included... or should we make none? > For instance, the header files

Re: [PATCH] C++-ify and move control dependence code

2013-09-05 Thread Steven Bosscher
On Thu, Sep 5, 2013 at 4:05 PM, Richard Biener wrote: > > This C++-ifies and moves the control dependence code from tree-ssa-dce.c > to cfganal.c as I am about to re-use that code from loop distribution. I'd recommend re-implementing the control dependence code, then. The current implementation is

Re: RFC - Next refactoring steps

2013-09-05 Thread Steven Bosscher
On Fri, Sep 6, 2013 at 12:22 AM, Andrew MacLeod wrote: > Or are you suggesting that coretypes.h is a file we can assume is available? > that seems like a bit of a slippery slope, but we could pick a few. I > prefer it be explicit myself. coretypes.h is available. Why do you think that's a slipper

Re: RFA: Fix mark_target_live_regs to take COND_EXEC into account

2013-09-06 Thread Steven Bosscher
On Fri, Sep 6, 2013 at 12:20 PM, Joern Rennecke wrote: > We found that > > std::basic_string<char, std::char_traits<char>, std::allocator<char> >::copy(char*, unsigned long, unsigned long) const > > got miscompiled for ARC because reorg thought that all call-clobbered > registers were dead after a conditional call. Hmm, interestin

Re: RFC - Next refactoring steps

2013-09-06 Thread Steven Bosscher
On Fri, Sep 6, 2013 at 5:21 PM, Andrew MacLeod wrote: >> hackery in some headers will suddenly break (that is, change outcome) >> if you include for example >> tm.h before or after it. >> > these would be really good to identify and fix, if possible. (surely they > can be fixed.. :-)if they ca

Re: RFA: Fix mark_referenced_resources to handle COND_EXEC

2013-09-06 Thread Steven Bosscher
On Fri, Sep 6, 2013 at 12:22 PM, Joern Rennecke wrote: > 2013-04-30 Joern Rennecke <...> > > * resource.c (mark_referenced_resources): Handle COND_EXEC. This is OK. Ciao! Steven

Re: [PATCH] C++-ify and move control dependence code

2013-09-06 Thread Steven Bosscher
On Fri, Sep 6, 2013 at 8:41 AM, Richard Biener wrote: >> I'd recommend re-implementing the control dependence code, then. The >> current implementation is basically taken from old RTL-SSA dce.c and >> uses a now old-fashioned view of the CFG, e.g. using edge lists. >> You're probably better off sta

Re: RFA: Fix debug-insn sensitivity in RA

2013-09-07 Thread Steven Bosscher
On Sat, Sep 7, 2013 at 11:14 AM, Richard Sandiford wrote: > The problem seems to be split across IRA and LRA. In IRA we have: > > FOR_EACH_BB (bb) > FOR_BB_INSNS (bb, insn) > { > if (! INSN_P (insn)) > continue; > for_each_rtx (&insn, set_paradoxical_subreg, (

Re: RFA: Fix debug-insn sensitivity in RA

2013-09-07 Thread Steven Bosscher
On Sat, Sep 7, 2013 at 1:37 PM, Richard Sandiford wrote: > Steven Bosscher writes: >> Can you please add a test case? > > What kind of test would you suggest? Do we have a harness for testing > that -O2 and -O2 -g .text output is identical? Not .text, but the assembly outpu

Re: [PATCH][RFC] Move IVOPTs closer to RTL expansion

2013-09-09 Thread Steven Bosscher
On Mon, Sep 9, 2013 at 10:01 AM, Richard Biener wrote: >> >> First, the loop passes that at the moment precede IVOPTs leave >> >> around IL that is in desperate need of basic re-optimization >> >> like CSE, constant propagation and DCE. That puts extra load >> >> on IVOPTs and its cost model, inc

Re: RFA: Store the REG_BR_PROB probability directly as an int

2013-09-22 Thread Steven Bosscher
Hello Richard, Not directly related to your patch but... On Sun, Sep 22, 2013 at 12:54 PM, Richard Sandiford wrote: > @@ -588,14 +589,17 @@ cond_exec_process_if_block (ce_if_block_ > goto fail; > #endif > > - true_prob_val = find_reg_note (BB_END (test_bb), REG_BR_PROB, NULL_RTX); > - if

Re: [PATCH, LRA] Remove REG_DEAD and REG_UNUSED notes.

2013-09-24 Thread Steven Bosscher
On Tue, Sep 24, 2013 at 5:03 PM, Eric Botcazou wrote: >> This patch removes REG_DEAD and REG_UNUSED notes in update_inc_notes, >> as it is what the function is supposed to do (see the comments) and as >> keeping these notes produce some failures, at least on ARM. > > The description is too terse.

Re: [PATCH, LRA] Remove REG_DEAD and REG_UNUSED notes.

2013-09-25 Thread Steven Bosscher
On Wed, Sep 25, 2013 at 4:55 PM, Vladimir Makarov wrote: > On 09/24/2013 03:40 PM, Mike Stump wrote: >> On Sep 24, 2013, at 12:23 PM, Steven Bosscher wrote: >>> On Tue, Sep 24, 2013 at 5:03 PM, Eric Botcazou wrote: >>>>> This patch removes REG_DEAD and REG_

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-28 Thread Steven Bosscher
On Fri, Sep 28, 2012 at 12:56 AM, Vladimir Makarov wrote: > Any comments and proposals are appreciated. Even if GCC community > decides that it is too late to submit it to gcc4.8, the earlier reviews > are always useful. I would like to see some benchmark numbers, both for code quality and com

Re: [PATCH RFA] Implement register pressure directed hoist pass

2012-09-28 Thread Steven Bosscher
On Fri, Sep 28, 2012 at 9:18 AM, Bin Cheng wrote: > (get_regno_pressure_class, get_pressure_class_and_nregs) Broken long lines in a ChangeLog entry end with a ",". > (change_pressure, mark_regno_live, mark_regno_death, mark_reg_death) > (mark_reg_store, mark_reg_clobber,

Re: RFC: LRA for x86/x86-64 [2/9]

2012-09-28 Thread Steven Bosscher
On Fri, Sep 28, 2012 at 12:57 AM, Vladimir Makarov wrote: > LRA outputs a lot of debug information about insns. I found that using slim > insn/rtl presentation helps a lot for LRA debugging. The following patch > makes slim presentation printing functions visible to LRA. It also > implements one mor

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-28 Thread Steven Bosscher
On Fri, Sep 28, 2012 at 5:21 PM, Vladimir Makarov wrote: > On 12-09-28 4:21 AM, Steven Bosscher wrote: >> >> On Fri, Sep 28, 2012 at 12:56 AM, Vladimir Makarov >> wrote: >>> >>>Any comments and proposals are appreciated. Even if GCC community >>

Re: Profile housekeeping 3/4 (call-cddce fix)

2012-09-28 Thread Steven Bosscher
On Fri, Sep 28, 2012 at 10:43 PM, Jan Hubicka wrote: > Hi, > shrink_wrap_one_built_in_call forgets to update counts. Hi, Can you look at the one-liner from http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00794.html too, please? The patch there is this: Index: tree-ssa-tail-merge.c ===

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-29 Thread Steven Bosscher
Hi Vlad, Thanks for the testing and the logs. You must have good hardware, your timings are all ~3 times faster than mine :-) On Sat, Sep 29, 2012 at 3:01 AM, Vladimir Makarov wrote: > --32-bit > Reload: > 581.85user 29.91system

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-30 Thread Steven Bosscher
On Sun, Sep 30, 2012 at 6:01 PM, Richard Guenther wrote: >>> --64-bit:--- >>> Reload: >>> 503.26user 36.54system 30:16.62elapsed 29%CPU (0avgtext+0avgdata >>> LRA: >>> 598.70user 30.90system 27:26.92elapsed 38%CPU (0avgtext+0avgdata >

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-30 Thread Steven Bosscher
Hi, To look at it in yet another way: > integrated RA : 189.34 (16%) usr > LRA non-specific: 59.82 ( 5%) usr > LRA virtuals elimination: 56.79 ( 5%) usr > LRA create live ranges : 175.30 (15%) usr > LRA hard reg assignment : 130.85 (11%) usr The IRA pass is slower tha

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-30 Thread Steven Bosscher
On Sun, Sep 30, 2012 at 7:03 PM, Richard Guenther wrote: > On Sun, Sep 30, 2012 at 6:52 PM, Steven Bosscher > wrote: >> Hi, >> >> >> To look at it in yet another way: >> >>> integrated RA : 189.34 (16%) usr >>> LRA no

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-30 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 12:44 AM, Vladimir Makarov wrote: > Actually, I don't see there is a problem with LRA right now. I think we > should first to solve a whole compiler memory footprint problem for this > test because cpu utilization is very small for this test. On my machine > with 8GB, th

Re: RFC: LRA for x86/x86-64 [0/9]

2012-09-30 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 12:50 AM, Vladimir Makarov wrote: > As I wrote, I don't see that LRA has a problem right now because even on > 8GB machine, GCC with LRA is 10% faster than GCC with reload with real time > point of view (not saying that LRA generates 15% smaller code). And real > time is

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
[ Sorry for re-send, it seems that mobile gmail sends text/html and the sourceware mailer daemon rejects that. ] On Monday, October 1, 2012, Jakub Jelinek wrote: > On Sun, Sep 30, 2012 at 06:50:50PM -0400, Vladimir Makarov wrote: > > I think this testcase shouldn't be a show stopper for LRA inclu

Re: [PATCH RFA] Implement register pressure directed hoist pass

2012-10-01 Thread Steven Bosscher
On Sat, Sep 29, 2012 at 8:37 AM, Bin Cheng wrote: > This is the updated patch according to your comments. Please review. > I also re-collected code size data and found it is improved by about 0.24% > for mips, which is better than previous data. I believe this should be > caused by recent changes

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 9:16 AM, Jakub Jelinek wrote: > On Mon, Oct 01, 2012 at 08:47:13AM +0200, Steven Bosscher wrote: >> The test case compiles just fine at -O2, only VRP has trouble with it. >> Let's try to stick with facts, not speculation. > > I was talking about th

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 11:52 AM, Richard Guenther wrote: >> I think this testcase shouldn't be a show stopper for LRA inclusion into >> 4.8, but something to look at for stage3. > > I agree here. I would also agree if it were not for the fact that IRA is already a scalability bottle-neck and that

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Sun, Sep 30, 2012 at 7:03 PM, Richard Guenther wrote: > On Sun, Sep 30, 2012 at 6:52 PM, Steven Bosscher > wrote: >> Hi, >> >> >> To look at it in yet another way: >> >>> integrated RA : 189.34 (16%) usr >>> LRA no

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 12:14 PM, Jakub Jelinek wrote: > On Mon, Oct 01, 2012 at 12:01:36PM +0200, Steven Bosscher wrote: >> I would also agree if it were not for the fact that IRA is already a >> scalability bottle-neck and that has been known for a long time, too. >> I have

[patch][lra] Use XNEWVEC and friends instead of xmalloc/xrealloc, and add some timevars

2012-10-01 Thread Steven Bosscher
Hello, This patch uses the libiberty new-like operators instead of using xmalloc/xrealloc. It also adds timevars for the main LRA phases, and it fixes a warning suggesting a space before a ';' in an only-looping for loop. Bootstrapped (lra-branch, of course) on x86_64-unknown-linux-gnu. OK? Ciao
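For readers who have not met the libiberty "new-like" allocation macros, here is a minimal sketch of the idea. `XNEWVEC` and `XRESIZEVEC` are real libiberty macros; the `xmalloc`/`xrealloc` bodies below are simplified stand-ins (an assumption, so that the sketch compiles on its own), and `make_and_grow` is a hypothetical helper that is not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-ins for libiberty's xmalloc/xrealloc (assumption):
   like the real ones, they never return NULL.  */
static void *
xmalloc (size_t size)
{
  void *p = malloc (size ? size : 1);
  if (!p)
    {
      fputs ("out of memory\n", stderr);
      abort ();
    }
  return p;
}

static void *
xrealloc (void *old, size_t size)
{
  void *p = realloc (old, size ? size : 1);
  if (!p)
    {
      fputs ("out of memory\n", stderr);
      abort ();
    }
  return p;
}

/* Typed wrappers in the style of libiberty.h: the element type is named
   once, so the cast and the sizeof cannot drift apart.  */
#define XNEWVEC(T, N)       ((T *) xmalloc (sizeof (T) * (N)))
#define XRESIZEVEC(T, P, N) ((T *) xrealloc ((void *) (P), sizeof (T) * (N)))

/* Hypothetical helper: allocate N ints, then grow the vector to 2*N
   elements while keeping the first N values.  */
static int *
make_and_grow (size_t n)
{
  assert (n > 0);
  int *v = XNEWVEC (int, n);
  for (size_t i = 0; i < n; i++)
    v[i] = (int) i;
  v = XRESIZEVEC (int, v, 2 * n);
  for (size_t i = n; i < 2 * n; i++)
    v[i] = (int) i;
  return v;
}
```

The point of the macros is readability and type safety at the call site, not a change in allocation behavior.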

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 12:10 PM, Steven Bosscher wrote: > The " LRA create live range" time is mostly spent in merge_live_ranges > walking lists. Hmm no, that's just gcc17's ancient debugger telling me lies. lra_live_range_in_p is not even used. /me upgrades to so

[patch] experimenting with renumbering of pseudos after expand

2012-10-01 Thread Steven Bosscher
Hello, For most code, expand creates a lot of pseudos that are cleaned up in subsequent passes, if they even live long enough to make it there. On average, for cc1 preprocessed source, the number of "holes" in regno_reg_rtx is about half the size of that array, or in other words: regno_reg_rtx is

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Sat, Sep 29, 2012 at 10:26 PM, Steven Bosscher wrote: > LRA create live ranges : 175.30 (15%) usr 2.14 (13%) sys 177.44 > (15%) wall2761 kB ( 0%) ggc I've tried to split this up a bit more: process_bb_lives ~50% create_start_finish

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 9:19 PM, David Miller wrote: > From: Ian Lance Taylor > Date: Mon, 1 Oct 2012 11:55:56 -0700 > >> Steven is correct in saying that there is a tendency to move on and >> never address GCC bugs. However, there is also a counter-vailing >> tendency to fix GCC bugs. Anyhow I'

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-01 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 9:51 PM, Vladimir Makarov wrote: >> I think it's more important in this case to recognize Steven's real >> point, which is that for an identical situation (IRA), and with an >> identical patch author, we had similar bugs. They were promised to be >> worked on, and yet some

[patch][lra] a few bitmap obstacks for lra-assigns

2012-10-01 Thread Steven Bosscher
Hello, This eliminates a few large loops in lra-assigns.c. They're not the most costly loops but the life times of the bitmaps is well-defined and destroying a bitmap obstack is much cheaper than looping over all bitmaps calling bitmap_clear. The saving is small but you have to start somewhere...
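The trade-off described above (one bulk release instead of clearing many objects one by one) can be illustrated with a toy arena allocator. This is only a sketch of the general technique under stated assumptions: `struct arena`, `arena_init`, `arena_alloc` and `arena_release` are hypothetical names, and the code does not reproduce GCC's bitmap obstack implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy arena (hypothetical): every allocation is carved out of one
   malloc'ed buffer, so all of them can be released with one free.  */
struct arena
{
  unsigned char *base;
  size_t used;
  size_t cap;
};

static void
arena_init (struct arena *a, size_t cap)
{
  a->base = malloc (cap ? cap : 1);
  assert (a->base != NULL);
  a->used = 0;
  a->cap = cap;
}

/* Bump-pointer allocation; returns NULL when the arena is exhausted.  */
static void *
arena_alloc (struct arena *a, size_t size)
{
  size_t align = sizeof (void *);
  size = (size + align - 1) & ~(align - 1);   /* round up to pointer size */
  if (size > a->cap - a->used)
    return NULL;
  void *p = a->base + a->used;
  a->used += size;
  return p;
}

/* Release everything at once: the cost does not depend on how many
   objects were allocated.  Returns the number of bytes that were in use.  */
static size_t
arena_release (struct arena *a)
{
  size_t freed = a->used;
  free (a->base);
  a->base = NULL;
  a->used = 0;
  a->cap = 0;
  return freed;
}
```

This only pays off when, as the message says, the lifetimes of all objects in the arena are well-defined and end together.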

[patch][lra] Comment typo fix

2012-10-01 Thread Steven Bosscher
I suppose no-one would object if I commit this as obvious at some point? Index: lra-constraints.c === --- lra-constraints.c (revision 191858) +++ lra-constraints.c (working copy) @@ -4293,7 +4293,7 @@ update_ebb_live_info (rtx hea

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Steven Bosscher
On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov wrote: > My experience shows that these lists are usually 1-2 elements. Although in > this case, there are pseudos with huge number elements (hundreds). I tried > -fweb for this tests because it can decrease the number elements but GCC (I > don'

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Steven Bosscher
On Tue, Oct 2, 2012 at 10:29 AM, Paolo Bonzini wrote: > On 02/10/2012 09:28, Steven Bosscher wrote: >>> My experience shows that these lists are usually 1-2 elements. Although in >>> > this case, there are pseudos with huge number elements (hundreds). I >>

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 1:11 AM, Steven Bosscher wrote: > On Mon, Oct 1, 2012 at 12:44 AM, Vladimir Makarov wrote: >> Actually, I don't see there is a problem with LRA right now. I think we >> should first to solve a whole compiler memory footprint problem for this

Re: RFC: LRA for x86/x86-64 [7/9]

2012-10-03 Thread Steven Bosscher
On Tue, Oct 2, 2012 at 3:42 PM, Richard Sandiford wrote: > >> +/* Compress pseudo live ranges by removing program points where >> + nothing happens. Complexity of many algorithms in LRA is linear >> + function of program points number. To speed up the code we try to >> + minimize the numbe
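The quoted comment describes compressing live ranges by dropping program points where nothing happens. A deliberately simplified sketch of that idea follows. It is not the LRA implementation: `struct range` and `compress_points` are hypothetical, and the rule used here (keep only points where some range starts or finishes) is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical live range over inclusive program points.  */
struct range
{
  int start;
  int finish;
};

/* Renumber program points so that only points where some range starts
   or finishes survive.  The remapping is monotonic, so the relative
   order of ranges is preserved.  Returns the new number of points.  */
static int
compress_points (struct range *r, int nranges, int npoints)
{
  assert (npoints > 0);
  int *map = calloc ((size_t) npoints, sizeof (int));
  assert (map != NULL);

  /* Pass 1: mark the points where something happens.  */
  for (int i = 0; i < nranges; i++)
    {
      assert (0 <= r[i].start && r[i].start <= r[i].finish
              && r[i].finish < npoints);
      map[r[i].start] = 1;
      map[r[i].finish] = 1;
    }

  /* Pass 2: assign consecutive new numbers to the marked points.  */
  int next = 0;
  for (int p = 0; p < npoints; p++)
    map[p] = map[p] ? next++ : -1;

  /* Pass 3: rewrite the ranges in terms of the new numbering.  */
  for (int i = 0; i < nranges; i++)
    {
      r[i].start = map[r[i].start];
      r[i].finish = map[r[i].finish];
    }

  free (map);
  return next;
}
```

With fewer points, any later algorithm whose cost is linear in the number of program points does proportionally less work, which is the motivation given in the comment.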

Re: [patch][lra] Use XNEWVEC and friends instead of xmalloc/xrealloc, and add some timevars

2012-10-03 Thread Steven Bosscher
On Mon, Oct 1, 2012 at 1:05 PM, Steven Bosscher wrote: > Hello, > > This patch uses the libiberty new-like operators instead of using > xmalloc/xrealloc. > It also adds timevars for the main LRA phases, and it fixes a warning > suggesting a space before a ';'

[patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-03 Thread Steven Bosscher
On Wed, Oct 3, 2012 at 4:56 PM, Vladimir Makarov wrote: > On 12-10-03 3:13 AM, Steven Bosscher wrote: >> >> On Tue, Oct 2, 2012 at 3:42 PM, Richard Sandiford >> wrote: >>>> >>>> +/* Compress pseudo live ranges by removing program points where

Re: Propagate profile counts during switch expansion

2012-10-03 Thread Steven Bosscher
On Wed, Oct 3, 2012 at 6:12 PM, Xinliang David Li wrote: > What is the status of switch expansion GIMPLE rewrite? If it is not > planned for 4.8, It will be desirable to include this fix into trunk. I could work on it for GCC 4.8 (there's not a lot of work left to be done for it now) but we haven

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-03 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 5:30 AM, Vladimir Makarov wrote: > I was going to look at this code too but I was interested in generating > fewer points and live ranges. It is strange that in my profiles, > remove_some_program_points_and_update_live_ranges takes 0.6% of compiler > time on these huge t

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 8:57 AM, Steven Bosscher wrote: > On Thu, Oct 4, 2012 at 5:30 AM, Vladimir Makarov wrote: >> I was going to look at this code too but I was interested in generating >> fewer points and live ranges. It is strange that

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Wed, Oct 3, 2012 at 5:35 PM, Steven Bosscher wrote: > The "worst" result is this: > Compressing live ranges: from 726174 to 64496 - 8%, pre_count 40476128, > post_count 12483414 > > But that's still a lot better than before the patch for the same function:

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 1:30 PM, Richard Guenther wrote: > Isn't _REVERSE vs. non-_REVERSE still kind-of "random" order? Not at this stage. For cfglayout mode I would answer yes, but IRA/LRA operates in cfgrtl mode, so the sequence of insns and basic blocks must match. Therefore, if you walk the b

Re: Profile housekeeping 6/n (-fprofile-consistency-report)

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 4:01 PM, Jan Hubicka wrote: > * doc/invoke.texi (-fprofile-consistency-report): Document. > * common.opt (fprofile-consistency-report): New. > * toplev.h (dump_profile_consistency_report): Declare. > * toplev.c (finalize): Call dump_profile_con

Re: [PATCH] Improve var-tracking memory disambiguation with frame pointer (PR debug/54796)

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 6:33 PM, Jakub Jelinek wrote: > This patch fixes a few FAILs in the ix86 guality testsuite (mainly -Os), > by better disambiguating sp based VALUEs (which usually have no MEM_EXPR > and thus the alias Oracle can't be used for them) from frame pointer > based ones or global va

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 5:31 PM, Vladimir Makarov wrote: > > Wow. I did not have such effect. What machine do you use? I do all my testing on gcc17. Ciao! Steven

Re: [patch][lra] Improve initial program point density in lra-lives.c

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 6:12 PM, Vladimir Makarov wrote: >>> 0.6% sounds really very different from my timings. How much time does >>> create_start_finish_chains take for you? >>> >> 0.65% (2.78s). >> >> Actually, I have a profile but I am not sure now that it is for PR54146. >> It might be for PR2

Re: [lra] patch to solve most scalability problems for LRA

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 5:37 PM, Vladimir Makarov wrote: > The only issue now is PR54146 compilation time for IRA+LRA although it > was improved significantly. I will continue work on PR54146. But now I > am going to focus on proposals from reviews. Right, there still are opportunities to impr

Re: [patch][lra] Improve initial program point density in lra-lives.c

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 6:56 PM, Steven Bosscher wrote: > "Crude, but efficient" (tm) :-) BTW with a similar approach I also time other bits of process_bb_lives: timevar_push (TV_HOIST); /* See if we'll need an increment at the end of this basic block. An incremen

Re: [Patch, Fortran] Fix some memory leaks

2012-10-04 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 7:06 PM, Tobias Burnus wrote: > This patch fixes some memory leaks and other issues found by > http://scan5.coverity.com. > > Build and regtested on x86-64-linux. > OK for the trunk? Yes, thanks for plugging these! Some of them have been there since day 0 :-) Ciao! Steven

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-04 Thread Steven Bosscher
On Sat, Sep 29, 2012 at 10:26 PM, Steven Bosscher wrote: > To put it in another perspective, here are my timings of trunk vs lra > (both checkouts done today): > > trunk: > integrated RA : 181.68 (24%) usr 1.68 (11%) sys 183.43 > (24%) wall 643564 kB (

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-04 Thread Steven Bosscher
On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov wrote: > Analogous live ranges are used in IRA as an intermediate step to build a > conflict graph. Right, ira-lives.c and lra-lives.c look very much alike; the only major difference is that the object of interest in an IRA live range is an ira_obje

Re: Use conditional casting with symtab_node

2012-10-05 Thread Steven Bosscher
On Fri, Oct 5, 2012 at 2:43 PM, Diego Novillo wrote: > Because (...) there has been so much > negative pressure on our work, that we sometimes try to find some > benefit when reality may provide neutral results. When people say "your work sucks", they probably don't mean to apply negative pressure

Re: [PATCH] Fix PR54489 - FRE needing AVAIL_OUT

2012-10-05 Thread Steven Bosscher
On Fri, Sep 14, 2012 at 2:26 PM, Richard Guenther wrote: > If you can figure out a better name for the function we should > probably move it to cfganal.c It looks like my previous e-mail about this got lost somehow, so retry: Your my_rev_post_order_compute is simply inverted_post

Re: [patch][lra] Improve initial program point density in lra-lives.c (was: Re: RFC: LRA for x86/x86-64 [7/9])

2012-10-05 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 2:59 PM, Steven Bosscher wrote: > On Thu, Oct 4, 2012 at 1:30 PM, Richard Guenther > wrote: >> Isn't _REVERSE vs. non-_REVERSE still kind-of "random" order? > > Not at this stage. For cfglayout mode I would answer yes, but IRA/LRA > oper

Re: Use conditional casting with symtab_node

2012-10-05 Thread Steven Bosscher
On Fri, Oct 5, 2012 at 11:50 PM, Lawrence Crowl wrote: > If no one cares about these time reports, then I will gladly stop > spending the effort to make them. It's not that no-one cares, I think, but the mathematics don't have to be so complicated. Just showing or saying there's no significant co

Re: *ping* [patch, libfortran] Fix PR 54736, memory corruption with GFORTRAN_CONVERT_UNIT

2012-10-06 Thread Steven Bosscher
On Sat, Oct 6, 2012 at 1:31 PM, Thomas Koenig wrote: > On 01.10.2012 20:34, Thomas Koenig wrote: >> >> Hello world, >> >> the previous version of the patch has an issue that Shane pointed >> out in the PR. This version should work; at least it survived >> all the test cases I could come up with

Re: Profile housekeeping 6/n (-fprofile-consistency-report)

2012-10-06 Thread Steven Bosscher
On Sat, Oct 6, 2012 at 4:10 PM, Jan Hubicka wrote: >> > Index: passes.c >> > +/* Hold statistic about profile consistency. */ >> ... >> >> I don't see why this should live in passes.c, can you please put it in >> a more logical place (profile.c, perhaps)? > > Hmm, the problem here is that the cod

Re: Profile housekeeping 6/n (-fprofile-consistency-report)

2012-10-06 Thread Steven Bosscher
On Sat, Oct 6, 2012 at 4:39 PM, Steven Bosscher wrote: >> If there are no complains I will commit the patch tomorrow. > > +1 complaint. > You're putting profile stuff and even RTL stuff in the pass manager. > That is Just Wrong. You already committed the patch. Your

Re: [lra] patch to speed more compilation of PR54146

2012-10-07 Thread Steven Bosscher
Hi Vlad, Thanks for working on this! > - EXECUTE_IF_SET_IN_BITMAP (reg_live_out, 0, j, bi) > -if (j >= FIRST_PSEUDO_REGISTER) > - mark_pseudo_live (j); > + EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi) > +mark_pseudo_live (j); FWIW, the above is optimized b
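The quoted hunk replaces a test inside the loop body with a start index handed to `EXECUTE_IF_SET_IN_BITMAP`. The two loop shapes can be compared on a toy 64-bit bitset, as sketched below. Everything in the sketch is hypothetical: `TOY_FIRST_PSEUDO` merely stands in for `FIRST_PSEUDO_REGISTER`, and a plain `uint64_t` replaces GCC's bitmap type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for FIRST_PSEUDO_REGISTER.  */
#define TOY_FIRST_PSEUDO 8

/* Shape 1: walk every bit and filter inside the loop body.  */
static int
count_filtered (uint64_t live)
{
  int n = 0;
  for (unsigned j = 0; j < 64; j++)
    if ((live >> j) & 1)
      if (j >= TOY_FIRST_PSEUDO)
        n++;
  return n;
}

/* Shape 2: start the walk at the first interesting bit, so the
   uninteresting low bits are never visited.  */
static int
count_from_start (uint64_t live)
{
  int n = 0;
  for (unsigned j = TOY_FIRST_PSEUDO; j < 64; j++)
    if ((live >> j) & 1)
      n++;
  return n;
}
```

Both functions compute the same count; the second form simply states the lower bound once instead of re-testing it for every set bit.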

[patch] Add option to compute "reaching and live definitions"

2012-10-07 Thread Steven Bosscher
Hello, The attached patch adds a DF changeable flag to compute a subset of reaching definitions that are also live at the program points they reach. This is an idea I discussed with Paolo many years ago already, but until today it hadn't really ever been close to the top of my todo list, but tryin

[patch][lra] Improve initial program point density in lra-lives.c (RFA)

2012-10-07 Thread Steven Bosscher
On Sat, Oct 6, 2012 at 4:52 AM, Vladimir Makarov wrote: >> Without this patch: >> Compressing live ranges: from 700458 to 391665 - 55%, pre_count >> 40730653, post_count 34363983 >> max per-reg pre_count 12978 (228090, 2 defs, 2 uses) (reg/f:DI 228090 >> [ SR.25009 ]) >> max per-reg post_count 1096

Re: handle isl and cloog in contrib/download_prerequisites

2012-10-07 Thread Steven Bosscher
On Sun, Oct 7, 2012 at 10:31 PM, Manuel López-Ibáñez wrote: > Since isl and cloog need to be > configured/build in a special way to work with gcc They do?? I built isl and cloog on a few compile farm machines without any special configure magic. What problems did you encounter? Ciao! Steven

Re: [lra] patch to speed more compilation of PR54146

2012-10-07 Thread Steven Bosscher
On Sun, Oct 7, 2012 at 5:59 PM, Vladimir Makarov wrote: > The following patch speeds LRA up more on PR54146. Below times for > compilation of the test on gcc17.fsffrance.org (an AMD machine): > > Before: > real=1214.71 user=1192.05 system=22.48 > After: > real=1144.37 user=1124.31 system=20.11 Hi

[lra] another patch to speed more compilation of PR54146

2012-10-07 Thread Steven Bosscher
Hello, This patch changes the worklist-like bitmap in lra_eliminate() to an sbitmap. Effect on compile time: lra r192183: LRA virtuals elimination: 51.56 ( 6%) with patch: LRA virtuals elimination: 14.02 ( 2%) OK for the branch after bootstrap&test on x86_64-unknown-linux-gnu? Ciao! Steven

Re: [lra] patch to speed more compilation of PR54146

2012-10-08 Thread Steven Bosscher
On Mon, Oct 8, 2012 at 10:18 AM, Jakub Jelinek wrote: >> > I'm playing with a patch to expand the insns_with_changed_offsets >> > bitmap to an sbitmap, and will send a patch if this works better. >> >> Or make insns_with_changed_offsets a VEC of insns (or a pointer-set). > > Or use temporarily som

Re: [lra] patch to speed more compilation of PR54146

2012-10-08 Thread Steven Bosscher
On Sun, Oct 7, 2012 at 5:59 PM, Vladimir Makarov wrote: > * lra-lives.c (lra_start_point_ranges, lra_finish_point_ranges): > Remove. > (process_bb_lives): Change start regno in > EXECUTE_IF_SET_IN_BITMAP. Iterate on DF_LR_IN (bb) instead of > pseudos_live_th

Re: [lra] another patch to speed more compilation of PR54146

2012-10-08 Thread Steven Bosscher
On Mon, Oct 8, 2012 at 1:00 AM, Steven Bosscher wrote: > Hello, > > This patch changes the worklist-like bitmap in lra_eliminate() to an > sbitmap. Effect on compile time: I have another patch to also make lra_constraint_insn_stack_bitmap. Without patch: log.0: LRA non-specific

Re: [patch] Add option to compute "reaching and live definitions"

2012-10-08 Thread Steven Bosscher
On Mon, Oct 8, 2012 at 3:27 PM, Paolo Bonzini wrote: > I wonder if we actually need the non-pruned version anywhere... I don't think so, but I'm not sure. Only ddg.c and loop-iv.c access the DF_RD results directly (i.e. not via DU/UD chains). For loop-iv the pruned version is fine. For ddg I didn'

[lra] 3rd patch to speed more compilation of PR54146

2012-10-08 Thread Steven Bosscher
Hello, This patch makes lra_constraint_insn_stack_bitmap an sbitmap. This reduces compile time by another minute or so on gcc17 for the test case of PR54146, and I think it's a general improvement also for less extreme code. For cc1-i files the compile time change tends to be a little less but tha

Re: [lra] another patch to speed more compilation of PR54146

2012-10-08 Thread Steven Bosscher
On Mon, Oct 8, 2012 at 10:25 PM, Vladimir Makarov wrote: > Actually I have a simpler and better patch: Ah, lra_insn_recog_data, I couldn't find out how to get the insn itself :-) The OOM you're seeing on gcc17 is probably because we're both working on that machine. If we're both trying to compi

Re: [lra] 3rd patch to speed more compilation of PR54146

2012-10-08 Thread Steven Bosscher
On Mon, Oct 8, 2012 at 10:26 PM, Vladimir Makarov wrote: > I am not a fan of sbitmap for regular use. Me neither, to be honest. (For the lra-eliminations.c bitmap it was a particularly bad choice :-) > This patch definitely helps for > this particular test. But it might hurt performance for s

Re: Profile housekeeping 6/n (-fprofile-consistency-report)

2012-10-08 Thread Steven Bosscher
On Sat, Oct 6, 2012 at 5:56 PM, Jan Hubicka wrote: > Hi, > does this look better? Moving to cfg.c would importing tree-pass.h and rtl.h > that is not cool either. predict.c does all of these. > Obviously can also go to a separate file, if preferred. Attached is how I would do it. What do you thin

[patch][IRA] Apply LRA lessons-learned to IRA

2012-10-09 Thread Steven Bosscher
Hello, For LRA, compressing the live range chains proved to be quite helpful. The same can be done for IRA, as in the attached patch. For the test case of PR54146 the effect is time spent in IRA cut in half: without patch: integrated RA : 206.35 (28%) with patch: integrated RA

Re: [lra] patch to solve most scalability problems for LRA

2012-10-10 Thread Steven Bosscher
On Thu, Oct 4, 2012 at 5:37 PM, Vladimir Makarov wrote: > The following patch solves most of LRA scalability problems. > > It switches on simpler algorithms in LRA. The first it switches off > trying to reassign hard registers to spilled pseudos (they usually for such > huge functions have lon

Re: [lra] patch to solve most scalability problems for LRA

2012-10-10 Thread Steven Bosscher
On Wed, Oct 10, 2012 at 10:14 PM, Vladimir Makarov wrote: > It is also interesting that your IRA range patch results in > different code generation (i can not explain it too now). I saw the same > on a small test (black jack playing and betting strategy). I haven't looked into this, but I'm gue

Re: [asan] New transitional branch to port ASAN to trunk

2012-10-10 Thread Steven Bosscher
On Wed, Oct 10, 2012 at 10:20 PM, Diego Novillo wrote:
> * tree-asan.c: New file.
> * tree-asan.h: New file.
Nit: do we still need the "tree-" prefix? IMHO not.
Ciao! Steven

Re: [asan] New transitional branch to port ASAN to trunk

2012-10-10 Thread Steven Bosscher
On Wed, Oct 10, 2012 at 11:00 PM, Xinliang David Li wrote:
> Is there an agreed way for file naming?
It was not my intent to start a bike shed discussion. This was just something I've been wondering for some time. But AFAIC it's up to Diego&co to do what they think is right :-)
Ciao! Steven

Re: [PATCH] Reduce conservativeness in REE using machine model (issue6631066)

2012-10-10 Thread Steven Bosscher
On Wed, Oct 10, 2012 at 11:25 PM, Teresa Johnson wrote: > What I did to address this is to call get_attr_mode from the machine model > to get the actual mode of the insn. In this case, it returns MODE_SI. > There doesn't seem to be any code that maps from the attr_mode (MODE_SI) > to the machine_m

[patch][IRA] Really record loop exits

2012-10-11 Thread Steven Bosscher
Hello, IRA uses record_loop_exits() to cache the loop exit edges, but due to a code ordering bug the edges are not actually recorded. record_loop_exits() starts with: if (!current_loops) return; So ira.c should set current_loops before calling record_loop_exits. With the current order, the
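The ordering problem described in this message can be sketched in isolation. The code below is a toy model, not GCC source: `struct loops`, its `exits_recorded` field, and the `buggy_order`/`fixed_order` helpers are hypothetical stand-ins. Only the early return on a null `current_loops` is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the loop tree; not GCC's real type. */
struct loops {
    int exits_recorded;
};

/* Global that the recording routine consults, as in the message. */
static struct loops *current_loops = NULL;

/* Mirrors the guard quoted in the message: silently do nothing
   when current_loops has not been set yet. */
static void record_loop_exits(void)
{
    if (!current_loops)
        return;
    current_loops->exits_recorded = 1;
}

/* Buggy order: record first, set current_loops afterwards.
   The call is a silent no-op, so nothing is recorded. */
static int buggy_order(struct loops *l)
{
    current_loops = NULL;
    l->exits_recorded = 0;
    record_loop_exits();
    current_loops = l;
    return l->exits_recorded;
}

/* Fixed order: set current_loops first, then record. */
static int fixed_order(struct loops *l)
{
    l->exits_recorded = 0;
    current_loops = l;
    assert(current_loops != NULL);  /* precondition of the recording call */
    record_loop_exits();
    return l->exits_recorded;
}
```

Calling `buggy_order` on a fresh `struct loops` returns 0, while `fixed_order` returns 1. Because the guard returns silently, the wrong order produces no error, which would explain how such a bug can go unnoticed.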

Re: Move statements upwards after reassociation

2012-10-11 Thread Steven Bosscher
On Thu, Oct 11, 2012 at 3:16 PM, Richard Biener wrote: > NB, the whole reassoc code needs a re-write to avoid the excessive > stmt modifications when it does nothing. So I'd very much rather avoid > adding anything to reassoc until that rewrite happened. IMHO it's fair to Easwaran to hold up a p

Re: [PATCH] Reduce conservativeness in REE using machine model (issue6631066)

2012-10-11 Thread Steven Bosscher
On Thu, Oct 11, 2012 at 11:44 PM, Teresa Johnson wrote:
> + mode = targetm.machine_mode_from_attr_mode(insn);
Nit: space between "..._mode" and "(". A test case would also be Nice To Have. Looks OK to me otherwise, but I can't approve it.
Ciao! Steven

Re: Move statements upwards after reassociation

2012-10-11 Thread Steven Bosscher
On Fri, Oct 12, 2012 at 12:00 AM, Ian Lance Taylor wrote: > On Thu, Oct 11, 2012 at 1:25 PM, Steven Bosscher wrote: >> On Thu, Oct 11, 2012 at 3:16 PM, Richard Biener wrote: >>> NB, the whole reassoc code needs a re-write to avoid the excessive >>> stmt modifications wh

Re: [PATCH RFA] Implement register pressure directed hoist pass

2012-10-11 Thread Steven Bosscher
On Thu, Oct 11, 2012 at 8:50 AM, Bin Cheng wrote: > + /* x+y won't be hoisted without defaultly enabled "-fira-hoist-pressure", defaulty comment. > + kinds of code motion(including code hoisting) in a unified way. needs space between "motion" and "(". > + flow graph, given if it can rea

Re: [patch][IRA] Really record loop exits

2012-10-12 Thread Steven Bosscher
On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote: > On 12-10-11 4:17 PM, Steven Bosscher wrote: >> >> Hello, >> >> IRA uses record_loop_exits() to cache the loop exit edges, but due to >> a code ordering bug the edges are not actually recorded.

Re: [patch][IRA] Really record loop exits

2012-10-12 Thread Steven Bosscher
On Fri, Oct 12, 2012 at 3:31 PM, Vladimir Makarov wrote: > Ops. Sorry, Steven. I did a wrong conclusion because I thought I would > have found such code generation problem if it had an affect. Oh, the patch shouldn't (and doesn't) change the generated code, that is not how cfgloops works: record
