Fwd: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'
[Transferring this to the fortran list]

Hi Christian,

I did the commit that introduced the new symbols _gfortran_{reshape,transpose}_r{4,8}. They come from ${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c, and these files should indeed be present at revision 114896:

$ svn info libgfortran/generated/reshape_r8.c
Path: libgfortran/generated/reshape_r8.c
Name: reshape_r8.c
URL: svn+ssh://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r8.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114961
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-21 11:55:58 +0200 (Wed, 21 Jun 2006)
Checksum: 8c9d27a3b974fbd53754fa7f6ac003d8

Both the library and the front-end changes were committed together. Maybe you haven't rebuilt the library since your last update, or did not get the generated files correctly (but then, I don't know why).

If you do have these source files and, after rebuilding the library, the symbols still do not end up in libgfortran.so, I'd appreciate you sending me the content of ${builddir}/${target}/libgfortran/kinds.h

Thanks,
FX
Re: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'
On 6/24/06, FX Coudert <[EMAIL PROTECTED]> wrote:
> [Transferring this to the fortran list]
>
> Hi Christian,
>
> I did the commit that introduced these new symbols _gfortran_{reshape,transpose}_r{4,8}. They come from ${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c and this file should be present indeed at revision 114896:
>
> $ svn info libgfortran/generated/reshape_r8.c
> Path: libgfortran/generated/reshape_r8.c
> Name: reshape_r8.c
> URL: svn+ssh://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r8.c
> Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
> Revision: 114961
> Node Kind: file
> Schedule: normal
> Last Changed Author: fxcoudert
> Last Changed Rev: 114880
> Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
> Text Last Updated: 2006-06-21 11:55:58 +0200 (Wed, 21 Jun 2006)
> Checksum: 8c9d27a3b974fbd53754fa7f6ac003d8
>
> Indeed, both library and front-end changes were committed together. Maybe you haven't rebuilt the library after your last update, or did not get the generated files correctly (but then, I don't know why).
>
> If indeed, you have these source files and, while rebuilding the library, the symbols do not end up in libgfortran.so, I'd appreciate you sending me the content of ${builddir}/${target}/libgfortran/kinds.h
>
> Thanks,
> FX

Well, I didn't do a full bootstrap, I did a "bubblestrap"... maybe that was the issue, then. Before running the next bubblestrap, which files do you recommend I remove so that they get properly rebuilt stage-wise?

--
Cheers,
/ChJ
Re: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'
> well, I didn't do a full bootstrap, I did a "bubblestrap" ... maybe
> that was the issue then. before running the next bubblestrap, what
> files do you recommend me to remove so that they get stage wise
> properly rebuilt?

Hum... I'm not sure, but I think the safe steps here are:
 - check that the original files are there (${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c)
 - force the build mechanism to update your ${builddir}/${target}/libgfortran/Makefile, either by reconfiguring this directory, by removing the Makefile (I'm not sure that works), or by deleting your whole ${builddir}/${target}/libgfortran directory.

That should work.

FX
Re: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'
On 6/24/06, FX Coudert <[EMAIL PROTECTED]> wrote:
> > well, I didn't do a full bootstrap, I did a "bubblestrap" ... maybe
> > that was the issue then. before running the next bubblestrap, what
> > files do you recommend me to remove so that they get stage wise
> > properly rebuilt?
>
> Hum... I'm not sure, but I think the safe steps here are:
>  - check the original files are there (${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c)
>  - force the build mechanism to update your ${builddir}/${target}/libgfortran/Makefile, either by reconfiguring this directory, or removing the Makefile (I'm not sure that works) or deleting your whole ${builddir}/${target}/libgfortran directory.
>
> That should work.

Well,

$ ls -l sparc64-unknown-linux-gnu/libgfortran/kinds.h
-rw-rw-r-- 1 chj chj 1003 Jun 15 04:03 sparc64-unknown-linux-gnu/libgfortran/kinds.h

which means that this file is from the previous build...

$ svn info libgfortran/generated/reshape_r8.c
Path: libgfortran/generated/reshape_r8.c
Name: reshape_r8.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r8.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 8c9d27a3b974fbd53754fa7f6ac003d8

$ svn info libgfortran/generated/reshape_r4.c
Path: libgfortran/generated/reshape_r4.c
Name: reshape_r4.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r4.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 74ff3f839131e8c667e404b316d41859

$ svn info libgfortran/generated/transpose_r8.c
Path: libgfortran/generated/transpose_r8.c
Name: transpose_r8.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/transpose_r8.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 3043842d8d36938c8f29f5d319c962d9

$ svn info libgfortran/generated/transpose_r4.c
Path: libgfortran/generated/transpose_r4.c
Name: transpose_r4.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/transpose_r4.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 9530e0da6e10c3e99665517f9e96209f

So, I think I'll go for deletion of the whole ${builddir}/${target}/libgfortran directory, unless someone wants to help me check the dependencies so that they can be listed in the proper places in the build mechanism and this doesn't happen again.

--
Cheers,
/ChJ
Re: g++ 4.1.1 Missing warning
No negative responses, so I'll enter it in bugzilla. Andrew Walrond
Re: Visibility and C++ Classes/Templates
Mark Mitchell <[EMAIL PROTECTED]> writes:

[...]

| And, "extern template" is a GNU
| extension which says "there's an explicit instantiation elsewhere; you
| needn't bother implicitly instantiating here".

FWIW, "extern template" is now part of C++0x.

| I'm just not comfortable with the idea of #pragmas affecting
| instantiations. (I'm OK with them affecting specializations, though; in
| that case, the original template has basically no impact, so I think
| it's fine to treat the specialization case as if it were any other
| function.)

I'm undecided whether #pragmas should not affect explicit instantiations. They really are not like implicit instantiations (which, I agree with you, should not be affected). Explicit instantiations behave more like real declarations than implicit instantiations.

-- Gaby
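For readers less familiar with the three notions discussed above, here is a minimal sketch that puts an explicit instantiation declaration ("extern template"), an explicit instantiation definition and an explicit specialization side by side. The template name twice is hypothetical and only serves the illustration; the sketch assumes a C++11 compiler and says nothing about how visibility #pragmas interact with any of these, which is the open question in the thread.

```cpp
#include <cassert>

// Hypothetical template, used only for illustration.
template <typename T>
T twice(T value) { return value + value; }

// Explicit instantiation declaration ("extern template"): tells the
// compiler that twice<int> is explicitly instantiated somewhere, so it
// need not be implicitly instantiated in this translation unit.
extern template int twice<int>(int);

// Explicit instantiation definition: normally placed in exactly one
// source file. It may follow the declaration in the same translation unit.
template int twice<int>(int);

// Explicit specialization: a separate definition that replaces the
// primary template for T = double.
template <>
double twice<double>(double value) { return 2.0 * value; }
```

In a real project the "extern template" line would typically live in a header and the instantiation definition in a single source file; they are combined here only to keep the sketch self-contained.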
Re: g++ 4.1.1 Missing warning
Stupid, stupid. While creating a minimal test case, my mistake became apparent, so please disregard.

In case you're wondering, adding 'explicit' to the main Bifilter constructor stops the first parameter in

  Bifilter _bif(new Filter(),Bifilter::DELETE_ON_DESTRUCTION);

from being implicitly converted to a Bifilter& using the first constructor, which is what caused the second (copy) constructor to be called:

  class Bifilter : public Filter {
  public:
    enum DestructorAction { DELETE_ON_DESTRUCTION,KEEP_ON_DESTRUCTION };
    explicit Bifilter( Filter* _source = 0, Filter* _sink = 0, DestructorAction _action = KEEP_ON_DESTRUCTION );
    Bifilter( const Bifilter& _original, DestructorAction _action = KEEP_ON_DESTRUCTION );
    ...

Tricksy ;)

Andrew Walrond
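The effect described above can be reproduced with a small sketch. The class names LooseBifilter and StrictBifilter and the constructor_used member are hypothetical stand-ins introduced for this illustration; only the constructor signatures follow the original message. It assumes a C++11 compiler so that std::is_constructible can show the 'explicit' case without a compile error.

```cpp
#include <cassert>
#include <type_traits>

struct Filter {
    virtual ~Filter() = default;
};

// Same constructor shapes as in the message, but WITHOUT 'explicit'.
struct LooseBifilter : Filter {
    enum DestructorAction { DELETE_ON_DESTRUCTION, KEEP_ON_DESTRUCTION };
    int constructor_used;  // 1 = pointer constructor, 2 = copy constructor

    LooseBifilter(Filter* = nullptr, Filter* = nullptr,
                  DestructorAction = KEEP_ON_DESTRUCTION)
        : Filter(), constructor_used(1) {}
    LooseBifilter(const LooseBifilter&, DestructorAction = KEEP_ON_DESTRUCTION)
        : Filter(), constructor_used(2) {}
};

// Same again, WITH 'explicit' on the pointer constructor.
struct StrictBifilter : Filter {
    enum DestructorAction { DELETE_ON_DESTRUCTION, KEEP_ON_DESTRUCTION };

    explicit StrictBifilter(Filter* = nullptr, Filter* = nullptr,
                            DestructorAction = KEEP_ON_DESTRUCTION)
        : Filter() {}
    StrictBifilter(const StrictBifilter&, DestructorAction = KEEP_ON_DESTRUCTION)
        : Filter() {}
};

// The enumerator cannot convert to Filter*, so the pointer constructor is
// not viable for (Filter*, DestructorAction). Without 'explicit', the
// Filter* is silently converted to a temporary and the copy constructor runs.
int constructor_chosen_without_explicit() {
    Filter source;
    LooseBifilter bif(&source, LooseBifilter::DELETE_ON_DESTRUCTION);
    return bif.constructor_used;
}

// With 'explicit', the same call has no viable constructor at all.
static_assert(std::is_constructible<LooseBifilter, Filter*,
                                    LooseBifilter::DestructorAction>::value,
              "accepted through an implicit conversion");
static_assert(!std::is_constructible<StrictBifilter, Filter*,
                                     StrictBifilter::DestructorAction>::value,
              "rejected once the constructor is explicit");
```

Calling constructor_chosen_without_explicit() reports that the copy constructor was selected, which matches the behaviour the message describes; with 'explicit' the mistaken call would instead be diagnosed at compile time.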
Re: ICE in complex division
div_comp_red_2.f90: In function 'MAIN__':
div_comp_red_2.f90:1: internal compiler error: Bus error
Please submit a full bug report, with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.

I reported this bug as PR 28151. It's not target-specific (it happens also on i686-linux) and it looks like a middle-end issue. Now, we have to hope that it gets more attention than PR 27889 :(

FX
Re: Project RABLET
On 6/24/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> On Fri, 2006-06-23 at 15:07 -0700, Ian Lance Taylor wrote:
> > You omitted the RTL loop optimizer passes, which still do quite a bit
> > of work despite the tree-ssa loop passes. Also if-conversion and some
> > minor passes, though they are less relevant.
>
> Which brings up a good discussion. I presume the rtl loop optimizers see things exposed by addressing modes which aren't seen in the higher level code. I wonder what the "big gains" are here...

Knowing which address computations are loop invariant. Knowing the number of instructions (sadly not the exact size, because instructions haven't been selected) to determine whether it is worth unrolling/peeling/unswitching a loop. Finding loops that can use a doloop pattern.

> and if they are detectable at expansion time...

For most of them, I don't think so.

> In general, I didnt mention anything that tends not to increase register pressure, at least not in any significant manner as far as RABLET is concerned.

So do you have hard data showing that CSE increases register pressure? Given the things CSE does, it would probably be much more useful, then, to make it possible to have liveness information in CSE so that it can take register pressure into account in its cost considerations ;-) No magic new expand is going to make CSE obsolete, and it simply does too much to just throw it out. (FWIW I'm still working on simplifying cse.c...)

> Clearly there will be a lot of further investigation required once implementation reaches this point. Ultimately CSE and all RTL optimizations can be re-evaluated to see if things can be simplified.

*laughs*

Every time some RTL optimizer is re-re-re-re-re-evaluated, it turns out we lose without it. Good luck to you, but I think you're seriously underestimating the complexity of things here.

> > Modulo the above comments, I don't see anything wrong with your basic
> > idea. But I also wonder whether you couldn't get a similar effect by
> > forcing instruction selection to occur before register allocation. If
> > that is done well, reload will have much less work to do.

Hurray. This is what new-ra did. It was probably the only thing there that worked well, but it was a great idea. (Sadly it was just reload rewritten, so pre-reload.c was ugly, but the idea was good.)

> Its clearly not as good as a new register allocator would be, but the effort to benefit ratio ought to be a lot higher for RABLET than for a register allocator rewrite.

There is a register allocator rewrite under way, from one of your co-workers even. Is there any relation between Vlad's project and yours, or are you going different ways with the same goal in mind? :-D

Gr.
Steven
Re: unable to detect exception model
On Jun 23, 2006, at 7:42 PM, Jack Howarth wrote:
> I have run into a build problem with tonight's gcc trunk on MacOS X
> which didn't exist in yesterday's svn pull. The gcc trunk build on
> MacOS X 10.4.6 crashes with...

I can reproduce this, something is miscompiling cc1plus.

-- Pinski
Re: Project RABLET
Steven Bosscher wrote:
> Every time some RTL optimizer is re-re-re-re-re-evaluated, it turns out we lose without it. Good luck to you, but I think you're seriously underestimating the complexity of things here.
>
> > Its clearly not as good as a new register allocator would be, but the effort to benefit ratio ought to be a lot higher for RABLET than for a register allocator rewrite.
>
> There is a register allocator rewrite under way, from one of your co-workers even. Is there any relation between Vlad's project and yours, or are you going different ways with the same goal in mind? :-D

Having worked on register allocation issues for the last three years, and having looked at the new-ra project, I can say that any project in this area has a big chance of failing, however good the design looks at first glance. The problem is the complexity of RTL and the many ports with very specific issues, which are described by a gazillion macros. Redesigning and simplifying RTL could solve a lot of problems such as code selection, register allocation etc. (although it might create others). But this task is much bigger than introducing tree-SSA, because it means rewriting all machine description files and is practically equal to redoing all ports. Do we have the resources for this? I don't think so.

That said, my point of view is that the more projects we have in this area, the better our chance of solving the problem. Therefore I really appreciate what Andrew and Bernd Schmidt do. It might look like a waste of resources, but we cannot force people not to do what they believe in and want to do (e.g. we cannot force Bernd to stop improving reload so that he can work on a new register allocator; he improves reload because he knows it better than others do).

As for Andrew's proposal, my opinion is that all these transformations are done too early and we will need to do them again on RTL at some point.

o coalescing. CSE can create more moves, but more importantly, extended coalescing cannot be done here (or I don't know how it can be done here). It concerns the moves generated because of two-address architecture constraints (regmove and global try to solve this problem in an ad hoc way, e.g. through hard register preferences in global). It should be part of the coalescing pass, because removing a move can prevent the removal of a higher-priority move generated by reload because of the two-address constraints.

o register pressure relief through live range splitting and/or rematerialization. We have no accurate information here, because afterwards there are passes which change the pressure, like insn scheduling and CSE. Although insn scheduling has a heuristic not to increase register pressure, it has a very low priority (third or fourth). Therefore insn scheduling can increase the pressure a lot (but sometimes decreases it too). The insn scheduler with register renaming being implemented by ISP RAS might solve this problem, if it works only after the register allocator. But this insn scheduler can work before the register allocator too, and only its use will show whether it will run only after the register allocator or in the traditional way (before and after the register allocator).

Even without subsequent passes changing the register pressure, there is another problem: it is difficult to calculate the register pressure excess. We don't know which register class will be used for a pseudo-register (e.g. AREG or GENERAL_REGS for x86, which makes a difference of 6 in the register pressure). Although reducing register pressure from 100 to 6 will be very helpful, my experience shows that the most frequent and interesting cases are on the border.

o register renaming is already done, and effectively (because it uses the data flow analysis framework), by -fweb. But I think it can be done more effectively by the out-of-ssa pass.

Actually, what Andrew proposes (and more) I did two years ago at the RTL level, close to the register allocator (see the gcc summit article "fighting register pressure in gcc"). The result was not satisfactory for me and I moved on to rewriting the register allocator. Probably, I should have committed more of what I had done to the mainline.

Probably what Andrew proposes can be done faster on tree-SSA, although doing it on RTL we would have more accurate information. In any case it will improve code in some cases and can be used as a temporary solution (until the new register allocator projects are done, or forever if they fail). Andrew's proposal also makes sense from the code reuse point of view if he wants to move on with the RABLE project.

As for my project YARA, I don't know when it will be ready for the mainline (at least one more year), because it includes removing reload (the biggest and most complicated part of the RA). It currently works only for x86 and x86_64, and generates better code for SPECint2000 and SPECfp2000 (at least for pentium4, nocona and the coming woodcrest; I have no free AMD machine for benchmarking). I've just started wo
gcc-4.2-20060624 is now available
Snapshot gcc-4.2-20060624 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20060624/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 114971

You'll find:

gcc-4.2-20060624.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.2-20060624.tar.bz2       C front end and core compiler
gcc-ada-4.2-20060624.tar.bz2        Ada front end and runtime
gcc-fortran-4.2-20060624.tar.bz2    Fortran front end and runtime
gcc-g++-4.2-20060624.tar.bz2        C++ front end and runtime
gcc-java-4.2-20060624.tar.bz2       Java front end and runtime
gcc-objc-4.2-20060624.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.2-20060624.tar.bz2  The GCC testsuite

Diffs from 4.2-20060617 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: What is baseline for the testsuite?
Thanks to everybody who replied. I have extracted some information from the replies and described it in http://gcc.gnu.org/wiki/TestingGCC, see "Interpretation of testsuite results". Please review and edit as you see fit.

-- Laurynas
Re: Boehm-gc performance data
2006/6/23, Steven Bosscher <[EMAIL PROTECTED]>:
> Don't write off Boehm's GC just yet. You can't expect to beat something that has seen a lot of tuning for GCC with something that you got working only a few days ago. There are a lot of special tricks especially in ggc-page that may put it at an advantage, but with some tuning perhaps you can get Boehm's to perform better for GCC.

But of course we are limited to tweaking usage of the external Boehm's collector API, while internal collectors can have their internals hacked to support GCC's needs best. Nevertheless I will continue tweaking Boehm's GC: incremental collection, different allocation routines for large objects w/o pointers, weak pointer support, excluding roots for large static data...

> For the locality thing: Have you already tried using something like cachegrind or oprofile to compare the cache behavior of gcc with Boehm's and gcc with ggc?

An excellent suggestion. My primary working platform is valgrind-less Cygwin, but I will find a way to gather cache usage data.

> What about allocation strategies? Perhaps that's another thing you could toy with to improve the peak memory usage issue. I don't know how Boehm's GC works, but in ggc-page e.g. all binary expression 'tree's are allocated on the same bag of pages, which may help (or not, dunno).

There might be some options here: for objects that do not contain pointers, a special API can be used instead of the generic one. Moreover, I think that peak memory usage can be reduced by using Boehm's weak pointer facilities where they should be used: I suspect that some things are not collected just because they are cached.

Thanks for your comments,

-- Laurynas
Re: Boehm-gc performance data
Hi,

> > combine.c: top mem usage: 52180k (13915k). GC execution time 0.66
> > (0.61) 4% (4%). User running time: 0m16 (0m14).
>
> Are these with checking on or off? Normally checking is on, you have to go out of your way to turn it off. If it were on, the real numbers are going to look much worse than the ones you're presented.

Both sets of numbers are with checking on, I guess that makes them comparable?

> Also, I've not been following real closely, but the GTY markers are used by PCH and the dual use of them by GC allow one to find PCH bugs more quickly and easily. If we moved entirely to Boehm's, did you have a plan for the GTY markers and PCH?

As Andrew has already noted, I still use GTY markers, at least for registering additional roots. I don't really have a plan for PCH yet; I guess that some additional bookkeeping would have to be done in the allocation routines using some weak-pointer based data structure... I don't know yet.

Thanks for the comments,

-- Laurynas
Re: Boehm-gc performance data
On Jun 24, 2006, at 1:43 PM, Laurynas Biveinis wrote:
> An excellent suggestion, although my primary working platform is
> valgrind-less Cygwin, but I will find a way to gather cache usage data.

You could try to use Vtune though.

Thanks,
Andrew Pinski
Re: Boehm-gc performance data
2006/6/23, David Nicol <[EMAIL PROTECTED]>:
> Is it possible to turn garbage collection totally off for a null-case
> run-time comparison or would that cause thrashing except for very small
> jobs?

It should be possible to adapt ggc-none for use in GCC proper with little effort. It shouldn't cause thrashing very soon: in my C tests, if GC memory peaks at 30MB, then the total GC-allocated memory is about 50MB.

-- Laurynas
Re: Visibility and C++ Classes/Templates
Gabriel Dos Reis wrote:
> Mark Mitchell <[EMAIL PROTECTED]> writes:
>
> | I'm just not comfortable with the idea of #pragmas affecting
> | instantiations. (I'm OK with them affecting specializations, though; in
> | that case, the original template has basically no impact, so I think
> | it's fine to treat the specialization case as if it were any other
> | function.)
>
> I'm undecided whether #pragmas should not affect explicit instantiations. They really are not like implicit instantiations (which, I agree with you, should not be affected). Explicit instantiations behave more like real declarations than implicit instantiations.

Yep. I'm sympathetic to Mark's position, but still tend to believe that the #pragma should affect explicit instantiations. Explicit instantiations are a way to make template instantiations conform more to the traditional declaration/definition model. We ignore the #pragmas for implicit instantiations because the user doesn't control the point of instantiation; with explicit instantiations, they do. Explicit instantiations don't behave just like implicit instantiations; there are other differences.

Jason
Re: Visibility and C++ Classes/Templates
Jason Merrill wrote: > Yep. I'm sympathetic to Mark's position, but still tend to believe that > the #pragma should affect explicit instantiations. I don't feel strongly enough to care; let's do make sure, however, that we clearly document the precedence, so that people know what to expect. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: Project RABLET
On Sat, 2006-06-24 at 13:04 +0200, Steven Bosscher wrote:
> On 6/24/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > On Fri, 2006-06-23 at 15:07 -0700, Ian Lance Taylor wrote:
> >
> > In general, I didnt mention anything that tends not to increase register
> > pressure, at least not in any significant manner as far as RABLET is
> > concerned.
>
> So do you have hard data showing that CSE increases register pressure?
> Given the thinks CSE does, it would probably be much more useful,
> then, to make it possible to have liveness information in CSE so that
> it can take register pressure into account in its cost considerations
> ;-) No magic new expand is going to make CSE obsolete, and it simply
> does too much to just throw it out. (FWIW I'm still working on
> simplifying cse.c...)

No, no hard data; it's just the kind of activity which can undo what RABLET proposes doing. I.e., an expression which I go to the effort of rematerializing in 3 places is likely to be commoned back into the original live range unless someone steps in and tells CSE not to do that.

What I really had in mind was marking these register values/expressions/whatever with a flag such that when CSE or GCSE or whoever asks, they get returned as not-to-be-looked-at. That way they don't undo the "opportunities" which RABLET has so kindly exposed to these optimizations :-)

> > Clearly there will be a lot of further investigation required once
> > implementation reaches this point. Ultimately CSE and all RTL
> > optimizations can be re-evaluated to see if things can be simplified.
>
> *laughs*
>
> Every time some RTL optimizer is re-re-re-re-re-evaluated, it turns
> out we lose without it. Good luck to you, but I think you're seriously
> underestimating the complexity of things here.

I was not really looking to rewrite the passes so much as to tell them not to work on certain registers. This could potentially be extended to ssa_names which the tree optimizers have processed and which get expanded into single RTL registers. A new expand would know that the translation into RTL was simple enough that nothing new has really been exposed to the RTL optimizers. That was my thought, anyway. Then CSE et al. work on whatever is left. Perhaps there are then some hunks that can be removed as redundant, or maybe not.

> > Its clearly not as good as a new register allocator would be, but the
> > effort to benefit ratio ought to be a lot higher for RABLET than for a
> > register allocator rewrite.
>
> There is a register allocator rewrite under way, from one of your
> co-workers even. Is there any relation between Vlad's project and
> yours, or are you going different ways with the same goal in mind? :-D

A totally different scale of project, with a completely different goal. Had I set out and actually started writing RABLE, that would be going in different directions with the same goal. In theory, RABLET should make the job of any register allocator a bit easier.

These days, I think any register allocator's goal ought to be to assign registers and be the final authority, with no reload undoing any of the work. That is well beyond the scope of what I'm doing. It is within the scope of what Vlad is doing.

Andrew
Re: Project RABLET
On Sat, 2006-06-24 at 12:36 -0400, Vladimir N. Makarov wrote:
> Steven Bosscher wrote:

> As for Andrew's proposal, my opinion is that all this
> transformations are done too early and we need them to do again on
> rtl sometime.
>
> o coalescing. CSE can create more moves but more important thing is

RABLET will do nothing different than is done today: out-of-ssa coalesces ssa_names out the wazoo. In the interest of register pressure reduction, it may actually coalesce less in order to split up live ranges, and leave loads/stores from/to the stack.

> o register pressure relief through live range splitting and/or
> rematerialization. We have no accurate information here, because
> after that there are passes which change the pressure like insn

Sure, I'm not suggesting that RABLET will reduce the register pressure to something that isn't going to spill. Far from it. I am saying that RABLET can reduce something completely unmanageable to something more manageable. Instead of handing the RTL passes a basic block that contains a peak register pressure of 120 when there are 16 hardware registers, perhaps it will be a basic block that has been reduced down to a peak of 25 or something. The calculations at the tree level are only going to be rough, enough to use as a guideline like that.

If RA doesn't have to spill its guts, it has a chance to do a better job, I think.

> scheduling and CSE. Although insn scheduling has heuristic not to
> increase register pressure, it has very small priority (third or
> fourth). Therefore insn scheduling can increase the pressure a lot

Sure, but it won't increase it from 25 back up to 140, so there should still be a benefit.

> Actually what Andrew proposes (and more) I did two years ago on RTL
> level close to the register allocator (see gcc summit article
> "fighting register pressure in gcc"). The result was not satisfactory
> for me and I moved on rewriting the register allocator. Probably, I
> should have committed more what I've done into the mainline.

I think it's hard to do what I am going to do at the RTL level. I have all the information from tree-ssa available to make quite a few interesting decisions, and with a rewritten expand, decisions about whether things are register or memory based can be made at a finer grain. It just seems like the last good place to do some of this work. And perhaps we can get better instructions selected by seeing more than expand currently sees.

> Probably what Andrew proposes can be done faster on tree-SSA
> although doing it on RTL we would have more accurate information. In
> any case it will improve code in some cases and can be used as a
> temporary solution (until new register allocator projects will be done
> or forever if they failed). Andrew's proposal has a sense too with
> the code reuse point of view if he wants to move on with RABLE
> project.

We'll see if that ever happens :-) I'm hoping RABLET makes it less urgent. If not, then it's time to consider something different; perhaps some of the key individual components, such as forced instruction selection, can be done, or maybe YARA will be a success and you'll take care of it!

Andrew
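To make the "peak register pressure" figures in this exchange concrete (120 or 25 live values against 16 hardware registers), here is a rough sketch of how such a number can be estimated for one basic block from live ranges. It is not the computation RABLET or GCC performs: the LiveRange struct, the function names and the idea of numbering instructions within the block are assumptions made for illustration, and register classes are ignored entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical live range: the value is live from instruction index
// 'start' to instruction index 'end', both inclusive.
struct LiveRange {
    int start;
    int end;
};

// Peak number of simultaneously live values in the block.
int peak_pressure(const std::vector<LiveRange>& ranges) {
    // One +1 event where a range begins, one -1 event just past its end.
    std::vector<std::pair<int, int>> events;
    for (const LiveRange& r : ranges) {
        events.push_back({r.start, +1});
        events.push_back({r.end + 1, -1});
    }
    // Sorting pairs puts -1 before +1 at the same position, so a range
    // that ends right before another begins is not counted as overlapping.
    std::sort(events.begin(), events.end());

    int live = 0;
    int peak = 0;
    for (const auto& e : events) {
        live += e.second;
        peak = std::max(peak, live);
    }
    return peak;
}

// How many values exceed the available hard registers at the peak.
int pressure_excess(const std::vector<LiveRange>& ranges, int hard_regs) {
    return std::max(0, peak_pressure(ranges) - hard_regs);
}
```

A sweep like this only gives the rough guideline mentioned above; as the thread points out, later passes and the choice of register class can move the real number in either direction.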
Re: RFC: __cxa_atexit for mingw32
Danny Smith wrote:
> Adding a real __cxa_atexit to mingw runtime is of course also possible,
> but I thought I'd attempt the easy options first.

When you say "runtime", do you mean libstdc++ or something like libmingwex.a in "mingw-runtime"? If you mean the former, you can add this in for GCC 4.2 and work on a real __cxa_atexit() for GCC 4.3, if you want.

Thanks,
Ranjit.

--
Ranjit Mathew       Email: rmathew AT gmail DOT com
Bangalore, INDIA.   Web: http://rmathew.com/
Re: Project RABLET
Andrew MacLeod wrote:
> > o register pressure relief through live range splitting and/or
> > rematerialization. We have no accurate information here, because
> > after that there are passes which change the pressure like insn
>
> Sure, Im not suggesting that RABLET will reduce the register pressure to something that isn't going to spill. Far from it. I am saying that RABLET can reduce something completely unmanageable to something more manageable. instead of handing the RTL passes a basic block that contains a peak register pressure of 120 when there are 16 hardware registers, perhaps it will be a basic block that has been reduced down to a peak of 25 or something. The calculations at the tree level are only going to be rough, enough to use as a guideline like that.

Without information about the final register allocator decisions, partial register pressure reduction through rematerialization does not work in many cases. For example, if you rematerialize a <- b + c while reducing the pressure from 100 to 50 on x86, there is a big chance that b and c will not be placed in hard registers. Instead of one load (of a), two loads (of b and c) will be needed. The resulting code is even worse than before reducing the pressure.

So rematerialization in the out-of-ssa pass will work well only for full pressure relief (down to the number of hard registers) or close to it. But even if you can reduce the register pressure to the number of hard registers, it is hard to know whether you have achieved full relief, because you cannot be sure which register class will be used (e.g. AREG or GENERAL_REGS for x86). It can work, though, for architectures with big regular register files (e.g. classic RISC processors).

The SSA pressure relief through rematerialization described in Simpson's thesis is oriented toward such architectures (with a big regular register file of size 32, as I remember). So it can work for ppc, but it will be less successful on the platforms of major interest, x86 and x86_64.