Fwd: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'

2006-06-24 Thread FX Coudert

[Transferring this to the fortran list]

Hi Christian,

I did the commit that introduced these new symbols 
_gfortran_{reshape,transpose}_r{4,8}. They come from 
${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c

and this file should indeed be present at revision 114896:

$ svn info libgfortran/generated/reshape_r8.c 
Path: libgfortran/generated/reshape_r8.c
Name: reshape_r8.c
URL: svn+ssh://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r8.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114961
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-21 11:55:58 +0200 (Wed, 21 Jun 2006)
Checksum: 8c9d27a3b974fbd53754fa7f6ac003d8


Indeed, both library and front-end changes were committed together. Maybe
you haven't rebuilt the library after your last update, or did not get
the generated files correctly (but then, I don't know why).


If you do indeed have these source files and, while rebuilding the
library, the symbols do not end up in libgfortran.so, I'd appreciate you
sending me the content of ${builddir}/${target}/libgfortran/kinds.h


Thanks,
FX


Re: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'

2006-06-24 Thread Christian Joensson

On 6/24/06, FX Coudert <[EMAIL PROTECTED]> wrote:

> [Transferring this to the fortran list]
>
> Hi Christian,
>
> I did the commit that introduced these new symbols
> _gfortran_{reshape,transpose}_r{4,8}. They come from
> ${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c
> and this file should indeed be present at revision 114896:

> $ svn info libgfortran/generated/reshape_r8.c
> Path: libgfortran/generated/reshape_r8.c
> Name: reshape_r8.c
> URL: svn+ssh://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r8.c
> Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
> Revision: 114961
> Node Kind: file
> Schedule: normal
> Last Changed Author: fxcoudert
> Last Changed Rev: 114880
> Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
> Text Last Updated: 2006-06-21 11:55:58 +0200 (Wed, 21 Jun 2006)
> Checksum: 8c9d27a3b974fbd53754fa7f6ac003d8

> Indeed, both library and front-end changes were committed together. Maybe
> you haven't rebuilt the library after your last update, or did not get
> the generated files correctly (but then, I don't know why).
>
> If you do indeed have these source files and, while rebuilding the
> library, the symbols do not end up in libgfortran.so, I'd appreciate you
> sending me the content of ${builddir}/${target}/libgfortran/kinds.h
>
> Thanks,
> FX



Well, I didn't do a full bootstrap, I did a "bubblestrap"... maybe
that was the issue then. Before running the next bubblestrap, which
files do you recommend I remove so that they get properly rebuilt
stage-wise?

--
Cheers,

/ChJ


Re: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'

2006-06-24 Thread FX Coudert

> well, I didn't do a full bootstrap, I did a "bubblestrap" ... maybe
> that was the issue then. before running the next bubblestrap, what
> files do you recommend me to remove so that they get stage wise
> properly rebuilt?


Hum... I'm not sure, but I think the safe steps here are:
  - check the original files are there
    (${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c)
  - force the build mechanism to update your
    ${builddir}/${target}/libgfortran/Makefile, either by reconfiguring this
    directory, or removing the Makefile (I'm not sure that works) or
    deleting your whole ${builddir}/${target}/libgfortran directory.


That should work.

FX


Re: Lots of gfortran testsuite failures on sparc64-linux: undefined reference to `_gfortran_reshape_r8'

2006-06-24 Thread Christian Joensson

On 6/24/06, FX Coudert <[EMAIL PROTECTED]> wrote:

> > well, I didn't do a full bootstrap, I did a "bubblestrap" ... maybe
> > that was the issue then. before running the next bubblestrap, what
> > files do you recommend me to remove so that they get stage wise
> > properly rebuilt?
>
> Hum... I'm not sure, but I think the safe steps here are:
>    - check the original files are there
>      (${srcdir}/libgfortran/generated/{reshape,transpose}_r{4,8}.c)
>    - force the build mechanism to update your
>      ${builddir}/${target}/libgfortran/Makefile, either by reconfiguring this
>      directory, or removing the Makefile (I'm not sure that works) or
>      deleting your whole ${builddir}/${target}/libgfortran directory.
>
> That should work.


well,

$ ls -l sparc64-unknown-linux-gnu/libgfortran/kinds.h
-rw-rw-r--  1 chj chj 1003 Jun 15 04:03 sparc64-unknown-linux-gnu/libgfortran/kinds.h

which means that that file is from the previous build...

$ svn info libgfortran/generated/reshape_r8.c
Path: libgfortran/generated/reshape_r8.c
Name: reshape_r8.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r8.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 8c9d27a3b974fbd53754fa7f6ac003d8

$ svn info libgfortran/generated/reshape_r4.c
Path: libgfortran/generated/reshape_r4.c
Name: reshape_r4.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/reshape_r4.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 74ff3f839131e8c667e404b316d41859

$ svn info libgfortran/generated/transpose_r8.c
Path: libgfortran/generated/transpose_r8.c
Name: transpose_r8.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/transpose_r8.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 3043842d8d36938c8f29f5d319c962d9

$ svn info libgfortran/generated/transpose_r4.c
Path: libgfortran/generated/transpose_r4.c
Name: transpose_r4.c
URL: http://gcc.gnu.org/svn/gcc/trunk/libgfortran/generated/transpose_r4.c
Repository UUID: 138bc75d-0d04-0410-961f-82ee72b054a4
Revision: 114896
Node Kind: file
Schedule: normal
Last Changed Author: fxcoudert
Last Changed Rev: 114880
Last Changed Date: 2006-06-22 08:04:02 +0200 (Thu, 22 Jun 2006)
Text Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Properties Last Updated: 2006-06-22 19:10:51 +0200 (Thu, 22 Jun 2006)
Checksum: 9530e0da6e10c3e99665517f9e96209f

So, I think I'll go for deletion of the whole
${builddir}/${target}/libgfortran directory.

Unless someone wants to help me check the dependencies, so that they
can be listed in the proper places in the build mechanism and this
doesn't happen again.

--
Cheers,

/ChJ


Re: g++ 4.1.1 Missing warning

2006-06-24 Thread andrew
No negative responses, so I'll enter it in bugzilla.

Andrew Walrond


Re: Visibility and C++ Classes/Templates

2006-06-24 Thread Gabriel Dos Reis
Mark Mitchell <[EMAIL PROTECTED]> writes:

[...]

| And, "extern template" is a GNU
| extension which says "there's an explicit instantiation elsewhere; you
| needn't bother implicitly instantiating here".

FWIW, "extern template" is now part of C++0x.

| I'm just not comfortable with the idea of #pragmas affecting
| instantiations.  (I'm OK with them affecting specializations, though; in
| that case, the original template has basically no impact, so I think
| it's fine to treat the specialization case as if it were any other
| function.)

I'm undecided whether #pragmas should not affect explicit
instantiations.  They really are not like implicit instantiations
(which, I agree with you, should not be affected).  Explicit 
instantiations behave more like real declarations than implicit
instantiations. 

-- Gaby


Re: g++ 4.1.1 Missing warning

2006-06-24 Thread andrew
Stupid, stupid.

While creating a minimal test case, my mistake became apparent, so
please disregard. In case you're wondering, adding 'explicit' to the
main Bifilter constructor stops the first parameter in

   Bifilter _bif(new Filter(),Bifilter::DELETE_ON_DESTRUCTION);

from being implicitly converted to a const Bifilter& using the first
constructor, so that the second (copy) constructor gets called:

class Bifilter : public Filter
{
public:
    enum DestructorAction { DELETE_ON_DESTRUCTION, KEEP_ON_DESTRUCTION };

    explicit Bifilter(
        Filter*          _source = 0,
        Filter*          _sink   = 0,
        DestructorAction _action = KEEP_ON_DESTRUCTION
    );

    Bifilter(
        const Bifilter&  _original,
        DestructorAction _action = KEEP_ON_DESTRUCTION
    );
    ...


Tricksy ;)

Andrew Walrond
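
The overload-resolution trap described above can be reproduced in isolation. The following is only a minimal sketch, not the original code: `Filter`, `Loose` and `Strict` are hypothetical stand-ins for the classes in the message, the enum is moved to namespace scope so both classes can share it, and the `which` member exists purely to record which constructor ran. It assumes a C++11 compiler for `<type_traits>`.

```cpp
#include <cassert>
#include <type_traits>

struct Filter {};

enum Action { DELETE_ON_DESTRUCTION, KEEP_ON_DESTRUCTION };

// Non-explicit pointer constructor: it also acts as an implicit
// conversion from Filter* to Loose.
struct Loose {
    int which;  // 1 = pointer constructor, 2 = copy constructor
    Loose(Filter* = 0, Filter* = 0, Action = KEEP_ON_DESTRUCTION) : which(1) {}
    Loose(const Loose&, Action = KEEP_ON_DESTRUCTION) : which(2) {}
};

// Same shape, but the pointer constructor is marked explicit.
struct Strict {
    int which;
    explicit Strict(Filter* = 0, Filter* = 0, Action = KEEP_ON_DESTRUCTION) : which(1) {}
    Strict(const Strict&, Action = KEEP_ON_DESTRUCTION) : which(2) {}
};

// An enumerator does not convert to Filter*, so the pointer constructor is
// not viable for (Filter*, Action). The copy constructor is viable, because
// the Filter* is first converted to a temporary Loose.
int constructor_chosen() {
    Filter f;
    Loose b(&f, DELETE_ON_DESTRUCTION);
    return b.which;
}

// With explicit, that implicit conversion is unavailable and the same call
// is ill-formed; the trait reports this without a hard compile error.
bool strict_accepts_call() {
    return std::is_constructible<Strict, Filter*, Action>::value;
}
```

Under these assumptions `constructor_chosen()` reports the copy constructor, while `strict_accepts_call()` is false, which matches the behaviour described in the message: marking the constructor explicit turns a silent wrong call into a compile-time error.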


Re: ICE in complex division

2006-06-24 Thread FX Coudert

div_comp_red_2.f90: In function 'MAIN__':
div_comp_red_2.f90:1: internal compiler error: Bus error
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.


I reported this bug as PR 28151. It's not target-specific (it also
happens on i686-linux) and it looks like a middle-end issue. Now, we have
to hope that it gets more attention than PR 27889 :(


FX


Re: Project RABLET

2006-06-24 Thread Steven Bosscher

On 6/24/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> On Fri, 2006-06-23 at 15:07 -0700, Ian Lance Taylor wrote:
>
> > You omitted the RTL loop optimizer passes, which still do quite a bit
> > of work despite the tree-ssa loop passes.  Also if-conversion and some
> > minor passes, though they are less relevant.
>
> Which brings up a good discussion. I presume the rtl loop optimizers see
> things exposed by addressing modes which aren't seen in the higher level
> code. I wonder what the "big gains" are here...

Knowing which address computations are loop invariant. Knowing the
number of instructions (sadly not the exact size because instructions
haven't been selected) to determine whether it is worth
unrolling/peeling/unswitching a loop. Finding loops that can use a
doloop pattern.

> and if they are
> detectable at expansion time...

For most of them, I don't think so.

> In general, I didnt mention anything that tends not to increase register
> pressure, at least not in any significant manner as far as RABLET is
> concerned.

So do you have hard data showing that CSE increases register pressure?
Given the things CSE does, it would probably be much more useful,
then, to make it possible to have liveness information in CSE so that
it can take register pressure into account in its cost considerations
;-)  No magic new expand is going to make CSE obsolete, and it simply
does too much to just throw it out. (FWIW I'm still working on
simplifying cse.c...)

> Clearly there will be a lot of further investigation required once
> implementation reaches this point. Ultimately CSE and all RTL
> optimizations can be re-evaluated to see if things can be simplified.

*laughs*

Every time some RTL optimizer is re-re-re-re-re-evaluated, it turns
out we lose without it. Good luck to you, but I think you're seriously
underestimating the complexity of things here.

> > Modulo the above comments, I don't see anything wrong with your basic
> > idea.  But I also wonder whether you couldn't get a similar effect by
> > forcing instruction selection to occur before register allocation.  If
> > that is done well, reload will have much less work to do.

Hurray.
This is what new-ra did. It was probably the only thing there that
worked well, but it was a great idea. (Sadly it was just reload
rewritten so pre-reload.c was ugly, but the idea was good).

> Its clearly not as good as a new register allocator would be, but the
> effort to benefit ratio ought to be a lot higher for RABLET than for a
> register allocator rewrite.

There is a register allocator rewrite under way, from one of your
co-workers even. Is there any relation between Vlad's project and
yours, or are you going different ways with the same goal in mind? :-D

Gr.
Steven


Re: unable to detect exception model

2006-06-24 Thread Andrew Pinski


On Jun 23, 2006, at 7:42 PM, Jack Howarth wrote:

>  I have run into a build problem with tonight's gcc trunk on MacOS X
> which didn't exist in yesterday's svn pull. The gcc trunk build on
> MacOS X 10.4.6 crashes with...


I can reproduce this, something is miscompiling cc1plus.

-- Pinski



Re: Project RABLET

2006-06-24 Thread Vladimir N. Makarov

Steven Bosscher wrote:



> Every time some RTL optimizer is re-re-re-re-re-evaluated, it turns
> out we lose without it. Good luck to you, but I think you're seriously
> underestimating the complexity of things here.
>
> > Its clearly not as good as a new register allocator would be, but the
> > effort to benefit ratio ought to be a lot higher for RABLET than for a
> > register allocator rewrite.
>
> There is a register allocator rewrite under way, from one of your
> co-workers even. Is there any relation between Vlad's project and
> yours, or are you going different ways with the same goal in mind? :-D



 Having worked on register allocation issues for the last three years
and having looked at the new-ra project, I can say that any project in
this area has a big chance of failing, however good its design looks at
first glance.  The problem is the complexity of RTL and the large number
of ports with very specific issues, which are described by a gazillion
macros.  Redesigning and simplifying RTL could solve a lot of problems
such as code selection, register allocation, etc. (although it might
create others).  But this task is much bigger than introducing tree-SSA,
because it means rewriting all machine description files and is
practically equal to redoing all ports.  Do we have the resources for
this?  I don't think so.

 Having said this, my point of view is that the more projects we have
in this area, the better our chance of solving the problem.  Therefore
I really appreciate what Andrew and Bernd Schmidt are doing.  It might
look like a waste of resources, but we cannot force people not to do
what they believe in and want to do (e.g. we cannot force Bernd to stop
improving reload just because he could work on a new register allocator
instead.  He improves reload because he knows it better than others do).

 As for Andrew's proposal, my opinion is that all these
transformations are done too early and we will need to do them again on
RTL at some point.

o coalescing.  CSE can create more moves, but the more important thing
 is that extended coalescing cannot be done here (or I don't know how
 it can be done here).  It is about moves generated because of
 two-address architecture constraints (regmove and global try to
 solve this problem in an ad hoc way, e.g. through hard register
 preference by global).  It should be part of the coalescing pass,
 because removing a move can prevent removing a higher-priority move
 generated by reload because of the two-address constraints.

o register pressure relief through live range splitting and/or
 rematerialization.  We have no accurate information here, because
 afterwards there are passes which change the pressure, like insn
 scheduling and CSE.  Although insn scheduling has a heuristic not to
 increase register pressure, it has very low priority (third or
 fourth).  Therefore insn scheduling can increase the pressure a lot
 (but sometimes decreases it too).  The insn scheduler with register
 renaming being implemented by ISP RAS might solve this problem, if
 it works only after the register allocator.  But this insn scheduler
 can work before the register allocator too, and only its usage will
 show whether it will work only after the register allocator or in
 the traditional way (before and after the register allocator).

 Even without subsequent passes changing the register pressure,
 there is another problem, which is the difficulty of calculating the
 register pressure excess.  We don't know what register class will be
 used for a pseudo-register (e.g. AREG or GENERAL_REGS for x86, which
 makes a difference of 6 in the register pressure).  Although reducing
 register pressure from 100 to 6 will be very helpful, my experience
 shows that the most frequent and interesting cases are on the
 border.

o register renaming is already done, and effectively (because it uses
 the data flow analysis framework), by -fweb.  But I think it can be
 done more effectively by the out-of-ssa pass.

 Actually, what Andrew proposes (and more) I did two years ago at the
RTL level close to the register allocator (see the GCC summit article
"fighting register pressure in gcc").  The result was not satisfactory
for me and I moved on to rewriting the register allocator.  Probably, I
should have committed more of what I did to the mainline.

 Probably what Andrew proposes can be done faster on tree-SSA,
although doing it on RTL we would have more accurate information.  In
any case it will improve code in some cases and can be used as a
temporary solution (until the new register allocator projects are done,
or forever if they fail).  Andrew's proposal also makes sense from the
code reuse point of view if he wants to move on with the RABLE
project.

 As for my project YARA, I don't know when it will be ready for the
mainline (at least one more year) because it includes removing reload
(the biggest and most complicated part of the RA).  It works now only
for x86 and x86_64, generates better code for SPECint2000 and
SPECFp2000 (at least for pentium4, nocona and coming woodcrest.  I
have no free AMD machine to make benchmarking).  I've just started
wo

gcc-4.2-20060624 is now available

2006-06-24 Thread gccadmin
Snapshot gcc-4.2-20060624 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20060624/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 114971

You'll find:

gcc-4.2-20060624.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.2-20060624.tar.bz2       C front end and core compiler
gcc-ada-4.2-20060624.tar.bz2        Ada front end and runtime
gcc-fortran-4.2-20060624.tar.bz2    Fortran front end and runtime
gcc-g++-4.2-20060624.tar.bz2        C++ front end and runtime
gcc-java-4.2-20060624.tar.bz2       Java front end and runtime
gcc-objc-4.2-20060624.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.2-20060624.tar.bz2  The GCC testsuite

Diffs from 4.2-20060617 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: What is baseline for the testsuite?

2006-06-24 Thread Laurynas Biveinis

Thanks to everybody who replied. I have extracted some information
from the replies and described it in
http://gcc.gnu.org/wiki/TestingGCC, see "Interpretation of testsuite
results". Please review and edit as you see fit.

--
Laurynas


Re: Boehm-gc performance data

2006-06-24 Thread Laurynas Biveinis

2006/6/23, Steven Bosscher <[EMAIL PROTECTED]>:

> Don't write off Boehm's GC just yet.  You can't expect to beat
> something that has seen a lot of tuning for GCC with something that
> you got working only a few days ago. There are a lot of special tricks
> especially in ggc-page that may put it at an advantage, but with some
> tuning perhaps you can get Boehm's to perform better for GCC.

But of course we are limited to tweaking our usage of the external
Boehm collector's API, while internal collectors can have their
internals hacked to support GCC's needs best. Nevertheless I will
continue tweaking Boehm's GC: incremental collection, different
allocation routines for large objects w/o pointers, weak pointer
support, excluding roots for large static data...

> For the locality thing: Have you already tried using something like
> cachegrind or oprofile to compare the cache behavior of gcc with
> Boehm's and gcc with ggc?

An excellent suggestion. My primary working platform is valgrind-less
Cygwin, but I will find a way to gather cache usage data.

> What about allocation strategies?  Perhaps
> that's another thing you could toy with to improve the peak memory
> usage issue. I don't know how Boehm's GC works, but in ggc-page e.g.
> all binary expression 'tree's are allocated on the same bag of pages,
> which may help (or not, dunno).

There might be some options here: for objects that do not contain
pointers, a special API can be used instead of the generic one.
Moreover, I think that peak memory usage can be reduced by using
Boehm's weak pointer facilities where they should be used: I suspect
that some things are not collected just because they are cached.

Thanks for your comments,
--
Laurynas


Re: Boehm-gc performance data

2006-06-24 Thread Laurynas Biveinis

Hi,


> > combine.c: top mem usage: 52180k (13915k). GC execution time 0.66
> > (0.61) 4% (4%). User running time: 0m16 (0m14).
>
> Are these with checking on or off?  Normally checking is on, you have
> to go out of your way to turn it off.  If it were on, the real
> numbers are going to look much worse than the ones you've presented.


Both sets of numbers are with checking on, I guess that makes them comparable?


> Also, I've not been following real closely, but the GTY markers are
> used by PCH and the dual use of them by GC allow one to find PCH bugs
> more quickly and easily.  If we moved entirely to Boehm's, did you
> have a plan for the GTY markers and PCH?


As Andrew already has noted, I still use GTY markers at least for
registering additional roots. I don't really have a plan for PCH yet;
I guess that some additional bookkeeping would have to be done in
allocation routines using some weak-pointer based data structure... I
don't know yet.

Thanks for comments,

--
Laurynas


Re: Boehm-gc performance data

2006-06-24 Thread Andrew Pinski


On Jun 24, 2006, at 1:43 PM, Laurynas Biveinis wrote:


> An excellent suggestion, although my primary working platform is
> valgrind-less Cygwin, but I will find a way to gather cache usage
> data.


You could try to use Vtune though.

Thanks,
Andrew Pinski


Re: Boehm-gc performance data

2006-06-24 Thread Laurynas Biveinis

2006/6/23, David Nicol <[EMAIL PROTECTED]>:

> Is it possible to turn garbage collection totally off for a null-case
> run-time comparison or would that cause thrashing except for very
> small jobs?


It should be possible to adapt ggc-none for use in GCC proper with
little effort. It shouldn't cause thrashing very soon: in my C tests, if
GC memory peaks at 30MB, then total GC allocated memory is about 50MB.

--
Laurynas


Re: Visibility and C++ Classes/Templates

2006-06-24 Thread Jason Merrill

Gabriel Dos Reis wrote:

> Mark Mitchell <[EMAIL PROTECTED]> writes:
> | I'm just not comfortable with the idea of #pragmas affecting
> | instantiations.  (I'm OK with them affecting specializations, though; in
> | that case, the original template has basically no impact, so I think
> | it's fine to treat the specialization case as if it were any other
> | function.)
>
> I'm undecided whether #pragmas should not affect explicit
> instantiations.  They really are not like implicit instantiations
> (which, I agree with you, should not be affected).  Explicit
> instantiations behave more like real declarations than implicit
> instantiations.


Yep.  I'm sympathetic to Mark's position, but still tend to believe that 
the #pragma should affect explicit instantiations.  Explicit 
instantiations are a way to make template instantiations conform more to 
the traditional declaration/definition model.  We ignore the #pragmas 
for implicit instantiations because the user doesn't control the point 
of instantiation; with explicit instantiations, they do.


Explicit instantiations don't behave just like implicit instantiations; 
there are other differences.


Jason
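
To make the distinction in this sub-thread concrete, here is a small sketch of the three forms under discussion. `Box`, `use_explicit` and `use_implicit` are made-up names for illustration; the pragma spelling is GCC's `#pragma GCC visibility`, and putting the declaration and the definition in one translation unit is only for compactness (normally the `extern template` line would live in a header and the definition in exactly one source file). Whether the pragma should take effect on the explicit instantiation is the policy question being debated, so the sketch shows only where the user gets to place things, not what any compiler version does with it.

```cpp
#include <cassert>

template <typename T>
struct Box {
    T value;
    T twice() const { return value + value; }
};

// Explicit instantiation declaration ("extern template"): promises that an
// explicit instantiation definition exists somewhere, so no implicit
// instantiation is needed at this point.
extern template struct Box<int>;

// Explicit instantiation definition. The author chooses where it appears,
// so it can be placed inside a visibility region.
#pragma GCC visibility push(hidden)
template struct Box<int>;
#pragma GCC visibility pop

// Uses the explicitly instantiated Box<int>.
int use_explicit() {
    Box<int> b = {21};
    return b.twice();
}

// Box<double> is instantiated implicitly, wherever the compiler needs it;
// the user has no control over that point of instantiation.
double use_implicit() {
    Box<double> b = {1.5};
    return b.twice();
}
```

The point of the sketch is the asymmetry Jason describes: the position of `template struct Box<int>;` is under the programmer's control, whereas the position at which `Box<double>` gets instantiated is not.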


Re: Visibility and C++ Classes/Templates

2006-06-24 Thread Mark Mitchell
Jason Merrill wrote:

> Yep.  I'm sympathetic to Mark's position, but still tend to believe that
> the #pragma should affect explicit instantiations.

I don't feel strongly enough to care; let's do make sure, however, that
we clearly document the precedence, so that people know what to expect.

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: Project RABLET

2006-06-24 Thread Andrew MacLeod
On Sat, 2006-06-24 at 13:04 +0200, Steven Bosscher wrote:
> On 6/24/06, Andrew MacLeod <[EMAIL PROTECTED]> wrote:
> > On Fri, 2006-06-23 at 15:07 -0700, Ian Lance Taylor wrote:
> >
> > In general, I didnt mention anything that tends not to increase register
> > pressure, at least not in any significant manner as far as RABLET is
> > concerned.
> 
> So do you have hard data showing that CSE increases register pressure?
> Given the thinks CSE does, it would probably be much more useful,
> then, to make it possible to have liveness information in CSE so that
> it can take register pressure into account in its cost considerations
> ;-)  No magic new expand is going to make CSE obsolete, and it simply
> does too much to just throw it out. (FWIW I'm still working on
> simplifying cse.c...)

No, no hard data; it's just the kind of activity which can undo what
RABLET proposes doing. I.e., an expression which I go to the effort of
rematerializing in 3 places is likely to be commoned back to the
original live range unless something steps in and tells CSE not to do that.

What I really had in mind was marking these register
values/expression/whatever with a flag such that when CSE or GCSE or
whoever asks, they get returned as not-to-be-looked-at. That way they
don't undo the "opportunities" which RABLET has so kindly exposed to
these optimizations :-) 


> 
> > Clearly there will be a lot of further investigation required once
> > implementation reaches this point. Ultimately CSE and all RTL
> > optimizations can be re-evaluated to see if things can be simplified.
> 
> *laughs*
> 
> Every time some RTL optimizer is re-re-re-re-re-evaluated, it turns
> out we lose without it. Good luck to you, but I think you're seriously
> underestimating the complexity of things here.

I was not really looking to rewrite the passes so much as tell them not
to work on certain registers. This could potentially be extended to
ssa_names which the tree optimizers have processed and which get
expanded into single RTL registers. A new expand would know that the
translation into RTL was simple enough that nothing new has really been
exposed to the RTL optimizers.  That was my thought anyway. Then CSE et
al work on whatever is left. Perhaps there are then some hunks that can
be removed as redundant, or maybe not.


> > Its clearly not as good as a new register allocator would be, but the
> > effort to benefit ratio ought to be a lot higher for RABLET than for a
> > register allocator rewrite.
> 
> There is a register allocator rewrite under way, from one of your
> co-workers even. Is there any relation between Vlad's project and
> yours, or are you going different ways with the same goal in mind? :-D

Totally different scale of project with a completely different goal. Had I
set out and actually started writing RABLE, that would be going in
different directions with the same goal. In theory, RABLET should make
the job of any register allocator a bit easier.

These days, I think any register allocator's goal ought to be to assign
registers and be the final authority. No reload undoing any of the work.
That is well beyond the scope of what I'm doing. It is within the scope
of what Vlad is doing.

Andrew




Re: Project RABLET

2006-06-24 Thread Andrew MacLeod
On Sat, 2006-06-24 at 12:36 -0400, Vladimir N. Makarov wrote:
> Steven Bosscher wrote:

>   As for Andrew's proposal, my opinion is that all this
> transformations are done too early and we need them to do again on
> rtl sometime.
> 
> o coalescing.  CSE can create more moves but more important thing is

RABLET will do nothing different from what is done today: out-of-ssa
coalesces ssa_names out the wazoo. In the interest of register pressure
reduction, it may actually coalesce less to split up live ranges, and
leave loads/stores from/to the stack.
 
> 
> o register pressure relief through live range splitting and/or
>   rematerialization.  We have no accurate information here, because
>   after that there are passes which change the pressure like insn

Sure, I'm not suggesting that RABLET will reduce the register pressure to
something that isn't going to spill. Far from it. I am saying that
RABLET can reduce something completely unmanageable to something more
manageable. Instead of handing the RTL passes a basic block that
contains a peak register pressure of 120 when there are 16 hardware
registers, perhaps it will be a basic block that has been reduced down
to a peak of 25 or something. The calculations at the tree level are
only going to be rough, enough to use as a guideline like that.

If RA doesn't have to spill its guts, it has a chance to do a better job
I think.
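
As a rough illustration of what "peak register pressure" of a basic block means here, the sketch below counts the maximum number of simultaneously live values given their live ranges. It is not how GCC's out-of-ssa pass or any RTL pass represents liveness: `LiveRange` and `peak_pressure` are hypothetical names, positions are plain instruction indices, and register classes are ignored entirely, so this is only the kind of coarse guideline number the paragraph above talks about.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A live range as a half-open interval [start, end) of instruction
// positions within one basic block.
typedef std::pair<int, int> LiveRange;

// Peak number of simultaneously live values in the block.
int peak_pressure(const std::vector<LiveRange>& ranges) {
    // One event per range boundary: (position, +1) opens, (position, -1) closes.
    std::vector<std::pair<int, int> > events;
    for (std::size_t i = 0; i < ranges.size(); ++i) {
        events.push_back(std::make_pair(ranges[i].first, +1));
        events.push_back(std::make_pair(ranges[i].second, -1));
    }
    // Pairs sort by position, then by delta, so a close at position p is
    // processed before an open at p: touching ranges do not overlap.
    std::sort(events.begin(), events.end());

    int live = 0, peak = 0;
    for (std::size_t i = 0; i < events.size(); ++i) {
        live += events[i].second;
        peak = std::max(peak, live);
    }
    return peak;
}
```

Comparing such a peak against the number of hardware registers (16 in the example above) gives the "unmanageable versus manageable" distinction; splitting or rematerializing a long range shortens its interval and lowers the peak.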

 
>   scheduling and CSE.  Although insn scheduling has heuristic not to
>   increase register pressure, it has very small priority (third or
>   fourth).  Therefore insn scheduling can increase the pressure a lot

Sure, but it won't increase it from 25 back up to 140, so there should
still be benefit.

>   Actually what Andrew proposes (and more) I did two years ago on RTL
> level close to the register allocator (see gcc summit article
> "fighting register pressure in gcc").  The result was not satisfactory
> for me and I moved on rewriting the register allocator.  Probably, I
> should have committed more what I've done into the mainline.
> 

I think it's hard to do what I am going to do at the RTL level. I have
all the information from tree-ssa available to make quite a few
interesting decisions, and with a rewritten expand, decisions about
whether things are register or memory based can be made more fine-grained.
It just seems like the last good place to do some of this work. And
perhaps we can get better instructions selected by seeing more than
expand currently sees.
 
>   Probably what Andrew proposes can be done faster on tree-SSA
> although doing it on RTL we would have more accurate information.  In
> any case it will improve code in some cases and can be used as a
> temporary solution (until new register allocator projects will be done
> or forever if they failed).  Andrew's proposal has a sense too with
> the code reuse point of view if he wants to move on with RABLE
> project.

We'll see if that ever happens :-) I'm hoping RABLET makes it less
urgent. If not, then it's time to consider something different; perhaps
some of the key individual components such as forced instruction selection
can be done, or maybe YARA will be a success and you'll take care of it!


Andrew



Re: RFC: __cxa_atexit for mingw32

2006-06-24 Thread Ranjit Mathew

Danny Smith wrote:
> Adding a real __cxa_atexit to mingw runtime is of course also possible,
> but I thought I'd attempt the easy options first.

When you say "runtime", do you mean libstdc++ or something
like libmingwex.a in "mingw-runtime"? If you mean the former,
you can add this in for GCC 4.2 and work on a real __cxa_atexit()
for GCC 4.3, if you want.

Thanks,
Ranjit.

--
Ranjit Mathew   Email: rmathew AT gmail DOT com
Bangalore, INDIA. Web: http://rmathew.com/






Re: Project RABLET

2006-06-24 Thread Vladimir N. Makarov

Andrew MacLeod wrote:




> > o register pressure relief through live range splitting and/or
> >   rematerialization.  We have no accurate information here, because
> >   after that there are passes which change the pressure like insn
>
> Sure, Im not suggesting that RABLET will reduce the register pressure to
> something that isn't going to spill. Far from it. I am saying that
> RABLET can reduce something completely unmanageable to something more
> manageable. instead of handing the RTL passes a basic block that
> contains a peak register pressure of 120 when there are 16 hardware
> registers, perhaps it will be a basic block that has been reduced down
> to a peak of 25 or something. The calculations at the tree level are
> only going to be rough, enough to use as a guideline like that.

 


 Without information about the final register allocator decisions,
partial register pressure reduction through rematerialization does
not work in many cases.  For example, when rematerializing

a <- b + c

while reducing the pressure from 100 to 50 on x86, there is a big
chance that b and c will not be placed in hard registers.  Instead of
one load (of a), two loads (of b and c) will be needed.  The resulting
code is even worse than before the pressure was reduced.

 So rematerialization in the out-of-ssa pass will work well only for
full pressure relief (down to a level equal to the number of hard
registers) or close to full relief.
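
The load-counting argument above can be written down as a toy cost model. This is only a sketch of the reasoning, with made-up names (`Operands`, `loads_per_use_if_spilled`, and so on); it counts memory loads only, ignores the cost of the add itself, and assumes a single use of `a`.

```cpp
#include <cassert>

// Where the operands of a <- b + c ended up after register allocation.
struct Operands {
    bool b_in_hard_reg;
    bool c_in_hard_reg;
};

// Spilling a: each use reloads a from its stack slot.
int loads_per_use_if_spilled() { return 1; }

// Rematerializing a: each use recomputes b + c, loading every operand
// that did not get a hard register.
int loads_per_use_if_rematerialized(const Operands& ops) {
    return (ops.b_in_hard_reg ? 0 : 1) + (ops.c_in_hard_reg ? 0 : 1);
}

// Rematerialization only wins (in loads) when it needs strictly fewer
// loads than reloading the spilled value.
bool rematerialization_pays_off(const Operands& ops) {
    return loads_per_use_if_rematerialized(ops) < loads_per_use_if_spilled();
}
```

With neither operand in a hard register the rematerialized form needs two loads against one for the spill, which is the "even worse than before" case; it only pays off when both operands are known to sit in hard registers, and that is exactly the information that is missing before register allocation.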

 But even if you can reduce register pressure to the level of the
number of hard registers, it is hard to know whether you have achieved
full register pressure relief, because you cannot be sure what
register class will be used (e.g. AREG or GENERAL_REGS for x86).
It can work, though, for architectures with big regular register
files (e.g. classic RISC processors).

The SSA pressure relief through rematerialization described in
Simpson's thesis is oriented toward such architectures (with a big
regular register file, of size 32 as I remember).  So it can work for
ppc but will be less successful for the platforms of major interest,
x86 and x86_64.