Re: C++ ABI mismatch crashes

2005-04-18 Thread Marcin Dalecki
On 2005-04-18, at 04:22, Dan Kegel wrote:
> Once the gcc C++ ABI stabilizes,
> i.e. once all the remaining C++ ABI compliance bugs have
> been flushed out of gcc, this requirement can be relaxed."
"Thus in esp. on Judgment Day we will relax this requirement".
The changes in CPU instruction sets surpass the presumed ABI
stabilization in C++. And not only due to gcc "bugs".


Re: function name lookup within templates in gcc 4.1

2005-04-18 Thread Marcin Dalecki
On 2005-04-18, at 04:37, Gareth Pearce wrote:
> So I just started trying out gcc 4.1 - with a program which compiles and
> runs fine on gcc 3.3.
>
> Attached is a reduced testcase which shows a runtime segfault due to stack
> overflow if compiled with 4.1 but does not with 3.3.  Trivial workaround is
> to move the specific declaration above the template definition.  Now I see
> potential for this to be 'the way the standard wants it to be', but given I
> don't have a copy of the standard I am unsure.

The type of 1 has nothing to do with sstring_t. Thus the 4.1 behavior
is correct.



Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Steven Bosscher
On Apr 18, 2005 07:41 AM, Roger Sayle <[EMAIL PROTECTED]> wrote:

> 
> On Sat, 16 Apr 2005, Richard Kenner wrote:
> > Although, RTL expansion may introduce new loops, these tend to be
> > rare, and the expanders have all the information they need to
> > hoist/sink invariant expressions and unroll/peel themselves.
> >
> > I disagree.  In order to make the proper decisions about merging givs
> > and choosing which giv should represent a biv, you have to know a lot
> > about the valid addressing modes on the machine and this isn't something
> > the tree level optimizers should have to deal with.
> > ...
> >
> > Similarly, CSE shouldn't need to process more than a single basic
> > block,
> >
> > Again, not clear.  Certainly the costly stuff I put in ages ago to
> > walk through comparisons and around loops needs to go, but there's
> > no reason to tie CSE to a basic block: it can operate until the next
> > label, like it does now.  Admittedly, the number of CSE opportunities
> > won't be great, but why restrict them to a basic block?
> >
> > and GCSE shouldn't need to move anything other than simple
> > expressions.
> >
> > Why would we need a GCSE at the RTL level at all?  I'd guess the number
> > of wins it would produce would be very small.
> >
> > The quality of alias analysis at the RTL-level shouldn't be an issue.
> >
> > Here I disagree the strongest!  Instruction scheduling is rapidly
> > becoming one of the most critical optimizations and must be done at
> > the RTL level.  The quality of instruction scheduling depends quite
> > heavily on the quality of the aliasing information.
> 
> 
> Thanks for your feedback.  I agree with all your points.

But of course you do.  Unfortunately you appear to have little clue what you
are really talking about.  So let me provide you with some loud feedback as
well.


>  I had
> greatly underestimated the importance of RTL alias analysis, especially
> with respect to scheduling.

Which does not mean RTL alias analysis is important.  It just means that
having alias information available in RTL is important.  How you get that 
information there is a separate issue.

You just can not do much better for alias analysis on RTL.  There is almost
no array information, almost no type information, it is almost impossible to
do  points-to analysis, and it is definitely impossible to do flow sensitive
alias analysis.  Also, many interesting aliases GCC can not disambiguate now
are interprocedural ones.

The best approach seems to be to preserve the alias analysis done at the
tree level to assist the current RTL alias analysis in disambiguating memory
references.  Examples of how this can be done exist in the tree-profiling
branch.

You should also read the GCC summit paper from two years ago about Debray
alias analysis (oh, horror!), and then compare the results (both accuracy and
complexity) of that to the ideas for propagating tree alias info to RTL.


>  The compile-time hog issues with CSE are
> primarily due to deep path-following (AFAIU).

Path following, and running four times (worst case) at -O2.


>  Simple/cheap tricks such as
> extended basic blocks are clearly a win, but their benefit depends
> on whether we keep GCSE. If we don't then EBBs of course, if we do, the
> EBBs are probably subsumed by GCSE.

You seem to have a misguided view on what CSE does.  What we call CSE is not
a common subexpression elimination pass.  It just happens to catch some CSEs
as well, but the most important thing it does right now is propagating things
forward to select better addressing modes (essentially a kind of instruction
selection), and limited folding/simplifying to clean up the sometimes terrible
things 'expand' can do.

Also, CSE does things across basic block boundaries that GCSE will _never_
be able to do, because GCSE is a PRE implementation, so it can't do expression
simplifications while eliminating redundancies, and it performs code motion
in addition to redundancy elimination. CSE uses value numbering and can only
remove fully redundant expressions.
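
To make the distinction concrete, here is a small source-level sketch in C.
It is an illustration only: the function names are made up, and the real
passes work on RTL, not on C source.  In the first function a * b is fully
redundant, which is the case a value-numbering pass can handle.  In the
second it is redundant on only one path, so removing it needs the code
motion that a PRE implementation performs.  The third function shows by
hand what such a transformation could produce.

```c
#include <assert.h>

/* Hypothetical helper names; nothing here is GCC source code. */

/* Fully redundant: a * b has been computed on every path that reaches
   its second use, so value numbering can reuse the first result. */
int fully_redundant(int a, int b)
{
    int x = a * b;
    int y = a * b;      /* same value as x */
    return x + y;
}

/* Partially redundant: a * b is already available only when flag is
   nonzero.  Removing the second multiply requires inserting one on the
   other path, i.e. code motion. */
int partially_redundant(int a, int b, int flag)
{
    int x = 0;
    if (flag)
        x = a * b;
    int y = a * b;      /* redundant only on the flag != 0 path */
    return x + y;
}

/* Hand-written result of a PRE-style transformation of the function above. */
int after_pre(int a, int b, int flag)
{
    int t, x;
    if (flag) {
        t = a * b;
        x = t;
    } else {
        t = a * b;      /* inserted so that t is available on both paths */
        x = 0;
    }
    return x + t;
}
```

The last two functions return the same values; the transformed one executes
exactly one multiply on each path instead of two on the flag != 0 path.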

(Kenner mentioned CSE around loops, but that is already gone.)


> And it's very true that at the RTL level
> addressing modes have always been an Achilles' heel.  But whilst
> induction variable selection is very dependent upon target parameters,
> as is prefetching, but it's not yet clear whether uglifying tree-ssa or

I agree it is not clear yet, but at least we are now trying both approaches.
It looks like the TARGET_MEM_REF patch _does_ work, while clearly RTL
addressing mode selection does not (there are many open PRs about that).

It is interesting that you call Zdenek's work "uglifying". I agree that some
of the details of the machine-dependent parts of the tree optimizers are not
the most elegant.  But the reason for that is more just a complete lack of 
target modelling for trees.  If you run things like IVopts and TARGET_MEM_REF
late enough, it looks lik

Re: GCC 4.0 RC1 Available

2005-04-18 Thread Andrew Haley
Geoffrey Keating writes:
 > Andrew Haley <[EMAIL PROTECTED]> writes:
 > 
 > > Ranjit Mathew writes:
 > >  > Geoffrey Keating wrote:
 > >  > [...]
 > >  > > which I see you've already committed a patch for, and a large number
 > >  > > of Java failures.
 > >  > > 
 > >  > > You can see full test results at
 > >  > [...]
 > >  > > 
 > >  > > 
 > >  > > 
 > >  > > for 4.0.0-20050410.
 > >  > 
 > >  > It might be helpful to put your "libjava.log" somewhere
 > >  > or if all the Java failures seem similar, to post
 > >  > the error messages around the "FAIL" lines from your
 > >  > libjava.log.
 > 
 > > # of unexpected failures        5
 > 
 > > powerpc-apple-darwin7.8.0
 > > dejagnu HEAD
 > > 
 > > So, this is unrepro, as far as I can see.
 > 
 > It seems like this could be the CLASSPATH issue; did you have gcj
 > installed on your system?  I didn't.

I don't think it was installed.  Unfortunately, the machine isn't
online at the moment, so I can't check.  Anyway, Eric Botcazou seems
to think the problem is fixed on Solaris, so I hope the problem is
fixed on Darwin too.

Andrew.


Re: Processor-specific code

2005-04-18 Thread Joseph S. Myers
On Sun, 17 Apr 2005, Geoffrey Keating wrote:

> > I thought we acted like it is "off", allowing CSE and constant folding 
> > which might be affected by changes in rounding mode.  Certainly some of 
> > Stephen Moshier's testcases (attached to bug 20785) fail.
> 
> The flag that controls this is -ftrapping-math, and it defaults to "on".

I was thinking of -frounding-math, which defaults to "off" and comes with 
a warning in the documentation that it may not yet do everything required.

-- 
Joseph S. Myers   http://www.srcf.ucam.org/~jsm28/gcc/
[EMAIL PROTECTED] (personal mail)
[EMAIL PROTECTED] (CodeSourcery mail)
[EMAIL PROTECTED] (Bugzilla assignments and CCs)


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Richard Kenner
> I take it from your comments, that you are in the camp that believes
> that "the sun has not yet set" on the need for RTL optimizers. :-)

I'm actually in the camp that "the sun will never set" on the need for
some RTL optimizers.  We'll be able to remove some of the most costly
of them and the remaining ones will do less work, but I can't see
eliminating the bulk of them (combine, CSE, loop, etc).

The strength of GCC has always been its ability to perform optimization
at all levels and I think that will always continue to be the case.


i386 stack slot optimisation

2005-04-18 Thread Øyvind Harboe
How does the i386 backend optimise the stack slot assignment to minimize
the displacement offset?

What code should I look at?

Or is there some other optimisation at work here...?

I.e.:

; -O0 => large offset
leal    8268(%esp), %eax
incl    (%eax)

; -O3 => small offset
incl    40(%esp)
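
For context on why the offset size matters: on i386 a memory operand encodes
its displacement either as one sign-extended byte (offsets from -128 to 127)
or as four bytes, so 40(%esp) is encoded more compactly than 8268(%esp).
A minimal sketch of that rule follows, with a made-up helper name and
assuming an %esp-based operand with no index register:

```c
#include <assert.h>

/* Bytes of displacement in an i386 memory operand based on %esp.
   Hypothetical helper for illustration; not part of GCC. */
int esp_disp_bytes(long offset)
{
    if (offset == 0)
        return 0;               /* (%esp): no displacement needed */
    if (offset >= -128 && offset <= 127)
        return 1;               /* disp8, sign-extended */
    return 4;                   /* disp32 */
}
```

With this rule the -O3 operand at offset 40 needs one displacement byte,
while the -O0 operand at offset 8268 needs four.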



The source for a test case + the output are attached

gcc-4.0 -S stackframe.c -fomit-frame-pointer -O0 -o stackframe-O0.s
gcc-4.0 -S stackframe.c -fomit-frame-pointer -O3 -o stackframe-O3.s


This thread has a stack slot assignment optimisation patch that has
never been committed to GCC CVS, but the above indicates that there is
some sort of mechanism in GCC already to mitigate this problem...

http://gcc.gnu.org/ml/gcc-patches/2003-01/msg00019.html




-- 
Øyvind Harboe
http://www.zylin.com

int bar(int a);
int test1(int *);

int foo(int a, int b, int c, int d)
{
  int abc[1024];
  int j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z;
  int def[1024];
  for (j=0; j

.file   "stackframe.c"
.text
.globl foo
.type   foo, @function
foo:
subl    $8284, %esp
movl    $0, 8216(%esp)
jmp     .L2
.L3:
subl    $12, %esp
leal    4132(%esp), %eax
pushl   %eax
call    test1
addl    $16, %esp
movl    $0, 8220(%esp)
jmp     .L4
.L5:
movl    $0, 8224(%esp)
jmp     .L6
.L7:
movl    $0, 8228(%esp)
jmp     .L8
.L9:
subl    $12, %esp
leal    36(%esp), %eax
pushl   %eax
call    test1
addl    $16, %esp
movl    $0, 8228(%esp)
jmp     .L10
.L11:
movl    $0, 8236(%esp)
jmp     .L12
.L13:
movl    $0, 8240(%esp)
jmp     .L14
.L15:
movl    $0, 8248(%esp)
jmp     .L16
.L17:
movl    $0, 8252(%esp)
jmp     .L18
.L19:
movl    $0, 8256(%esp)
jmp     .L20
.L21:
movl    $0, 8260(%esp)
jmp     .L22
.L23:
movl    $0, 8264(%esp)
jmp     .L24
.L25:
movl    $0, 8268(%esp)
jmp     .L26
.L27:
leal    8268(%esp), %eax
incl    (%eax)
.L26:
subl    $12, %esp
pushl   8280(%esp)
call    bar
addl    $16, %esp
cmpl    8268(%esp), %eax
jg      .L27
leal    8264(%esp), %eax
incl    (%eax)
.L24:
subl    $12, %esp
pushl   8276(%esp)
call    bar
addl    $16, %esp
cmpl    8264(%esp), %eax
jg      .L25
leal    8260(%esp), %eax
incl    (%eax)
.L22:
subl    $12, %esp
pushl   8272(%esp)
call    bar
addl    $16, %esp
cmpl    8260(%esp), %eax
jg      .L23
leal    8256(%esp), %eax
incl    (%eax)
.L20:
subl    $12, %esp
pushl   8268(%esp)
call    bar
addl    $16, %esp
cmpl    8256(%esp), %eax
jg      .L21
leal    8252(%esp), %eax
incl    (%eax)
.L18:
subl    $12, %esp
pushl   8264(%esp)
call    bar
addl    $16, %esp
cmpl    8252(%esp), %eax
jg      .L19
leal    8248(%esp), %eax
incl    (%eax)
.L16:
subl    $12, %esp
pushl   8260(%esp)
call    bar
addl    $16, %esp
cmpl    8248(%esp), %eax
jg      .L17
leal    8240(%esp), %eax
incl    (%eax)
.L14:
subl    $12, %esp
pushl   8252(%esp)
call    bar
addl    $16, %esp
cmpl    8240(%esp), %eax
jg      .L15
leal    8236(%esp), %eax
incl    (%eax)
.L12:
subl    $12, %esp
pushl   8248(%esp)
call    bar
addl    $16, %esp
cmpl    8236(%esp), %eax
jg      .L13
leal    8232(%esp), %eax
incl    (%eax)
.L10:
subl    $12, %esp
pushl   8244(%esp)
call    bar
addl    $16, %esp
cmpl    8232(%esp), %eax
jg      .L11
leal    8228(%esp), %eax
incl    (%eax)
.L8:
subl    $12, %esp
pushl   8240(%esp)
call    bar
addl    $16, %esp
cmpl    8228(%esp), %eax
jg      .L9
leal    8224(%esp), %eax
incl    (%eax)
.L6:
subl    $12, %esp
pushl   8236(%esp)
call    bar
addl    $16, %esp
cmpl    8224(%esp), %eax
jg      .L7
leal    8220(%esp), %eax
incl    (%eax)
.L4:
subl    $12, %esp
pushl   8232(%esp)
call    bar
addl    $16, %esp
cmpl    8220(%esp), %eax
jg      .L5
leal    8216(%esp), %eax
incl    (%eax)
.L2:
subl    $12, %esp
pushl   8228(%esp)
call    bar
addl    $16, %esp
cmpl    8216(%esp), %eax
jg      .L3
addl    $8284, %esp
ret
.size   foo, .-foo
.ident  "GCC: (GNU) 4.0.0 20050410 (prere

Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Richard Kenner
> Unfortunately you appear to have little clue what you are really
> talking about.  So let me provide you with some loud feedback as well.

Please try to keep this discussion on a civil level!

> >  I had greatly underestimated the importance of RTL alias analysis,
> >  especially with respect to scheduling.
>
> Which does not mean RTL alias analysis is important.  It just means
> that having alias information available in RTL is important.  How you
> get that information there is a separate issue.

I think Roger simply mis-spoke because in his original message, he
said what you said: the important issue is having the alias
information available in RTL.  Much (but not all: e.g., SUBREG info) of
that information is best imported down from the tree level.

> You seem to have a misguided view on what CSE does.  What we call CSE
> is not a common subexpression elimination pass.  It just happens to
> catch some CSEs as well, but the most important thing it does right
> now is propagating things forward to select better addressing modes
> (essentially a kind of instruction selection), and limited
> folding/simplifying to clean up the sometimes terrible things 'expand'
> can do.

This is a very inaccurate characterization of CSE.  Yes, it does those
things, but eliminating common subexpressions is indeed the major task
it performs.

> (Kenner mentioned CSE around loops, but that is already gone.)

Sorry. But that is the cheaper of the two kludges (and I'm allowed to use
that word since I wrote them).

> I agree that some of the details of the machine-dependent parts of the
> tree optimizers are not the most elegant.

I think there's a serious conceptual issue in making the tree level too
machine-dependent.  The *whole point* of doing tree-level optimizations
is to do machine-*independent* optimizations.  Trees are machine-independent
and RTL is machine-dependent.  If we go too far away from that, I think
we miss the point.

> Besides, the RTL optimizers are not exactly a part of GCC to be proud
> of if "ugliness" is a measure.

Really?

> Of course GCC will always need a low-level IR.  But, combine is
> instruction selection in the worst possible way;

It served GCC well for decades, so I hardly think that's a fair statement.

> reload is register allocation in the worst possible way,

Reload is not supposed to do register allocation.  To the extent that
it does, I agree with you.  But what this has to do with the issue of
tree vs. RTL optimization is something I don't follow.  Surely you
aren't suggesting doing register allocation at the tree level?


libraries - double set

2005-04-18 Thread Ray Holme
After encountering problems with 3.4.3 of gcc (it did not compile a
package I really needed to have - yes yes I am sure it is right and
better, BUT ...), I went back to 3.3.3 for a while. I just noticed that
there are two copies of the libraries installed by the install script on my
machine (one in /opt/local/lib == /usr/local/lib and one in
/opt/lib/sparcv9/ == /usr/local/lib/sparcv9). They must have been installed
by the same install script, as their disk timestamps are only about 1 minute
apart, but they differ in size by quite a bit. (I did a little
more research and found lots of such things.) Since some of these things
are quite large -

 What is the purpose of having two such identically named libraries?
  Or alternatively - which is the real one that I should be using?

The ones in /opt/lib are always larger! 
 And /opt/lib is the one mostly used for linking.
  SO, can I (should I) blow away the sparcv9 directory?

Confused,

Ray Holme









Re: libraries - double set

2005-04-18 Thread Eric Botcazou
>  What is the purpose of having two such identically named libraries?

To support 2 architectures, 32-bit (sparcv7) and 64-bit (sparcv9).

>   Or alternatively - which is the real one that I should be using?

Both, but the compiler automatically picks up the right one, depending on 
whether you compile with -m32 or -m64.
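
A quick way to check which of the two sets a given build will link against
is to look at the pointer width of the compiled program.  A minimal sketch
(the helper name is made up): built with -m32 it should report 32, and
built with -m64 it should report 64.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Width of a pointer, in bits, for the ABI this file was compiled for. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```

Printing pointer_bits() from a small main() built once with each flag shows
which ABI, and therefore which library directory, that build uses.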

-- 
Eric Botcazou


Re: C++ ABI mismatch crashes

2005-04-18 Thread Mike Hearn
On Sun, 2005-04-17 at 19:22 -0700, Dan Kegel wrote:
> But I can't shake the feeling that it's crazy that libaspell
> got linked against two different C++ libraries.  Can you
> try creating a minimal test case demonstrating this
> without involving inkscape?  If so, maybe it's a glibc
> shared library loader bug?

This thread got moved to the libstdc++ list. I have created a minimal
test case which also shows a misbind of the std::string + op overload:

  http://gcc.gnu.org/ml/libstdc++/2005-04/msg00177.html

I'm afraid any solution that involves statically linking (or changing)
libraries like aspell won't work for me. I need to find a fix that can
be deployed in the field.

I am currently trying to bring myself up to speed on how all this works,
but the C++ ABI is pretty complicated. I'm still trying to figure out
exactly what COMDAT sections are, and how template instantiation works
at the ELF level. Joe suggested that it seems to be related to this. 

I thought (but I must be wrong) that COMDAT sections and weak symbols
were only put into .o files: once the .o files are combined into the
final binary or shared library, the compile time link editor merges the
definitions into a local symbol. I also still don't understand why the +
op overload is showing up in the symbol table at all. I thought methods
defined in headers would just be fancily copy/pasted into wherever they
were used.
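
As a rough illustration of the weak-symbol half of this (an analogy in C
using the GCC weak attribute, not what the C++ front end literally emits):
an inline function or template instantiation that is not inlined at some
call site still needs an out-of-line copy, and that copy is emitted as a
weak definition so that duplicates from different objects can be merged.
The names below are made up.

```c
#include <assert.h>

/* GCC extension: a weak definition.  If another object file in the link
   provides an ordinary (strong) definition of default_limit, the linker
   uses that one and drops this copy without a duplicate-symbol error. */
__attribute__((weak)) int default_limit(void)
{
    return 16;
}

/* Ordinary caller; which definition it reaches is decided at link or
   load time, not at compile time. */
int twice_limit(void)
{
    return 2 * default_limit();
}
```

Across shared libraries the dynamic linker binds every reference to the
first definition it finds in its search order, which is why such symbols
stay visible in the dynamic symbol table instead of becoming local.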

thanks -mike



front-end tools for preprocessor / macro expansion

2005-04-18 Thread Henrik Sorensen
For the PL/I front-end project (pl1gcc.sourceforge.net), I am just about to
begin adding a preprocessor expansion step, and was wondering what other
front ends do.

My initial thought was to create a completely separate program that just does
the preprocessing and passes the output to the compiler.

Some background info regarding the PL/I preprocessor.

The PL/I preprocessor language is more of a real text preprocessor than anything
else; e.g., it is possible to define functions and to declare variables of
either numeric or character data type, and the control statements include
for-loops, if-then constructs and goto statements.


Any thoughts or hints are much appreciated.

Henrik


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Steven Bosscher
On Apr 18, 2005 02:51 PM, Richard Kenner <[EMAIL PROTECTED]> wrote:

> > Unfortunately you appear to have little clue what you are really
> > talking about.  So let me provide you with some loud feedback as well.
> 
> Please try to keep this discussion on a civil level!

I am (for a change, maybe) not the one who started making the
discussion uncivil.  And what I wrote is not untrue either. Roger
didn't exactly display knowledge of what he was talking about, see
the many false statements he made about the RTL optimizers.


> > You seem to have a misguided view on what CSE does.  What we call CSE
> > is not a common subexpression elimination pass.  It just happens to
> > catch some CSEs as well, but the most important thing it does right
> > now is propagating things forward to select better addressing modes
> > (essentially a kind of instruction selection), and limited
> > folding/simplifying to clean up the sometimes terrible things 'expand'
> > can do.
> 
> This is a very inaccurate characterization of CSE.  Yes, it does those
> things, but eliminating common subexpressions is indeed the major task
> it performs.

Only on the first pass.  I have looked at how many actual common
subexpressions it eliminates in the two passes following GCSE, and
it is really almost nothing.  CSE2 only catches a significant number
if flag_loop_optimize2 and/or flag_tracer are set. CSE1 does catch
some common subexpressions, most of those appear to come from
expanding a tree expression to multiple RTL expressions (e.g. two
tree MULT_EXPRs for which expand_* create a cse).

It is undisputed that CSE _can_ remove common subexpressions.  It
really just does not appear to do that very often in practice.

It is unfortunately hard to tell exactly what CSE does, because it
is doing so many things and it does not report the replacements it
makes.  When I looked at this, I simply disabled various parts of
CSE to see what the effect would be on the resulting code.  I also
disabled CSE1 completely and replaced it with a cselib based pass,
which didn't catch many cses and missed many of the other things
CSE does (interestingly, the effect was worst for -fPIC, but I have
not yet figured out why).


> > (Kenner mentioned CSE around loops, but that is already gone.)
> 
> Sorry. But that is the cheaper of the two kludges (and I'm allowed to use
> that word since I wrote them).

:-)  I didn't know you wrote that.  It certainly wasn't a really
problematic piece of code, I think.  When I removed it, the plan was
to make CSE work on extended basic blocks.  But it turned out that
CSE around basic blocks (-fcse-skip-blocks) was still a very useful
thing to do (and it still was, when I looked at it again a couple of
weeks ago).


> > I agree that some of the details of the machine-dependent parts of the
> > tree optimizers are not the most elegant.  
> 
> I think there's a serious conceptual issue in making the tree level too
> machine-dependent.

Agreed with "too machine-dependent".  But what is that, exactly?  No
machine dependence at all?  Some, and if so, what?  I'm curious what
you think about this, I have no idea yet, really.  I tried a simple
lowering pass last year, but it didn't work out very well.  I didn't
try very hard either, though.

>  The *whole point* of doing tree-level optimizations
> is to do machine-*independent* optimizations.  Trees are machine-independent
> and RTL is machine-dependent.  If we go too far away from that, I think
> we miss the point.

Some machine-dependence would not be bad, I think.  Sure, you would
never want to explicitly expose all the details about the target.
But a machine specific lowering pass, for example, would
probably be a good thing.  Just to expose more expressions to the
tree optimizers, so they can do more work and the RTL optimizers do
not have to do that much work. Many other compilers do this, also.
I know that has never been a convincing argument on this list, but
it is still a sign that it is not all that insane to do, at least.


> > Besides, the RTL optimizers are not exactly a part of GCC to be proud
> > of if "ugliness" is a measure.
> 
> Really?

Well, yes.  I mean no offense to the people who wrote RTL optimizers
in the past, when many pieces of infrastructure we take for granted
now were simply not available.  But many passes could certainly use
a good cleanup or rewrite.  But, much to the credit of their authors,
many of the existing RTL passes just _work_ quite well, which is
why it is so difficult to rewrite them :-/

If you look at e.g. how regmove handles (or really does not handle)
basic block boundaries, or at how CSE does its path following (which,
no doubt, made sense when it was written that way), or at the kludges
in sched-ebb.c (didn't write it, still use the word ;-) to tear down
and later on fix up basic block boundaries, then yes, that is ugly.

Similarly, if you see how heavily some ports rely on relo

missed mail

2005-04-18 Thread Aldy Hernandez
Hi folks.

All mail addressed to me from Apr-3 to Apr-10 was not delivered.  I was
having problems with my mail setup.  Please resend.

My apologies for reporting this so late; I've been sequestered at
customer sites with no internet for the past week after my vacation :-(.

Cheers.
Aldy


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Paolo Bonzini
> I think Roger simply mis-spoke because in his original message, he
> said what you said: the important issue is having the alias
> information available in RTL.  Much (but not all: e.g., SUBREG info) of
> that information is best imported down from the tree level.

Well, paradoxical subregs are just a mess: optimizations on paradoxical 
subregs are better served at the tree level, because it is just 
obfuscation of e.g. QImode arithmetic.

Indeed, my patch removed an optimization on paradoxical subregs, and 
kept an optimization on non-paradoxical subregs.

Take this code:

    long long a, b, c, d;
    int x;
    ...
    c = a * b;
    d = (int) x * (a * b);

In my view, tree-level optimization will catch (a * b) as a redundant 
expression.  RTL-level optimization will catch that the high-part of 
"(int) x" is zero.

Roger proposed lowering 64-bit arithmetic to 32-bit in tree-ssa!  How 
would you do it?  Take

    long long a, b, c;
    c = a + b;

Would it be

    c = ((int)a + (int)b)
        + ((int) (a >> 32) + (int) (b >> 32)
           + ((unsigned int) a < (unsigned int) b)) << 32;

Or will you introduce new tree codes and uglify tree-ssa?  Seriously...
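
For reference, a 64-bit add can be expressed with 32-bit operations roughly
as follows.  This is only a sketch in C with made-up names, using unsigned
types to avoid signed-overflow issues, and it says nothing about how such a
lowering would be represented in tree-ssa.  The carry out of the low word
is detected by checking whether the low sum wrapped around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add two 64-bit values using only 32-bit arithmetic. */
uint64_t add64_via_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint32_t lo = a_lo + b_lo;          /* wraps modulo 2^32 */
    uint32_t carry = lo < a_lo;         /* 1 if the low add wrapped */
    uint32_t hi = a_hi + b_hi + carry;  /* carry out of hi is discarded */

    return ((uint64_t)hi << 32) | lo;
}
```

Even this simple case needs two adds, a compare and the recombination, which
gives a feel for how much extra code such a lowering would expose.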

> This is a very inaccurate characterization of CSE.  Yes, it does those
> things, but eliminating common subexpressions is indeed the major task
> it performs.

It was.  Right now, the only thing that fold_rtx tries to simplify is
   (mult:SI (reg:SI 58) 8)
to
   (ashift:SI (reg:SI 58) 3)
Only to find out it is not a valid memory_operand...  I have a patch to 
completely disable calling fold_rtx recursively, only equiv_constant. 
That was meant to be part 3/n of the cleanup fold_rtx series.  I was 
prepared to take responsibility for every pessimization resulting from 
these cleanups, and I expected to be sure I'd find a better way to do 
the same thing.

A 7000-line constant propagator...

> I think there's a serious conceptual issue in making the tree level too
> machine-dependent.  The *whole point* of doing tree-level optimizations
> is to do machine-*independent* optimizations.  Trees are machine-independent
> and RTL is machine-dependent.  If we go too far away from that, I think
> we miss the point.

No, the whole point of doing tree-level optimizations is to be aware of 
high-level concepts before they are lowered.  No need to worry about 
support for QImode-size arithmetic.  No need to worry if 64-bit 
multiplication had to be lowered.

> > Besides, the RTL optimizers are not exactly a part of GCC to be proud
> > of if "ugliness" is a measure.
>
> Really?

The biggest and least readable files right now are combine.c, reload.c,
reload1.c.  cse.c is big (though not extreme) but unreadable.

OTOH, stuff like simplify-rtx.c or especially fold-const.c is big but 
readable.

> > Of course GCC will always need a low-level IR.  But, combine is
> > instruction selection in the worst possible way;
>
> It served GCC well for decades, so I hardly think that's a fair statement.

Never heard about dynamic programming?

> > reload is register allocation in the worst possible way,
>
> Reload is not supposed to do register allocation.  To the extent that
> it does, I agree with you.  But what this has to do with the issue of
> tree vs. RTL optimization is something I don't follow.  Surely you
> aren't suggesting doing register allocation at the tree level?

No, he's suggesting cleaning up stuff, so that it is easier to stop 
doing things in the worst possible way.  He's suggesting to be realistic 
once code has run completely out of control.

Luckily some GWP people do care about cleaning up.  Richard Henderson 
did a lot of work on cleaning up RTL things left from olden times (think 
eh, nested functions, addressof, save_expr,...), Zack did some work on 
this ground in the past as well, Bernd is maybe the only guy who could 
pursue something such as reload-branch...

I hate to make "clubs" out of a community, but it looks like only some 
people care about the state of the code...  Steven has done most of the
work for removing the define_function_unit processor descriptions.  I 
removed ~5000 lines of code after tree-ssa went in (including awful 
stuff such as protect_from_queue, which made sense maybe in 1990, and 
half of stmt.c).  Kazu is also in the CSE-cleanup game.  Maybe, like in
my case, it's only because I have limited time to spend on GCC and think
that cleaning up is a productive way to use this time.  But anyway I 
think it is worth the effort.

Paolo


GCC 4.0 RC2 Available

2005-04-18 Thread Mark Mitchell

RC2 is available here:

  ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/

As before, I'd very much appreciate it if people would test these bits
on primary and secondary platforms, post test results with the
contrib/test_summary script, and send me a message saying whether or
not there are any regressions, together with a pointer to the results.

I'll be updating the (now ill-named)

  http://gcc.gnu.org/wiki/Last-Minute%20Requests%20for%204.0.0

page with that information as it comes in.

Except for any previously approved but not yet applied RC2 patches,
the 4.0 branch is now frozen.  Even changes to the documentation need
my approval now -- not because I'll have anything to say, but because
I might spin the release at any moment, and I don't want to have a
situation where we somehow get half a patch.

The changes that I anticipate between now and the final release are
(a) documentation changes, (b) a patch for 20991, and (c) a possible
patch for 20973.  Other than that, I will only consider patches that
fix egregious problems, like a failure to bootstrap on a primary
platform.

For a dot-zero release, GCC 4.0 is a nice piece of work.  We'll get it
out the door in the next few days; then on to 4.0.1...

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]



Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Richard Kenner
> > Please try to keep this discussion on a civil level!
>
> I am (for a change, maybe) not the one who started making the
> discussion uncivil.

I'm sorry, but in my opinion that doesn't matter.  I don't call people
names or make personal attacks no matter what I'm responding to.

> > This is a very inaccurate characterization of CSE.  Yes, it does those
> > things, but eliminating common subexpressions is indeed the major task
> > it performs.
>
> Only on the first pass.  I have looked at how many actual common
> subexpressions it eliminates in the two passes following GCSE, and
> it is really almost nothing.

OK, but that's a completely different matter.  I thought you were talking
about the *complexity* of the passes and what the code was *intending*
to do.  It's certainly the case that each successive pass finds less to
do and that also means that with tree-ssa optimizers, the RTL optimizers
have less to do.  I see that as a *good thing*, but one that doesn't at
all address the complexity of any code.

> > (Kenner mentioned CSE around loops, but that is already gone.)
>
> It certainly wasn't a really problematic piece of code, I think.

Not in complexity, but certainly in time.

> But it turned out that CSE around basic blocks (-fcse-skip-blocks) was
> still a very useful thing to do (and it still was, when I looked at it
> again a couple of weeks ago).

And I would *very much* like to know why!  My view was always that any
global CSE at all should render it unnecessary but GCSE did not.  Now we're
doing extensive global optimization at tree level, but it's *still* needed.
That shouldn't be the case.  I think we *really* need to understand why
it's still needed as part of the issue of replacing optimizers.

> Agreed with "too machine-dependent".  But what is that, exactly?  No
> machine dependence at all?  Some, and if so, what?  I'm curious what
> you think about this, I have no idea yet, really.

I certainly don't have a precise idea, though my gut feeling is that
anything but the most simple parameterization is too much.  I think this
is one of the major issues facing GCC today.

> But a machine specific lowering pass, for example, would probably be a
> good thing.  Just to expose more expressions to the tree optimizers,
> so they can do more work and the RTL optimizers do not have to do that
> much work.

I'm not sure this requires much, if any, machine-specificity.  The major
source of expressions we lose are addressing and that's pretty
machine-independent.  The problem on machines like the x86 is that if we
go too far, combine can't fix it up because the expression will be used
in multiple places.  I think this sort of thing is also one of the major
issues we need to face.

But, much to the credit of their authors, many of the existing RTL
passes just _work_ quite well, which is why it is so difficult to
rewrite them :-/

More to the point, a lot of the complexity in the RTL passes reflects the
underlying complexity of the machines that code is being generated for,
so there's a limit as to how simple they can become no matter what IL is used.

If you look at e.g. how regmove handles (or really does not handle)
basic block boundaries, 

To be honest, regmove is one of my least favorite RTL passes ...

or at how CSE does its path following (which, no doubt, made sense
when it was written that way)

It's not that it "made sense" when it was written that way, just that
there was no other way to do it at the time.  Now, one might be
tempted to rewrite it using the CFG infrastructure, except that it
should be eliminated entirely, so that would be a waste of time.

Similarly, if you see how heavily some ports rely on reload to fix up
instructions, 

That's *always* been a mistake and one that I've been fighting for years.
This is the distinction between predicates and constraints.  On the ports
I've done, I've been very careful to keep the amount of rewriting by reload
to an absolute minimum and I know others have been equally careful.
However, the ports that date back to before GCC 2 have had to have this
retrofitted and that hasn't always been done as well as it could have been.

Combine finds the insns to combine in a rather random pick-and-try
fashion, with at most two or three instructions.  

No, exactly two or three.  It's not *random*.

And it intermixes the actual instruction (or rather, insn) selection
with a bunch of optimizations, which IMHO should be split out into a
separate pass.

Agreed.  That's been on my list for a long time and Roger and others
have come a long way there with simplify-rtx.c.


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Richard Kenner
Well, paradoxical subregs are just a mess:

Agreed, but I wasn't talking about the paradoxical case.

optimizations on paradoxical subregs are better served at the tree
level, because it is just obfuscation of e.g. QImode arithmetic.

Not clear: I think this is a more complex issue.

The biggest and least readable files right now are combine.c, reload.c,
reload1.c.  cse.c is big (though not extreme) but unreadable.

Hmm.. I'd consider combine.c quite readable.  I agree about reload, of course.

Luckily some GWP people do care about cleaning up.  Richard Henderson
did a lot of work on cleaning up RTL things left from olden times
(think eh, nested functions, addressof, save_expr,...), Zack did some
work on this ground in the past as well, Bernd is maybe the only guy
who could pursue something such as reload-branch...

Lots of us care about cleanups.  It's actually been one of my priorities
too over the years.




hot/cold vs glibc

2005-04-18 Thread Daniel Jacobowitz
Hi Caroline,

You've made this change to assemble_start_function (unidiff format):

+  last_text_section = no_section;
+  in_section = no_section;
   resolve_unique_section (decl, 0, flag_function_sections);
+
+  /* Switch to the correct text section for the start of the function.  */
+
   function_section (decl);
+  if (flag_reorder_blocks_and_partition 
+  && !hot_label_written)
+ASM_OUTPUT_LABEL (asm_out_file, hot_section_label);

Why did you need to reset in_section?  This causes an extra .text to be
emitted before every function.  It also breaks the (ugly, non-unit-at-a-time
compatible, but otherwise working) mechanism that glibc uses to generate
crti.o and crtn.o, so I can no longer build a mips64-linux toolchain using
HEAD.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


Re: Processor-specific code

2005-04-18 Thread Vincent Lefevre
On 2005-04-17 19:34:40 -0700, Brooks Moses wrote:
> Yes, the standard refers to changing the rounding mode "if the processor
> supports [it]" -- but consider what the standard means by "processor":
> "The combination of a computing system and the means by which programs
> are transformed for use on that computing system is called a processor
> in this standard." This very clearly includes the compiler, not just the
> hardware.

Cannot other languages call functions written in Fortran? If one can
do that, what if some Fortran function inherits a FPU state different
from the default one (e.g. in the rounding mode to zero)?

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / SPACES project at LORIA


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Roger Sayle

On Mon, 18 Apr 2005, Paolo Bonzini wrote:
> Roger proposed lowering 64-bit arithmetic to 32-bit in tree-ssa!  How
> would you do it?  Take
>
>  long long a, b, c;
>  c = a + b;
>
> Would it be
>
>  c = ((int)a + (int)b)
>  + ((int) (a >> 32) + (int) (b >> 32)
> + ((unsigned int) a < (unsigned int) b)) << 32;
>
> Or will you introduce new tree codes and uglify tree-ssa?
> Seriously...


I think you may have misinterpreted or over-interpreted
what I meant by performing optimizations/lowering, earlier vs.
later.  When I suggested the i386.c backend should lower DImode
operations earlier, my intention was to lower during RTL expansion
instead of where it currently happens after reload.  Whilst doing
it in tree-ssa would technically also be "doing it earlier", this
isn't really what I had in mind.

The fact that DImode moves aren't lowered, and that the high and
low parts of a DImode register can't be independently live/dead
(for register allocation) is the reason for PR17236, and why GCC
generates poorer code than Intel on something as simple as "a * b".
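
To make the high/low split concrete, here is a minimal C sketch of a
64-bit addition written as two 32-bit additions plus a carry, which is
roughly the shape a DImode add takes once it is lowered on a 32-bit
target.  The helper name add64_via_32 is made up for illustration and is
not anything in GCC.  Note that the carry is derived from the low-part
sum (lo < a_lo) rather than by comparing the two low operands.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: add two 64-bit values
   using nothing wider than 32-bit arithmetic.  */
uint64_t
add64_via_32 (uint64_t a, uint64_t b)
{
  uint32_t a_lo = (uint32_t) a, a_hi = (uint32_t) (a >> 32);
  uint32_t b_lo = (uint32_t) b, b_hi = (uint32_t) (b >> 32);

  uint32_t lo = a_lo + b_lo;      /* low word, wraps modulo 2^32 */
  uint32_t carry = lo < a_lo;     /* 1 exactly when the low add wrapped */
  uint32_t hi = a_hi + b_hi + carry;

  return ((uint64_t) hi << 32) | lo;
}
```

Once the two halves are separate values like lo and hi here, each can be
live or dead on its own, which is the property the paragraph above says
the register allocator currently lacks for DImode registers.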


Not that it would be impossible to do this with trees.  It's not
uncommon for representations to be lowered, for example high-level
trees to low-level trees.  RTL has always done this, where addressof
could be seen as a lowering pass, as can combine, reload, regstack
and the process of splitting,  where the RTL representation afterwards
is a more accurate description of the generated code than the RTL
representation before.

Roger
--



Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Daniel Berlin
   But it turned out that CSE around basic blocks (-fcse-skip-blocks) was
   still a very useful thing to do (and it still was, when I looked at it
   again a couple of weeks ago).
And I would *very much* like to know why!  My view was always that any
global CSE at all should render it unnecessary but GCSE did not.  Now we're
doing extensive global optimization at tree level, but it's *still* needed.
That shouldn't be the case.  I think we *really* need to understand why
it's still needed as part of the issue of replacing optimizers.

You seem to be confused.
We've known *why* CSE does stuff that GCSE doesn't catch for almost as 
long as we've had GCSE.

It's because CSE *doesn't just do CSE*!
It does value numbering, and a bunch of other things, which are not really 
implemented at the RTL level as separate passes, and reordering RTL 
passes/running them multiple times is not cheap or easy, like it is 
with most SSA based tree passes.

Also, the viewpoint that absolutely everything CSE currently does needs to 
be done in order to remove CSE is wrong.

The correct viewpoint is "we shouldn't remove CSE until every *profitable* 
transformation it makes is subsumed by something else".

Otherwise, you've started with the unproven assumption that every 
transformation CSE makes is profitable.

--Dan


Re: hot/cold vs glibc

2005-04-18 Thread Caroline Tice
On Apr 18, 2005, at 8:35 AM, Daniel Jacobowitz wrote:
Hi Caroline,
You've made this change to assemble_start_function (unidiff format):
+  last_text_section = no_section;
+  in_section = no_section;
   resolve_unique_section (decl, 0, flag_function_sections);
+
+  /* Switch to the correct text section for the start of the function.  */
+
   function_section (decl);
+  if (flag_reorder_blocks_and_partition
+  && !hot_label_written)
+ASM_OUTPUT_LABEL (asm_out_file, hot_section_label);

Why did you need to reset in_section?  This causes an extra .text to be
emitted before every function.  It also breaks the (ugly, non-unit-at-a-time
compatible, but otherwise working) mechanism that glibc uses to generate
crti.o and crtn.o, so I can no longer build a mips64-linux toolchain using
HEAD.


That was because, up until very recently, the call to function_section
there used in_section to partially determine which section to switch to.
In the code just above the bit in your email, there were switches to the
hot/cold sections to properly set alignments for the function and to
insert the hot/cold start labels.  Those calls to switch sections also set
the value for in_section.  I needed to blank it out again so that
function_section would go to the correct section.

I believe the current implementation of function_section no longer depends
on the value of in_section, so you could probably remove that line
without any adverse effects.

Just out of curiosity, could you be more explicit about exactly how having
an extra .text breaks the mechanism?  That worries me...

-- Caroline
[EMAIL PROTECTED]


line-map question

2005-04-18 Thread Devang Patel
From line_map comment at (libcpp/include/line-map.h)
/* Physical source file TO_FILE at line TO_LINE at column 0 is represented
   by the logical START_LOCATION.  TO_LINE+L at column C is represented by
   START_LOCATION+(L*(1<<COLUMN_BITS))+C, as long as C<(1<<COLUMN_BITS),

What happens when column number is >= 128 ? This is PR 20907.

   and the result_location is less than the next line_map's start_location.
   (The top line is line 1 and the leftmost column is column 1; line/column 0
   means "entire file/line" or "unknown line/column" or "not applicable".)
   INCLUDED_FROM is an index into the set that gives the line mapping
   at whose end the current one was included.  File(s) at the bottom
   of the include stack have this set to -1.  REASON is the reason for
   creation of this line map, SYSP is one for a system header, two for
   a C system header file that therefore needs to be extern "C"
   protected in C++, and zero otherwise.  */

Thanks,
-
Devang


Re: hot/cold vs glibc

2005-04-18 Thread Daniel Jacobowitz
On Mon, Apr 18, 2005 at 09:47:55AM -0700, Caroline Tice wrote:
> Just out of curiosity, could you be more explicit about exactly how having
> an extra .text breaks the mechanism?  That worries me...

asm(".section .init");

void _init() {
  asm("@@@ MARKER @@@");
}

Then sed is used to separate the prologue (crti.o) and epilogue
(crtn.o) into different files.

Yes, it's a hack.  It's not much different from GCC's hack in
crtstuff.c.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


Re: inline-unit-growth trouble

2005-04-18 Thread Andreas Krebbel
Hi,

thanks for your responses.

I've debugged a little further and found out that
the testcase breakage was caused by (the elfos.h part):

http://gcc.gnu.org/ml/gcc-patches/2005-04/msg00913.html

The elfos.h part of the patch was reverted on 04/14/2005:
http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01604.html

and since then everything is back working again.

As already pointed out in the discussion thread, it is probably not
the patch to blame here. It seems that the patch reveals a linker problem.

Without the patch reverted I can reproduce the following failure on
31bit (s390) as well as on 64bit (s390x) using 
binutils-2.15.92.0.2 from kernel.org:

/usr/bin/ld: Warning: size of symbol 
`__gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true>::_S_get_pool()' 
changed from 136 in ta.o to 208 in tc.o
/usr/bin/ld: Warning: size of symbol 
`__gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, 
true>::_S_destroy_thread_key(void*)' changed from 56 in ta.o to 216 in tc.o
/usr/bin/ld: Warning: size of symbol 
`__gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, true>::_S_initialize()' 
changed from 56 in ta.o to 224 in tc.o
/usr/bin/ld: Warning: size of symbol 
`__gnu_cxx::__common_pool_policy<__gnu_cxx::__pool, 
true>::_S_initialize_once()' changed from 116 in ta.o to 288 in tc.o

The warnings disappear when --param inline-unit-growth=.. is used to increase
the threshold.
But you are right that they can also be triggered using different optimization
levels. So I agree that the linker should be able to sort this out.

Without the patch the symbols above are marked as "weak" in both object
files.  With the patch both are "global".  _bfd_elf_merge_symbol in binutils
wants to see one of them to be "weak" in order to accept a different size.

Because both of the symbols are in comdat .group sections, and are therefore
explicitly supposed to be merged, the linker should accept a size change
even if they are marked "global".  This is already fixed in the current
binutils cvs.


Bye,

-Andreas-


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Richard Kenner
You seem to be confused.  We've known *why* CSE does stuff that GCSE
doesn't catch for almost as long as we've had GCSE.

It's because CSE *doesn't just do CSE*!  It does value numbering, and
a bunch of other things, which are not really implemented at the RTL
level as separate passes, 

Well, sure, but most of the benefit of running those is within a basic
block.  I don't see why the combination of a global just-CSE with the
current intra-block code wouldn't be effective.

Also, the viewpoint that absolutely everything CSE currently does
needs to be done in order to remove CSE is wrong.

I'm not talking about removing CSE.  Indeed, the part of CSE that
chooses the best operand from a cost point of view likely needs to stay
forever.  I was just talking about removing the following of jumps.

The correct viewpoint is "we shouldn't remove CSE until every
*profitable* transformation it makes is subsumed by something else".

And, as I understand it, the claim is that this is not yet true for the
following of jumps and my question is why.


Can I comment out a GTY variable?

2005-04-18 Thread H. J. Lu
I am trying to comment out

static GTY (()) int foo = 0;

with

#if 0
static GTY (()) int foo = 0;
#endif

But I got an error saying something like

./gth:44: error: foo undeclared here (not in a function)

Is that expected? How can I comment it out?


H.J.



Re: Can I comment out a GTY variable?

2005-04-18 Thread Andrew Pinski
On Apr 18, 2005, at 2:11 PM, H. J. Lu wrote:
I am trying to comment out
static GTY (()) int foo = 0;
with
#if 0
static GTY (()) int foo = 0;
#endif
But I got an error saying something like
./gth:44: error: foo undeclared here (not in a function)
Is that expected? How can I comment it out?
Yes, gengtype does not read through preprocessor directives.
Comment it out like a normal comment and not using preprocessor
directives.
Thanks,
Andrew Pinski


i386 stack slot optimisation

2005-04-18 Thread Øyvind Harboe
Answer: FRAME_GROWS_DOWNWARD.

The stack slots for the registers spilled on the
stack are allocated last. When the frame grows downward,
the displacement is smaller than if the frame grows upward.

Thanks.

-- 
Øyvind Harboe
http://www.zylin.com



Unnecessary sign- and zero-extensions in GCC?

2005-04-18 Thread Nicholas Nethercote
Hi,
I've been looking at GCC's use of sign-extensions when dealing with 
integers smaller than a machine word size.  It looks like there is room 
for improvement.

Consider this C function:
short g(short x)
{
   short i;
   for (i = 0; i < 10; i++) {
      x += i;
   }
   return x;
}
On x86, using a GCC 4.0.0 20050130, with -O2 I get this code:
g:
        pushl   %ebp
        xorl    %edx, %edx
        movl    %esp, %ebp
        movswl  8(%ebp),%ecx
        .p2align 4,,15
.L2:
        leal    (%ecx,%edx), %eax
        movswl  %ax,%ecx                # 1
        leal    1(%edx), %eax
        movzwl  %ax, %eax               # 2
        cmpw    $10, %ax
        movswl  %ax,%edx                # 3
        jne     .L2
        popl    %ebp
        movl    %ecx, %eax
        ret
        .size   g, .-g
        .p2align 4,,15
The three extensions (#1, #2, #3) here are unnecessarily conservative. 
This would be better:

g:
        pushl   %ebp
        xorl    %edx, %edx
        movl    %esp, %ebp
        movswl  8(%ebp),%ecx
        .p2align 4,,15
.L2:
        leal    (%ecx,%edx), %ecx       # x += i
        leal    1(%edx), %edx           # i++
        cmpw    $10, %dx                # i < 10 ?
        jne     .L2
        popl    %ebp
        movswl  %cx, %eax
        ret
GCC's approach seems to be *eager*, in that sign-extensions are done 
immediately after sub-word operations.  This ensures that the high bits of 
a register holding a sub-word value are always valid.

An alternative is to allow the high bits of registers holding sub-word 
values to be "junk", and do sign-extensions *lazily*, only before 
operations in which any "junk" high bits could adversely affect the 
result.  For example, if you do a right shift on a value with "junk" high 
bits you have to sign/zero-extend it first, because high bits in the 
operands can affect low bits in the result.  The same is true of division.

In contrast, an addition of two 16-bit values with "junk" high bits is ok 
if the result is also a 16-bit value.  The same is true of subtraction, 
multiplication and logical ops.  The reason is that for these operations, 
the low 16 bits of the result do not depend on the high 16 bits of the 
operands.
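
A minimal C sketch of that claim, assuming 32-bit "registers" whose upper
16 bits may hold junk.  The helper names are hypothetical.  Addition and
multiplication give the same low 16 bits regardless of the junk, whereas
a right shift pulls the junk down into the low half.

```c
#include <stdint.h>

/* Each helper computes on full 32-bit "registers" and keeps only the
   low 16 bits of the result, as a lazy scheme would.  */
uint16_t add_low16 (uint32_t a, uint32_t b) { return (uint16_t) (a + b); }
uint16_t mul_low16 (uint32_t a, uint32_t b) { return (uint16_t) (a * b); }
uint16_t shr_low16 (uint32_t a, unsigned n) { return (uint16_t) (a >> n); }
```

For instance, add_low16 (0xDEAD0003, 0xBEEF0004) and add_low16 (3, 4)
agree, but shr_low16 (0x00010004, 1) and shr_low16 (0x0004, 1) do not, so
only the shift needs its operand extended first.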

Although you can construct examples where the eager approach gives better 
code, in general I think the lazy approach results in better code, such as 
in the above example.  Is there a particular reason why GCC uses the eager 
approach?  Maybe it has to do with the form of GCC's intermediate 
representation?  Or are there some subtleties to do with the lazy 
approach that I have overlooked?  Or maybe I've misunderstood GCC's 
approach.

Any comments are appreciated.  Thanks for your help.
Nick


Re: line-map question

2005-04-18 Thread Mike Stump
On Apr 18, 2005, at 9:55 AM, Devang Patel wrote:
From line_map comment at (libcpp/include/line-map.h)
/* Physical source file TO_FILE at line TO_LINE at column 0 is represented
   by the logical START_LOCATION.  TO_LINE+L at column C is represented by
   START_LOCATION+(L*(1<<COLUMN_BITS))+C, as long as C<(1<<COLUMN_BITS),

What happens when column number is >= 128 ? This is PR 20907.
What should happen?  Well, col==0 would be reasonable.


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Steven Bosscher
On Monday 18 April 2005 18:28, Daniel Berlin wrote:
> The correct viewpoint is "we shouldn't remove CSE until every *profitable*
> transformation it makes is subsumed by something else".
>
> Otherwise, you've started with the unproven assumption that every
> transformation CSE makes is profitable.

Well, obviously we are having this discussion because a patch got
blocked on this very assumption.

Gr.
Steven



Re: Unnecessary sign- and zero-extensions in GCC?

2005-04-18 Thread Steven Bosscher
On Monday 18 April 2005 20:53, Nicholas Nethercote wrote:
> Hi,
>
> I've been looking at GCC's use of sign-extensions when dealing with
> integers smaller than a machine word size.  It looks like there is room
> for improvement.

Is your problem the same as the one described on one of the Wiki pages,
"http://gcc.gnu.org/wiki/Exploiting Dual Mode Operation"?

Gr.
Steven



Re: Unnecessary sign- and zero-extensions in GCC?

2005-04-18 Thread Nicholas Nethercote
On Mon, 18 Apr 2005, Steven Bosscher wrote:
I've been looking at GCC's use of sign-extensions when dealing with
integers smaller than a machine word size.  It looks like there is room
for improvement.
Is your problem the same as the one described on one of the Wiki pages,
"http://gcc.gnu.org/wiki/Exploiting Dual Mode Operation"?
I think so, yes.
Nick


Re: line-map question

2005-04-18 Thread Devang Patel
On Apr 18, 2005, at 11:54 AM, Mike Stump wrote:
On Apr 18, 2005, at 9:55 AM, Devang Patel wrote:
From line_map comment at (libcpp/include/line-map.h)
/* Physical source file TO_FILE at line TO_LINE at column 0 is represented
   by the logical START_LOCATION.  TO_LINE+L at column C is represented by
   START_LOCATION+(L*(1<<COLUMN_BITS))+C, as long as C<(1<<COLUMN_BITS),

What happens when column number is >= 128 ? This is PR 20907.
What should happen?  Well, col==0 would be reasonable.
Well, column 0 is represented as START_LOCATION.  Somehow, when the column
number is >= 128, line-map increments the physical line number.  And I am
kind of lost reading code without any kind of comments ;-(  See below from
line-map.c.

-
Devang
  return map;
}

source_location
linemap_line_start (struct line_maps *set, unsigned int to_line,
                    unsigned int max_column_hint)
{
  struct line_map *map = &set->maps[set->used - 1];
  source_location highest = set->highest_location;
  source_location r;
  unsigned int last_line = SOURCE_LINE (map, set->highest_line);
  int line_delta = to_line - last_line;
  bool add_map = false;

  if (line_delta < 0
      || (line_delta > 10 && line_delta * map->column_bits > 1000)
      || (max_column_hint >= (1U << map->column_bits))
      || (max_column_hint <= 80 && map->column_bits >= 10))
    {
      add_map = true;
    }
  else
    max_column_hint = set->max_column_hint;

  if (add_map)
    {
      int column_bits;
      if (max_column_hint > 10 || highest > 0xC000)
        {
          max_column_hint = 0;
          if (highest >0xF000)
            return 0;
          column_bits = 0;
        }
      else
        {
          column_bits = 7;
          while (max_column_hint >= (1U << column_bits))
            column_bits++;
          max_column_hint = 1U << column_bits;
        }
      if (line_delta < 0
          || last_line != map->to_line
          || SOURCE_COLUMN (map, highest) >= (1U << column_bits))
        map = (struct line_map*) linemap_add (set, LC_RENAME, map->sysp,
                                              map->to_file, to_line);
      map->column_bits = column_bits;
      r = map->start_location;
    }
  else
    r = highest - SOURCE_COLUMN (map, highest)
        + (line_delta << map->column_bits);
  set->highest_line = r;
  if (r > set->highest_location)
    set->highest_location = r;
  set->max_column_hint = max_column_hint;
  return r;
}
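
One possible reading of the symptom, sketched under the assumption that a
location is packed as START_LOCATION + (L << COLUMN_BITS) + C, as the
header comment appears to describe: with 7 column bits, a column of 128
no longer fits in the column field and carries into the line field.  The
helpers below are hypothetical stand-ins, not libcpp functions, and
whether this is what actually happens in PR 20907 is not established
here.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the encoding the line_map comment describes;
   these are not libcpp functions.  */
uint32_t
encode_location (uint32_t start, unsigned column_bits,
                 unsigned line_offset, unsigned column)
{
  return start + ((uint32_t) line_offset << column_bits) + column;
}

/* Recover the line offset L from a packed location.  */
unsigned
decoded_line_offset (uint32_t start, unsigned column_bits, uint32_t loc)
{
  return (loc - start) >> column_bits;
}

/* Recover the column C from a packed location.  */
unsigned
decoded_column (uint32_t start, unsigned column_bits, uint32_t loc)
{
  return (loc - start) & ((1u << column_bits) - 1);
}
```

With column_bits equal to 7, line offset 3 and column 127 decode back
unchanged, while line offset 3 and column 128 decode as line offset 4,
column 0.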




MIPS, libsupc++ and -G 0

2005-04-18 Thread Jonathan Larmour
On MIPS, libgcc is built with -G 0, which is used to ensure the contents 
don't assume they will be placed in the small data/bss section. Setting -G 
0 is used to allow for the possibility of large applications, or those 
where even small data may be located more than 64k away from the gp pointer.

However this is not done with libsupc++ or libstdc++. The result is that 
for some of my embedded applications, which require -G 0 themselves, 
"stderr" is far away from the gp pointer. This shouldn't matter except 
that vterminate.cc in libsupc++ was not compiled with -G 0 and thus is 
expecting to be able to use a 16-bit gp relative relocation, thus we get a 
link failure.

Was this a conscious decision or an accident? Is the best route for me to 
just add -G 0 for all mips libstdc++/libsupc++, and submit that as a patch?

Thanks in advance,
Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine


Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Dan Nicolaescu
[EMAIL PROTECTED] (Richard Kenner) writes:

  > The correct viewpoint is "we shouldn't remove CSE until every
  > *profitable* transformation it makes is subsumed by something else".
  >
  > And, as I understand it, the claim is that this is not yet true for the
  > following of jumps and my question is why.

One reason could be that there are some aspects of alias analysis that
are implemented at RTL level, but are not implemented at tree level. 
Examples: 
 - accesses to different fields of the same struct
 - accesses to different elements of the same array
 - restricted pointers 

(Dan Berlin is working on the first one (first two?))

An example: 
#include <stdlib.h>

struct s { int a; int b; };

void foo (struct s *ps, int *p, int *__restrict__ rp, int *__restrict__ rq)
{
  ps->a = 0;
  ps->b = 1;
  if (ps->a != 0) abort ();
  p[0] = 0;
  p[1] = 1;
  if (p[0] != 0) abort ();
  rp[0] = 0;
  rq[0] = 1;
  if (rp[0] != 0) abort ();
}

The tree optimizers don't do anything interesting with this function,
cse eliminates all the ifs. 

   


Re: GCC 4.0 RC1 Available

2005-04-18 Thread Laurent GUERBY
The minor "problem" is still there in RC2,  I opened PR21094 about it:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21094

Laurent

> A minor thing:
> 
> I configured with c,ada only (no C++) on x86 and x86_64-linux and got
> http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg00791.html
> http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg00790.html
> [...]
>   === libmudflap tests ===
> 
> 
> Running target unix
> FAIL: libmudflap.c++/fail24-frag.cxx (test for excess errors)
> WARNING: libmudflap.c++/fail24-frag.cxx compilation failed to produce 
> executable
> FAIL: libmudflap.c++/pass27-frag.cxx (test for excess errors)
> WARNING: libmudflap.c++/pass27-frag.cxx compilation failed to produce 
> executable
> [...]
> 
> On surface, it looks like libmudflap is running C++ tests even
> when C++ isn't there. Should I open a PR?
> 
> Laurent



Re: GCC 4.0 RC2 Available

2005-04-18 Thread Joe Buck
On Mon, Apr 18, 2005 at 07:44:03AM -0700, Mark Mitchell wrote:
> 
> RC2 is available here:
> 
>   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
> 
> As before, I'd very much appreciate it if people would test these bits
> on primary and secondary platforms, post test results with the
> contrib/test_summary script, and send me a message saying whether or
> not there are any regressions, together with a pointer to the results.

Test results for i686-pc-linux-gnu on an RHEL 3.0 system are at

http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01307.html

For sparc-sun-solaris2.8, I get a failure when building the Java compiler,
but I may be doing something wrong, as I usually avoid the Java build
on Solaris (since it takes most of a day to build and test).  The message
is

stage1/xgcc -Bstage1/ -B/u/jbuck/cvs.sol2/4.0.0-pre/sparc-sun-solaris2.8/bin/
-g -O2 -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes 
-Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros 
-Wold-style-definition -DHAVE_CONFIG_H  -o jc1 \
java/parse.o java/class.o java/decl.o java/expr.o java/constants.o 
java/lang.o java/typeck.o java/except.o java/verify.o java/verify-glue.o 
java/verify-impl.o java/zextract.o java/jcf-io.o java/win32-host.o 
java/jcf-parse.o java/mangle.o java/mangle_name.o java/builtins.o 
java/resource.o java/jcf-write.o java/buffer.o java/check-init.o 
java/jcf-depend.o java/jcf-path.o java/xref.o java/boehm.o java/java-gimplify.o 
main.o  libbackend.a ../libcpp/libcpp.a -L../zlib -lz
 ../libcpp/libcpp.a ./../intl/libintl.a  ../libiberty/libiberty.a
java/parse.o(.text+0x16cc): In function `java_new_lexer':
/remote/dtg103/jbuck/gnu/src/gcc-4.0.0-20050417/gcc/java/lex.c:187: undefined 
reference to `libiconv_open'
java/parse.o(.text+0x1750):/remote/dtg103/jbuck/gnu/src/gcc-4.0.0-20050417/gcc/java/lex.c:207:
 undefined reference to `libiconv_open'
java/parse.o(.text+0x17a8):/remote/dtg103/jbuck/gnu/src/gcc-4.0.0-20050417/gcc/java/lex.c:225:
 undefined reference to `libiconv'
java/parse.o(.text+0x17b4):/remote/dtg103/jbuck/gnu/src/gcc-4.0.0-20050417/gcc/java/lex.c:227:
 undefined reference to `libiconv_close'
java/parse.o(.text+0x1898): In function `java_destroy_lexer':
/remote/dtg103/jbuck/gnu/src/gcc-4.0.0-20050417/gcc/java/lex.c:270: undefined 
reference to `libiconv_close'
java/parse.o(.text+0x358c): In function `java_read_char':
/remote/dtg103/jbuck/gnu/src/gcc-4.0.0-20050417/gcc/java/lex.c:327: undefined 
reference to `libiconv'
collect2: ld returned 1 exit status
make[2]: *** [jc1] Error 1

I do have a build report that was generated over the weekend for
sparc-sun-solaris2.8 that does not contain Java, it is at

http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01245.html

with tests for both 32 and 64 bits.  It shows additional failures in
64-bit mode that do not appear in 32-bit mode.

I have builds running for x86_64-x-linux-gnu and ia64-x-linux-gnu,
as well as hppa-hpux; I'll let you know when I have something.


Re: GCC 4.0 RC2 Available

2005-04-18 Thread Laurent GUERBY
c,ada are clean on x86 and x86_64 linux.

http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01311.html
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01313.html

Laurent




Novell thinks you are spam

2005-04-18 Thread Steven Bosscher
was: Re: *SPAM* Re:  My opinions on tree-level and RTL-level 
optimization

On Monday 18 April 2005 19:43, Richard Kenner wrote:

an email.

Which the Novell spam filter thinks is spam.

Sorry if I miss an email from you, the reason is obvious: I throw
all messages marked "SPAM" straight to the trash can.  Your mail is
obviously not spam, but somehow you've ended up on the blacklist
here :-/

Dunno what to do about that...

Gr.
Steven



Re: My opinions on tree-level and RTL-level optimization

2005-04-18 Thread Daniel Berlin
On Mon, 2005-04-18 at 13:34 -0700, Dan Nicolaescu wrote:
> [EMAIL PROTECTED] (Richard Kenner) writes:
> 
>   > The correct viewpoint is "we shouldn't remove CSE until every
>   > *profitable* transformation it makes is subsumed by something else".
>   >
>   > And, as I understand it, the claim is that this is not yet true for the
>   > following of jumps and my question is why.
> 
> One reason could be that there are some aspects of alias analysis that
> are implemented at RTL level, but are not implemented at tree level. 
> Examples: 
>  - accesses to different fields of the same struct
>  - accesses to different elements of the same array
>  - restricted pointers 
> 
> (Dan Berlin is working on the first one (first two?))

First two, but only as the second relates to data dependence.

deref (pointer base) + offset (IE ps->a vs ps->b) is very annoying to
represent at the tree level, because you have to encode the info into a
new name, so that name can be used for SSA purposes.

If we didn't rely on virtual SSA for aliasing, and did what everyone
else does (have an aliasing oracle that tells you whether two things
alias), it would make optimizers x% harder to write (I haven't really
thought deeply about how much harder it really is), but make generating
good alias info for them *much* easier.


--Dan



sync operations: where's the barrier?

2005-04-18 Thread Geoffrey Keating
Hi Richard,
The documentation for the atomic operation patterns says things like:
This pattern must issue any memory barrier instructions such that the
pattern as a whole acts as a full barrier.
Should the barrier happen before the operation, after the operation, 
are there two barriers, or is it undefined?




Re: Novell thinks you are spam

2005-04-18 Thread Andreas Schwab
Steven Bosscher <[EMAIL PROTECTED]> writes:

> was: Re: *SPAM* Re:  My opinions on tree-level and RTL-level 
> optimization
>
> On Monday 18 April 2005 19:43, Richard Kenner wrote:
>
> an email.
>
> Which the Novell spam filter thinks is spam.

This is because he is using an obsolete mailer that cannot even use
4-digit years in the date.  Some people like to live in the last
millennium.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: Heads-up: volatile and C++

2005-04-18 Thread Ken Raeburn
On Apr 16, 2005, at 15:45, Nathan Sidwell wrote:
It's not clear to me which is the best approach.  (b) allows threads to
be supported via copious uses of volatile (but probably introduces
pessimizations), whereas (a) forces the thread interactions to be
compiler visible (but shows more promise for optimizations).
Is there anything in the language specifications (mainly C++ in this 
context, but is this an area where C and C++ are going to diverge, or 
is C likely to follow suit?) that prohibits spurious writes to a 
location?  E.g., translating:

  extern int x;
  x = 3;
  foo(); // may call pthread_*
  y = 4;
  bar(); // likewise
into:
  x <- 3
  call foo
  r1 <- x
  y <- 4
  x <- r1
  call bar
...  And does this change if x and y are members of the same struct?  
Certainly you can talk about quality of implementation issues, but 
would it be non-compliant?  It certainly would be unfriendly to 
multithreaded applications, if foo() released a lock and allowed it to 
be acquired by another thread, possibly running on another processor.

To make a more concrete example, consider two one-byte lvalues, either 
distinct variables or parts of a struct, and the early Alpha processors 
with no byte operations, where byte changes are done by loading, 
modifying, and storing word values.

My suspicion is that if the compiler doesn't need to know about threads 
per se, it at least needs to know about certain kinds of restrictions 
on behavior that would cause problems with threads.

Ken
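
As a rough illustration of the Alpha case described above, the following C sketch simulates a byte store carried out as a load, modify, and store of the containing word. Everything in it is an assumption made for illustration: the names `pair` and `lost_update_demo` are hypothetical, the "word" is modelled as a 4-byte array rather than real hardware, and the second thread is simulated by a plain assignment placed between the load and the store.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two one-byte objects sharing one "word" (modelled here as 4 bytes). */
struct pair {
    unsigned char a;
    unsigned char b;
    unsigned char pad[2];
};

/* Simulates a store to p.a done as load-word / patch-byte / store-word,
   with a (simulated) second thread storing to p.b in between.
   Returns the final value of p.b. */
unsigned char lost_update_demo(void)
{
    struct pair p = { 0, 0, { 0, 0 } };
    unsigned char word[sizeof(struct pair)];

    assert(offsetof(struct pair, a) == 0);

    memcpy(word, &p, sizeof p);   /* thread 1: load the containing word   */
    p.b = 7;                      /* thread 2: store to the neighbouring b */
    word[0] = 3;                  /* thread 1: patch only the byte for a   */
    memcpy(&p, word, sizeof p);   /* thread 1: store the whole word back   */

    return p.b;                   /* thread 2's store has been overwritten */
}
```

Calling `lost_update_demo()` returns the stale value of `b`, not the 7 that the simulated second thread stored, which is the kind of spurious write the question is about.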


Re: Heads-up: volatile and C++

2005-04-18 Thread Robert Dewar
Ken Raeburn wrote:
> On Apr 16, 2005, at 15:45, Nathan Sidwell wrote:
>> It's not clear to me which is the best approach.  (b) allows threads to
>> be supported via copious uses of volatile (but probably introduces
>> pessimizations), whereas (a) forces the thread interactions to be
>> compiler visible (but shows more promise for optimizations).
>
> Is there anything in the language specifications (mainly C++ in this
> context, but is this an area where C and C++ are going to diverge, or is
> C likely to follow suit?) that prohibits spurious writes to a location?

Surely the deal is that spurious writes are allowed unless the
location is volatile. What other interpretation is possible?



Re: GCC 4.0 RC2 Available

2005-04-18 Thread Julian Brown
On 2005-04-18, Mark Mitchell <[EMAIL PROTECTED]> wrote:
>
> RC2 is available here:
>
>   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
>
> As before, I'd very much appreciate it if people would test these bits
> on primary and secondary platforms, post test results with the
> contrib/test_summary script, and send me a message saying whether or
> not there are any regressions, together with a pointer to the results.

Results for arm-none-elf, cross-compiled from i686-pc-linux-gnu (Debian)
for C and C++ are here:

http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01301.html

Relative to RC1, there are several new tests which pass, and:

g++.dg/warn/Wdtor1.C (test for excess errors)

works whereas it didn't before.

Julian



Re: internal compiler error at dwarf2out.c:8362

2005-04-18 Thread James E Wilson
Björn Haase wrote:
> If one should not use machine-specific attributes, *is* there a
> standard way in GCC to implement different address spaces?

Use section attributes to force functions/variables into different 
sections, and then use linker scripts to place different sections into 
different address spaces.  You can define machine dependent attributes 
as short-hand for a section attribute, and presumably the eeprom 
attribute is an example of that.

The only thing wrong with the eeprom attribute is that it is trying to 
create its own types.  It is not necessary to create new types in order 
to get variables placed into special sections.  There is nothing wrong 
with the concept of having an eeprom attribute.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
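
A minimal sketch of the section-attribute approach described above, assuming GCC on an ELF target. The macro name `EEPROM`, the section name `.eeprom`, and the variable and function names are hypothetical; on a real device a linker script would still have to map that section to the EEPROM address range. The point is that the variable keeps its ordinary `int` type and only its placement changes.

```c
#include <assert.h>

/* Hypothetical shorthand; ".eeprom" is an assumed section name that a
   linker script would map to the desired address space. */
#define EEPROM __attribute__((section(".eeprom")))

/* Ordinary int type: no new type is created, only the section differs. */
EEPROM int calibration = 42;

int read_calibration(void)
{
    return calibration;
}

void store_calibration(int value)
{
    assert(value >= 0);   /* sketch-only assumption: values are non-negative */
    calibration = value;
}
```

On a hosted ELF system the default linker script simply places the extra section alongside other data, so the code behaves like any other global; the address-space effect only appears once a linker script assigns the section elsewhere.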


Re: Heads-up: volatile and C++

2005-04-18 Thread Mike Stump
On Apr 18, 2005, at 3:08 PM, Ken Raeburn wrote:
> Is there anything in the language specifications (mainly C++ in
> this context, but is this an area where C and C++ are going to
> diverge, or is C likely to follow suit?) that prohibits spurious
> writes to a location?

No, in both languages.  The reason is that there isn't a way to observe
it; if there were, the answer would be yes.

Volatile for example provides a way to observe it, and in that case,  
the spurious write is prohibited.



Re: Can I comment out a GTY variable?

2005-04-18 Thread Geoffrey Keating
"H. J. Lu" <[EMAIL PROTECTED]> writes:

> I am trying to comment out
> 
> static GTY (()) int foo = 0;
> 
> with
> 
> #if 0
> static GTY (()) int foo = 0;
> #endif
> 
> But I got an error saying something like
> 
> ./gth:44: error: foo undeclared here (not in a function)
> 
> Is that expected? How can I comment it out?

Yes.  You can't.  Why do you want to?


Re: GCC 4.0 RC2 Available

2005-04-18 Thread Joe Buck

> Joe> For sparc-sun-solaris2.8, I get a failure when building the Java 
> compiler,
> Joe> but I may be doing something wrong, as I usually avoid the Java build
> Joe> on Solaris (since it takes most of a day to build and test).  The message
> Joe> is
> 
> Joe> java/parse.o(.text+0x16cc): In function `java_new_lexer':
> Joe> /remote/dtg103/jbuck/gnu/src/gcc-4.0.0-20050417/gcc/java/lex.c:187: 
> undefined reference to `libiconv_open'

On Mon, Apr 18, 2005 at 06:00:01PM -0600, Tom Tromey wrote:
> Maybe you need -liconv, though I forget, as I haven't done a Solaris
> build in a long time.  I thought the iconv-related autoconf macros
> handled this, but perhaps they are failing somehow.  AFAIK this used
> to work.

It appears the bug is because there's a libiconv.so in /usr/local/lib on
that machine, with headers in /usr/local/include, but /usr/local/lib isn't
in my LD_LIBRARY_PATH.  configure finds the declaration and assumes it
can call the function.  Sorry, I do most of my work in GNU/Linux these
days so my Solaris setup has rotted. I'll try that one again with a
proper LD_LIBRARY_PATH.

Can someone remind me what the "make" target is to resume an interrupted
build that's in stage 2?

This could be considered a configuration bug, because while configure
finds a declaration for iconv, it does not verify (by a link test) that it
can actually be called, and the failure doesn't show up for a couple of
hours.  It seems we're inconsistent about such things; sometimes we try to
find the symbol in a library, and sometimes we just check for a
declaration.  There was a bug that was closed as INVALID for this reason,
and several duplicates, so maybe we should try to help people who run into
this difficulty as I'm at least the fourth person to encounter and report
it.







Re: Cross Compile PowerPC for ReactOS

2005-04-18 Thread James E Wilson
James Tabor wrote:
> fp-bit.c:744: error: unrecognizable insn:
> (call_insn:HI 53 49 59 0 fp-bit.c:743 (parallel [
> (set (reg:SF 33 1)
> (call (mem:SI (symbol_ref:SI ("__pack_f") [flags 0x41]
> ) [0 S4 A32])
> (const_int 0 [0x0])))
> (use (const_int 0 [0x0]))
> (clobber (scratch:SI))

This needs to match one of the call_value patterns in the rs6000.md 
file.  Presumably the call_value_local32 pattern.  The matching happens 
in recog.  You could put a breakpoint there, and try stepping through 
the code to see what is wrong.  You can print rtl by using the pr gdb 
macro.  E.g.
  print insn
  pr

call_value_local32 doesn't do much, so I'd guess it is the 
current_file_function_operand predicate.  You could try putting a 
breakpoint there also.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: Unnecessary sign- and zero-extensions in GCC?

2005-04-18 Thread Steven Bosscher
On Monday 18 April 2005 20:53, Nicholas Nethercote wrote:
> Hi,
>
> I've been looking at GCC's use of sign-extensions when dealing with
> integers smaller than a machine word size.  It looks like there is room
> for improvement.
>
> Consider this C function:
>
>  short g(short x)
>  {
>     short i;
>     for (i = 0; i < 10; i++) {
>        x += i;
>     }
>     return x;
>  }
>
> On x86, using a GCC 4.0.0 20050130, with -O2 I get this code:
>
> g:
>  pushl   %ebp
>  xorl    %edx, %edx
>  movl    %esp, %ebp
>  movswl  8(%ebp),%ecx
>  .p2align 4,,15
> .L2:
>  leal    (%ecx,%edx), %eax
>  movswl  %ax,%ecx        # 1
>  leal    1(%edx), %eax
>  movzwl  %ax, %eax       # 2
>  cmpw    $10, %ax
>  movswl  %ax,%edx        # 3
>  jne     .L2
>
>  popl    %ebp
>  movl    %ecx, %eax
>  ret
>  .size   g, .-g
>  .p2align 4,,15
>
> The three extensions (#1, #2, #3) here are unnecessarily conservative.
> This would be better:
>
> g:
>  pushl   %ebp
>  xorl    %edx, %edx
>  movl    %esp, %ebp
>  movswl  8(%ebp),%ecx
>  .p2align 4,,15
> .L2:
>  leal    (%ecx,%edx), %ecx   # x += i
>  leal    1(%edx), %edx       # i++
>  cmpw    $10, %dx            # i < 10 ?
>  jne     .L2
>
>  popl    %ebp
>  movswl  %cx, %eax
>  ret

Here is what I get on amd64 with -m32:

        .file   "t.c"
        .text
        .p2align 4,,15
.globl g
        .type   g, @function
g:
        pushl   %ebp
        xorl    %edx, %edx
        movl    %esp, %ebp
        movswl  8(%ebp),%eax
        .p2align 4,,15
.L2:
        addl    %edx, %eax
        incl    %edx
        cmpl    $10, %edx
        cwtl
        jne     .L2
        popl    %ebp
        ret
        .size   g, .-g
        .ident  "GCC: (GNU) 4.1.0 20050412 (experimental)"
        .section        .note.GNU-stack,"",@progbits

Looks a bit more like your optimal code ;-)

Gr.
Steven
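
To make the point of the thread concrete, here is a small C sketch comparing the original function with a variant that accumulates in a full `int` and narrows only once on return, which is roughly the shape of the proposed assembly. The name `g_narrow_once` is hypothetical. The sketch assumes GCC's documented behaviour for out-of-range int-to-short conversion (reduction modulo 2^16); the C standard leaves that conversion implementation-defined.

```c
#include <assert.h>
#include <limits.h>

/* The function from the message: x is narrowed back to short on every
   iteration, which is where the repeated extensions come from. */
short g(short x)
{
    short i;
    for (i = 0; i < 10; i++) {
        x += i;
    }
    return x;
}

/* Sketch of the cheaper shape: accumulate in a full register and narrow
   once at the end. The final conversion is implementation-defined when
   the value does not fit; GCC wraps modulo 2^16. */
short g_narrow_once(short x)
{
    int acc = x;
    int i;

    assert(INT_MAX - 45 >= SHRT_MAX);   /* the int accumulator cannot overflow */
    for (i = 0; i < 10; i++) {
        acc += i;
    }
    return (short)acc;
}
```

Because truncation to 16 bits commutes with addition, the two functions agree for every input under that assumption, so the per-iteration extensions are not needed for correctness.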


Re: ppc32/e500/no float - undefined references in libstdc++ _Unwind_*

2005-04-18 Thread James E Wilson
Clemens Koller wrote:
> /usr/local/lib/nof/libstdc++.so.6: undefined reference to
> [EMAIL PROTECTED]'
> /usr/local/lib/nof/libstdc++.so.6: undefined reference to
> [EMAIL PROTECTED]'

These functions should come from libgcc_s.so or libgcc_eh.a, depending 
on whether this is a shared or static link.

Try checking to see which libgcc libraries are being linked in.  Try 
adding -v to the compiler line, or even -Wl,--verbose to see what the 
linker is doing.

Try running nm on the libgcc libraries to make sure they are OK.
If you have multiple gcc versions installed, you might be accidentally 
linking in the wrong libgcc libraries.  If you have libraries compiled 
with other gcc versions, it is possible that they are bringing in the 
wrong libgcc, from an older gcc release.

Try using g++ to link instead of gcc.  Somehow you are getting C++ 
libraries even though you seem to be linking C code, and using g++ for 
the link will make sure that the proper C++ libraries are linked in the 
proper order.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: GCC 4.0 RC2 Available

2005-04-18 Thread Joe Buck
On Mon, Apr 18, 2005 at 05:13:33PM -0700, Joe Buck wrote:
> [ solaris failure building Java compiler ]
> It appears the bug is because there's a libiconv.so in /usr/local/lib on
> that machine, with headers in /usr/local/include, but /usr/local/lib isn't
> in my LD_LIBRARY_PATH.  configure finds the declaration and assumes it
> can call the function.  Sorry, I do most of my work in GNU/Linux these
> days so my Solaris setup has rotted. I'll try that one again with a
> proper LD_LIBRARY_PATH.
> 
> Can someone remind me what the "make" target is to resume an interrupted
> build that's in stage 2?

"make bootstrap2-lean" was what I was looking for.  However, it doesn't
work; there is no -liconv on the command line for building jc1.

I manually edited the generated command to add -liconv, with an appropriate
LD_LIBRARY_PATH, and got a successful link of jc1.

Are others successfully building on sparc-sun-solaris2.8?  Maybe it works
for others because they have no iconv header, so configure assumes there
is no iconv?


Re: Stack and Function parameters alignment

2005-04-18 Thread James E Wilson
Petar Penchev wrote:
> I tried to use force_reg or PUT_MODE
> but it does nothing and PUSH AL, inc S remain.

If nothing is happening, then that means the peephole isn't matching. 
The matching happens in peephole2_insns.  You could try putting a 
breakpoint there and stepping through the code to see what happens.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: compile error for gcc-4.0.0-20050410

2005-04-18 Thread James E Wilson
Guochun Shi wrote:
> make[1]: Entering directory
> `/home/gshi/gcc/gcc-4.0.0-20050410/build-i686-pc-linux-gnu/libiberty'
> make[1]: *** No rule to make target `../include/ansidecl.h', needed by
> `regex.o'.  Stop.
> make[1]: Leaving directory
> `/home/gshi/gcc/gcc-4.0.0-20050410/build-i686-pc-linux-gnu/libiberty'

It looks like you tried to build in the source directory.  That is 
supposed to work, but we never test it, and it is known to have frequent 
problems.  Try following the directions that say to configure in a 
separate directory.  E.g.
  mkdir objdir
  cd objdir
  ../gcc-4.0.0-20050410/configure ...
This is how we build it, and this is known to work.  Since you already 
have one broken configure/build in the source directory, you probably 
need to rm -rf your current source/build tree, and extract a new source 
tree from the release tar ball.  Building in a separate directory has 
the advantage that when something goes wrong, your source tree doesn't 
get messed up.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: GCC 4.0 RC2 Available

2005-04-18 Thread Geoffrey Keating
Mark Mitchell <[EMAIL PROTECTED]> writes:

> RC2 is available here:
> 
>   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
> 
> As before, I'd very much appreciate it if people would test these bits
> on primary and secondary platforms, post test results with the
> contrib/test_summary script, and send me a message saying whether or
> not there are any regressions, together with a pointer to the results.

Bad news, I'm afraid.

On powerpc-darwin8, this fails to bootstrap, with an ICE in libjava (when
trying to build gnu-xml.o).

The backtrace is:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x0004
fold_convert (type=0x4160c380, arg=0x0) at 
/Network/Servers/hills/Volumes/capanna/gkeating/co/gcc-4.0.0-20050417/gcc/fold-const.c:1886
1886  tree orig = TREE_TYPE (arg);
(gdb) bt
#0  fold_convert (type=0x4160c380, arg=0x0) at 
/Network/Servers/hills/Volumes/capanna/gkeating/co/gcc-4.0.0-20050417/gcc/fold-const.c:1886
#1  0x0028c5b4 in bit_from_pos (offset=0x28c5b4, bitpos=0x0) at 
/Network/Servers/hills/Volumes/capanna/gkeating/co/gcc-4.0.0-20050417/gcc/stor-layout.c:542
#2  0x00118438 in dbxout_type (type=0x42434680, full=0) at 
/Network/Servers/hills/Volumes/capanna/gkeating/co/gcc-4.0.0-20050417/gcc/dbxout.c:1392

The type involved is:

 
unit size 
align 16 symtab 52 alias set 15 precision 16 min  max 
pointer_to_this  chain >
ignored decl_1 VOID file gnu/xml/dom/DomElement.java line 324
align 1 offset_align 1 context  chain >

I'll run another build with a patch applied to disable libgcj on
ppc-darwin, and see how that goes.  I'll also try to work out which
patch broke it.


Re: The subreg question

2005-04-18 Thread James E Wilson
Ling-hua Tseng wrote:
> Obviously, `movil' and `movim' each access only a 16-bit part of the
> 32-bit register. How can I use RTL expressions to represent these
> operations?

As you noticed, within a register, subreg can only be used for low
parts.  You can't ask for the high part of a single register.  If you
have an item that spans multiple registers, e.g. a 64-bit value that is
contained in a register pair, then you can ask for the SImode highpart
of a DImode reg and get valid RTL.  This works because the high part is
an entire register.  This isn't useful to you.

Otherwise, you can access subparts via bitfield insert/extract
operations, or logical operations (ior/and), though this is likely to
be tedious, and may confuse optimizers.

There are high/lo_sum RTL operators that may be useful to you.  You can use
  (set (reg:SI) (high: ...))
  (set (reg:SI) (lo_sum (reg:SI) (...)))
where the first pattern corresponds to movim, and the second one to
movil.  You could just as well use ior instead of lo_sum for the second
pattern; this is probably better, as movil does not do an add.

You may want to emit normal rtl for an SImode move, and then split it
into its two 16-bit parts after reload.  This will avoid confusing RTL
optimizers before reload.

We have vector modes which might be useful to you.  If you say a
register is holding a V4QI mode value, then there are natural ways to
get at the individual elements of the vector via vector operations.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
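
The difference between an ior-style and an add-style low-part insert can be sketched in plain C. The semantics given to `movim` and `movil` below are assumptions for illustration only (load the upper 16 bits and clear the rest; replace the lower 16 bits), since the thread does not spell them out, and all function names are hypothetical. The second half shows, for contrast, the convention used when the low-part instruction adds a sign-extended immediate, as with MIPS lui/addiu, where the high part has to be pre-adjusted.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics: load the upper 16 bits, clear the lower 16. */
uint32_t movim(uint16_t hi)
{
    return (uint32_t)hi << 16;
}

/* Assumed semantics: replace the lower 16 bits (an insert, not an add). */
uint32_t movil(uint32_t reg, uint16_t lo)
{
    return (reg & 0xffff0000u) | lo;
}

/* Build a 32-bit constant from two 16-bit halves, as a split after
   reload might do. */
uint32_t load_const(uint32_t value)
{
    uint32_t r = movim((uint16_t)(value >> 16));
    r = movil(r, (uint16_t)(value & 0xffffu));
    assert(r == value);
    return r;
}

/* For contrast: if the low half is ADDED after sign extension, the high
   half must be adjusted by 0x8000 so the sum still comes out right. */
uint32_t load_const_add(uint32_t value)
{
    uint32_t lo = value & 0xffffu;
    uint32_t lo_sext = (lo ^ 0x8000u) - 0x8000u;     /* sign-extend 16 -> 32 */
    uint32_t hi = (value + 0x8000u) & 0xffff0000u;  /* adjusted high part   */
    return hi + lo_sext;
}
```

With an insert-style low-part instruction no such adjustment is needed, which is one way to read the suggestion that ior describes movil better than lo_sum does.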


Re: GCC 4.0 RC2 Available

2005-04-18 Thread Andrew Pinski
On Apr 18, 2005, at 9:07 PM, Geoffrey Keating wrote:
> Mark Mitchell <[EMAIL PROTECTED]> writes:
>> RC2 is available here:
>>   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
>> As before, I'd very much appreciate it if people would test these bits
>> on primary and secondary platforms, post test results with the
>> contrib/test_summary script, and send me a message saying whether or
>> not there are any regressions, together with a pointer to the results.
>
> Bad news, I'm afraid.
> On powerpc-darwin8, this fails to bootstrap, with an ICE in libjava
> (when trying to build gnu-xml.o).

This ICE looks the same as PR 21022.  I wonder why it does not fail on
the mainline.
The only patch that could plausibly have caused it is:
2005-04-16  Anthony Green  <[EMAIL PROTECTED]>

* Makefile.am (gnu-xml.lo, javax-imageio.lo, javax-xml.lo,
gnu-java-beans.lo, gtk-awt-peer.lo) : Sort the output of all
"find" output in order to work around the libtool bug described in
PR libgcj/20693.
* Makefile.in: Rebuilt.

which means it is a latent bug.
-- Pinski


Re: sync operations: where's the barrier?

2005-04-18 Thread David Edelsohn
> Geoffrey Keating writes:

Geoff> The documentation for the atomic operation patterns says things like:

>> This pattern must issue any memory barrier instructions such that the
>> pattern as a whole acts as a full barrier.

Geoff> Should the barrier happen before the operation, after the operation, 
Geoff> are there two barriers, or is it undefined?

On PowerPC, this has a lot to do with the cooperation of the
various functions referencing the memory atomically.  I am most familiar
with emitting sync (or lwsync) before the atomic operation.

David


Re: [RFC] warning: initialization discards qualifiers from pointer target type

2005-04-18 Thread James E Wilson
Devang Patel wrote:
> warning: initialization discards qualifiers from pointer target type
> This warning can not be disabled using -Wno-cast-qual
> (or any other warning flags). Is it intentional ?

It looks like we have been doing it this way since at least gcc-1.42. 
The same code is there, with no way to disable it, though the wording of 
the message is a little different.  I didn't try looking back any 
farther than that.  This seems rather unlikely to be an accident.

Though of course, this doesn't mean that we can't have an option to 
control it.  -Wno-cast-qual doesn't seem like the right choice, as there 
is no user cast here.  Maybe something like -Wno-discard-qual, where 
-Wdiscard-qual is the default.

I notice that these are pedwarns, not a warning as in the -Wcast-qual 
case, which means that the ISO C standard requires a diagnostic here. 
For this reason, it may not be wise to add an option to disable the 
warnings.  You may lose portability to other compilers.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
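
For readers who have not met this diagnostic, here is a small C sketch of the situation under discussion. The identifiers are hypothetical. The offending initialization is left in a comment so that the example compiles cleanly; the other two functions show the usual fix (keep the qualifier) and the explicit cast that -Wcast-qual is concerned with.

```c
#include <assert.h>
#include <string.h>

static const char greeting[] = "hello";

/* Keeping the qualifier on the pointer target avoids the diagnostic. */
size_t greeting_length(void)
{
    const char *p = greeting;      /* OK: qualifiers preserved */
    /* char *q = greeting; */      /* would draw: initialization discards
                                      qualifiers from pointer target type */
    return strlen(p);
}

/* An explicit cast avoids that diagnostic and is the case -Wcast-qual is
   about; writing through the result would be undefined behaviour. */
char first_char_via_cast(void)
{
    char *q = (char *)greeting;    /* q is only read from here */
    assert(q != NULL);
    return q[0];
}
```

This lines up with the observation above that there is no user cast in the warned-about case, so a cast-related option is not an obvious home for it.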


Re: GCC 4.0 RC1 Available

2005-04-18 Thread Kaveh R. Ghazi
 >  > 2005-04-12 Paolo Bonzini <[EMAIL PROTECTED]>
 >  > 
 >  > * acx.m4 (ACX_PROG_GNAT): Remove stray break. 
 > 
 > OK for 4.0.0.

Mark,

When this patch went into 4.0, Paolo didn't regenerate the top level
configure, although the ChangeLog claims he did:
http://gcc.gnu.org/ml/gcc-cvs/2005-04/msg00842.html


The patch should also be applied to mainline, since the "break"
problem exists there too.  I'm not sure why it wasn't, but perhaps
your "OK for 4.0.0" didn't specify mainline and Paolo was being
conservative.  I think we should fix it there also.

Thanks,
--Kaveh
--
Kaveh R. Ghazi  [EMAIL PROTECTED]


Re: [RFC] warning: initialization discards qualifiers from pointer target type

2005-04-18 Thread Mike Stump
On Apr 18, 2005, at 6:29 PM, James E Wilson wrote:
> This seems rather unlikely to be an accident.

I agree, I'm sure it was due to bad system header files, only some of  
which had const and others didn't.  By ignoring the issue in the  
compiler, the compiler works on such (broken) systems.  The usual  
case I think was a system that had a compiler that did const, but the  
OS hadn't added it yet, but people built software that made use of  
const, and presto, it wouldn't compile.  In the C++ world, we did the  
same thing (we used warning as I recall) for a long time, but that  
was due to bugs in the compiler, where const wasn't propagated around  
correctly internally, so, instead of pestering the user with error  
messages they could not fix, we just made it a warning by default.   
The idea was eventually we'd weed them all out and make it an error;  
today, it is an error.

> I notice that these are pedwarns, not a warning as in the -Wcast-qual
> case, which means that the ISO C standard requires a diagnostic here.
> For this reason, it may not be wise to add an option to disable the
> warnings.  You may lose portability to other compilers.

If we had a framework for disabling warnings  it probably would  
just fit into it.  :-)



Re: sync operations: where's the barrier?

2005-04-18 Thread Geoffrey Keating
On 18/04/2005, at 6:13 PM, David Edelsohn wrote:
> Geoffrey Keating writes:
> Geoff> The documentation for the atomic operation patterns says things like:
>
> >> This pattern must issue any memory barrier instructions such that the
> >> pattern as a whole acts as a full barrier.
>
> Geoff> Should the barrier happen before the operation, after the operation,
> Geoff> are there two barriers, or is it undefined?
>
> On PowerPC, this has a lot to do with the cooperation of the
> various functions referencing the memory atomically.  I am most familiar
> with emitting sync (or lwsync) before the atomic operation.

That's what I'd planned to do, emit sync before the operation; but 
obviously it does matter, one way lets you use the intrinsic to acquire 
an object and the other way lets you use the intrinsic to release an 
object.

(Ideally it would be lwsync; actually, ideally there would be no sync 
at all, and users would be required to output one if they wanted one; 
but at present the routines are documented to have a 'sync'.  The 
question is where...)
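
The acquire/release distinction mentioned above can be sketched with GCC's `__sync_*` builtins, assuming those are the user-visible intrinsics behind the patterns being discussed. The GCC manual documents `__sync_lock_test_and_set` as an acquire barrier, `__sync_lock_release` as a release barrier, and `__sync_fetch_and_add` as a full barrier. The variable and function names are hypothetical, and the sketch is single-threaded, so it shows only the call shape, not real contention.

```c
#include <assert.h>

static volatile int lock_word;   /* 0 = free, 1 = held */
static int shared_counter;

/* Acquire side: later memory accesses must not move above this.
   Returns 1 if the lock was obtained. */
int try_acquire(void)
{
    return __sync_lock_test_and_set(&lock_word, 1) == 0;
}

/* Release side: earlier memory accesses must not move below this. */
void release(void)
{
    __sync_lock_release(&lock_word);
}

/* Full-barrier read-modify-write; returns the value held before the add. */
int bump_counter(void)
{
    int got = try_acquire();
    int old;

    assert(got);   /* single-threaded sketch: the lock is always free here */
    old = __sync_fetch_and_add(&shared_counter, 1);
    release();
    return old;
}
```

An operation with a barrier only before it suits the release role, and one with a barrier only after it suits the acquire role, which is why the placement matters for a pattern documented as a full barrier.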




Re: [RFC] warning: initialization discards qualifiers from pointer target type

2005-04-18 Thread Devang Patel
On Apr 18, 2005, at 6:29 PM, James E Wilson wrote:
> Devang Patel wrote:
>> warning: initialization discards qualifiers from pointer target type
>> This warning can not be disabled using -Wno-cast-qual
>> (or any other warning flags). Is it intentional ?
>
> It looks like we have been doing it this way since at least
> gcc-1.42. The same code is there, with no way to disable it, though
> the wording of the message is a little different.  I didn't try
> looking back any farther than that.  This seems rather unlikely to
> be an accident.
>
> Though of course, this doesn't mean that we can't have an option to
> control it.  -Wno-cast-qual doesn't seem like the right choice, as
> there is no user cast here.  Maybe something like -Wno-discard-qual,
> where -Wdiscard-qual is the default.
>
> I notice that these are pedwarns,

In that case, can we enable it only when -pedantic is used (like many
pedwarns)?

> not a warning as in the -Wcast-qual case, which means that the ISO
> C standard requires a diagnostic here. For this reason, it may not
> be wise to add an option to disable the warnings.  You may lose
> portability to other compilers.
> --
> Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
-
Devang


Problems with MIPS cross compiling for GCC-4.1.0...

2005-04-18 Thread Steven J. Hill
Greetings.

While I am getting closer to full toolchain build, GCC-4.1.0 is still
not behaving the way it should. Below is the output that I am running
up against. I attempted to define a stack variable to hold the value
of zero and tried using that instead of the actual value, but nothing
worked. I had a similar problem with 'do_waitid' and I have attached
the patch just for the sake of discussion. Does anyone have some
insight on this? I am using binutils-2.15, glibc-2.3.4, 2.6.12-rc2
kernel headers and gcc-4.1.0-20050418. Thanks.

-Steve

mips-unknown-linux-gnu-gcc  -mabi=32 
../sysdeps/unix/sysv/linux/mips/pread.c -c -std=gnu99 -O2 -Wall -Winline 
-Wstrict-prototypes -Wwrite-strings -finline-limit=1 -isystem 
/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/linux-2.6.12/include/asm-mips/mach-generic 
  -fexceptions -fasynchronous-unwind-tables   -I../include -I. 
-I/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/build-glibc/posix 
-I.. -I../libio 
-I/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/build-glibc 
-I../sysdeps/mips/elf -I../linuxthreads/sysdeps/unix/sysv/linux/mips 
-I../linuxthreads/sysdeps/unix/sysv/linux 
-I../linuxthreads/sysdeps/pthread -I../sysdeps/pthread 
-I../linuxthreads/sysdeps/unix/sysv -I../linuxthreads/sysdeps/unix 
-I../linuxthreads/sysdeps/mips -I../sysdeps/unix/sysv/linux/mips/mips32 
-I../sysdeps/unix/sysv/linux/mips -I../sysdeps/unix/sysv/linux 
-I../sysdeps/gnu -I../sysdeps/unix/common -I../sysdeps/unix/mman 
-I../sysdeps/unix/inet -I../sysdeps/unix/sysv 
-I../sysdeps/unix/mips/mips32 -I../sysdeps/unix/mips -I../sysdeps/unix 
-I../sysdeps/posix -I../sysdeps/mips/mips32 -I../sysdeps/mips 
-I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754/dbl-64 
-I../sysdeps/wordsize-32 -I../sysdeps/mips/fpu -I../sysdeps/ieee754 
-I../sysdeps/generic/elf -I../sysdeps/generic -nostdinc -isystem 
/opt/crosstool/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/lib/gcc/mips-unknown-linux-gnu/4.1.0/include 
-isystem 
/opt/crosstool/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/mips-unknown-linux-gnu/include 
-D_LIBC_REENTRANT -include ../include/libc-symbols.h  -DPIC -o 
/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/build-glibc/posix/pread.o 
-MD -MP -MF 
/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/build-glibc/posix/pread.o.dt 
-MT 
/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/build-glibc/posix/pread.o
../sysdeps/unix/sysv/linux/mips/pread.c: In function '__libc_pread':
../sysdeps/unix/sysv/linux/mips/pread.c:69: error: memory input 6 is not 
directly addressable
../sysdeps/unix/sysv/linux/mips/pread.c:86: error: memory input 6 is not 
directly addressable
make[2]: *** 
[/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/build-glibc/posix/pread.o] 
Error 1
make[2]: Leaving directory 
`/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/glibc-2.3.4/posix'
make[1]: *** [posix/subdir_lib] Error 2
make[1]: Leaving directory 
`/home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/glibc-2.3.4'
make: *** [lib] Error 2

../sysdeps/unix/sysv/linux/waitid.c: In function 'do_waitid':
../sysdeps/unix/sysv/linux/waitid.c:52: error: memory input 6 is not directly addressable
../sysdeps/unix/sysv/linux/waitid.c:55: error: memory input 6 is not directly addressable

diff -ur glibc-2.3.4/sysdeps/unix/sysv/linux/waitid.c glibc-2.3.4-patched/sysdeps/unix/sysv/linux/waitid.c
--- glibc-2.3.4/sysdeps/unix/sysv/linux/waitid.c	2004-10-30 13:01:02.0 -0500
+++ glibc-2.3.4-patched/sysdeps/unix/sysv/linux/waitid.c	2005-04-18 19:01:28.334689002 -0500
@@ -47,12 +47,14 @@
 do_waitid (idtype_t idtype, id_t id, siginfo_t *infop, int options)
 {
   static int waitid_works;
+  struct rusage *sim = NULL;
+
   if (waitid_works > 0)
-return INLINE_SYSCALL (waitid, 5, idtype, id, infop, options, NULL);
+return INLINE_SYSCALL (waitid, 5, idtype, id, infop, options, sim);
   if (waitid_works == 0)
 {
   int result = INLINE_SYSCALL (waitid, 5,
-   idtype, id, infop, options, NULL);
+   idtype, id, infop, options, sim);
   if (result < 0 && errno == ENOSYS)
 	waitid_works = -1;
   else


Re: internal compiler error at dwarf2out.c:8362

2005-04-18 Thread James E Wilson
Martin Koegler wrote:
> I added to the i386 version the following code (using an unmodified gcc for the rest):

With this change, I can reproduce the problem.
I noticed that I get a failure for all types, not just array types. 
This is different than what you described earlier, but perhaps the 
difference is that we don't have the questionable code to create new 
types here.

I debugged this a bit more.  The situation seems to be this:
1) We create a first array type when we parse the array decl in the 
first line (build_array_type).
2) We create a second array type when handling the typedef 
(clone_underlying_type).  This one gets TYPE_NAME set to the typedef, 
and DECL_ORIGINAL_TYPE of the typedef points to the first one.
3) We create a third array type when parsing the attributes.  See the 
call to build_type_attribute_variant in attribs.c.  This is a complete 
copy, so it still has the GROUP9_T TYPE_NAME.
4) We create a fourth array type when handling the second typedef.  This 
gets the TYPE_NAME EGROUP9_T, and the typedef has DECL_ORIGINAL_TYPE set 
to point to the third array type.

When we emit debug info, we emit debug info for the types used by the 
typedefs, which are the second and fourth one.  The debug info for the 
second one is OK.  The debug info for the fourth one runs into trouble. 
 We follow DECL_ORIGINAL_TYPE to get the third array type, and then we 
follow TYPE_NAME to get the second array type, and then we notice that 
we already emitted debug info for this type.  After we return, we double 
check to make sure we have debug info for this type, and fail, because 
this is not the same array type as we emitted earlier.

I think the broken type is the 3rd one.  I mentioned in an earlier 
message that you had two types with the same TYPE_NAME which was wrong. 
 This is happening in build_type_attribute_variant.  Clearing TYPE_NAME 
here seems to solve the problem, though I haven't done any testing to 
see if this is safe.

Maybe a bug report to keep track of this info would be useful.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: Problems with MIPS cross compiling for GCC-4.1.0...

2005-04-18 Thread Eric Christopher

> the patch just for the sake of discussion. Does anyone have some
> insight on this? I am using binutils-2.15, glibc-2.3.4, 2.6.12-rc2
> kernel headers and gcc-4.1.0-20050418. Thanks.
> 

I'd use 2.16 binutils, especially if using mainline gcc, but that's not
as relevant here...

> /home/sjhill/mips-nptl/crosstool-0.31-nptl/build/mips-unknown-linux-gnu/gcc-4.1.0-20050418-glibc-2.3.4/build-glibc/posix/pread.o
> ../sysdeps/unix/sysv/linux/mips/pread.c: In function '__libc_pread':
> ../sysdeps/unix/sysv/linux/mips/pread.c:69: error: memory input 6 is not 
> directly addressable
> ../sysdeps/unix/sysv/linux/mips/pread.c:86: error: memory input 6 is not 
> directly addressable

Probably a problem with the INLINE_SYSCALL macro? Can you post a smaller
preprocessed testcase? (or at the outside the preprocessed file)

-eric



Re: [RFC] warning: initialization discards qualifiers from pointer target type

2005-04-18 Thread Eric Christopher

> >
> > Though of course, this doesn't mean that we can't have an option to  
> > control it.  -Wno-cast-qual doesn't seem like the right choice, as  
> > there is no user cast here.  Maybe something like -Wno-discard- 
> > qual, where -Wdiscard-qual is the default.
> >
> > I notice that these are pedwarns,
> 
> In that case, we can enable it only when -pedantic is used (like many  
> pedwarns) ?

You could, but in this case it's probably best to fix the code...

-eric



Re: sync operations: where's the barrier?

2005-04-18 Thread Richard Henderson
On Mon, Apr 18, 2005 at 02:48:27PM -0700, Geoffrey Keating wrote:
> The documentation for the atomic operation patterns says things like:
> 
> >This pattern must issue any memory barrier instructions such that the
> >pattern as a whole acts as a full barrier.
> 
> Should the barrier happen before the operation, after the operation, 
> are there two barriers, or is it undefined?

Two barriers.
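
As a source-level analogy only (the thread is about RTL patterns in the
machine description, not about C code), "two barriers" can be sketched with
C11 <stdatomic.h>: a relaxed atomic operation bracketed by a full fence on
each side. The helper name fetch_add_full_barrier is hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical helper: the add itself is relaxed, and the two explicit
   sequentially consistent fences make the sequence as a whole behave as
   a full barrier. */
int fetch_add_full_barrier(atomic_int *p, int v)
{
    int old;

    atomic_thread_fence(memory_order_seq_cst);   /* barrier before */
    old = atomic_fetch_add_explicit(p, v, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* barrier after */
    return old;                                  /* value before the add */
}
```

On targets whose atomic instructions already imply a full barrier, one or
both fences may cost nothing; the sketch just makes the ordering explicit.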


r~


Re: Problems with MIPS cross compiling for GCC-4.1.0...

2005-04-18 Thread Dan Kegel
Steven J. Hill wrote:
While I am getting closer to full toolchain build, GCC-4.1.0 is still
not behaving the way it should. Below is the output that I am running
up against. I attempted to define a stack variable to hold the value
of zero and tried using that instead of the actual value, but nothing
worked. I had a similar problem with 'do_waitid' and I have attached
the patch just for the sake of discussion. Does anyone have some
insight on this? I am using binutils-2.15, glibc-2.3.4, 2.6.12-rc2
kernel headers and gcc-4.1.0-20050418. Thanks.
../sysdeps/unix/sysv/linux/waitid.c: In function 'do_waitid':
../sysdeps/unix/sysv/linux/waitid.c:52: error: memory input 6 is not directly addressable
../sysdeps/unix/sysv/linux/waitid.c:55: error: memory input 6 is not directly addressable
diff -ur glibc-2.3.4/sysdeps/unix/sysv/linux/waitid.c glibc-2.3.4-patched/sysdeps/unix/sysv/linux/waitid.c
--- glibc-2.3.4/sysdeps/unix/sysv/linux/waitid.c 2004-10-30 13:01:02.0 -0500
+++ glibc-2.3.4-patched/sysdeps/unix/sysv/linux/waitid.c 2005-04-18 19:01:28.334689002 -0500
@@ -47,12 +47,14 @@
 do_waitid (idtype_t idtype, id_t id, siginfo_t *infop, int options)
 {
   static int waitid_works;
+  struct rusage *sim = NULL;
+
   if (waitid_works > 0)
-    return INLINE_SYSCALL (waitid, 5, idtype, id, infop, options, NULL);
+    return INLINE_SYSCALL (waitid, 5, idtype, id, infop, options, sim);
   if (waitid_works == 0)
     {
       int result = INLINE_SYSCALL (waitid, 5,
-                                   idtype, id, infop, options, NULL);
+                                   idtype, id, infop, options, sim);
Perhaps INLINE_SYSCALL needs some work to be gcc-4 compatible?
(tap tap tap)
Yep.  Check out the recent changes in
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h?cvsroot=glibc
I bet applying
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h.diff?r1=1.2&r2=1.3&cvsroot=glibc
and maybe the next one
http://sourceware.org/cgi-bin/cvsweb.cgi/libc/sysdeps/unix/sysv/linux/mips/mips32/sysdep.h.diff?r1=1.3&r2=1.4&cvsroot=glibc
will cure what ails ye.
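
A hedged sketch of what the error message is complaining about, assuming
(the macro body is not reproduced here, so this is only an assumption) that
the fifth syscall argument is handed to the asm through an "m" constraint.
An "m" operand has to name an object that lives in memory; a constant or a
cast expression such as (long) arg5 is not an lvalue, so gcc 4 has nothing
it can address directly. The function pass_fifth_arg is hypothetical and
its asm template is deliberately empty so that it builds on any GCC target.

```c
/* Hypothetical stand-in for a syscall wrapper, for illustration only. */
long pass_fifth_arg(long arg5)
{
    /* Form that draws "memory input N is not directly addressable":
         "m" ((long) arg5)
       The cast yields a value, not an object in memory. */

    long slot = arg5;   /* addressable copy on the stack */

    /* Empty template: only the operand constraint matters here. */
    __asm__ volatile ("" : : "m" (slot) : "memory");
    return slot;
}
```

Note that a stack variable passed into a macro only helps if the macro does
not cast it again before it reaches the constraint.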
- Dan
--
Trying to get a job as a c++ developer?  See 
http://kegel.com/academy/getting-hired.html


Reload Issue -- I can't believe we haven't hit this before

2005-04-18 Thread Jeffrey A Law

So the combination of the TCB merge plus the pending jump threading
changes apparently has tickled a reload bug which manifests itself with
the stage1 compiler mis-compiling the stage2 compiler.

Upon entry into local-alloc we have the following key insns:

(insn:HI 88 85 89 10 (set (reg:QI 66 [ D.23558 ])
(mem/s/u:QI (const:SI (plus:SI (symbol_ref:SI
("tree_code_length") [flags 0x40] )
(const_int 50 [0x32]))) [0 tree_code_length+50 S1
A8])) 45 {*movqi_1} (nil)
(nil))

[ ... ]

(insn:HI 201 177 231 27 (set (reg:QI 66 [ D.23558 ])
(reg:QI 95 [ tree_code_length+50 ])) 45 {*movqi_1} (nil)
(expr_list:REG_DEAD (reg:QI 95 [ tree_code_length+50 ])
(expr_list:REG_EQUAL (mem/s/u:QI (const:SI (plus:SI
(symbol_ref:SI ("tree_code_length") [flags 0x40] )
(const_int 50 [0x32]))) [0 tree_code_length+50
S1 A8])
(nil))))


[ It is highly likely the second insn initially was an identical copy
  of the first created by jump threading and was later simplified as
  the value of (mem (plus (tree_code_length) (50))) was lying around
  in a convenient place that dominated the second insn. ]

Anyway, we'll end up promoting the REG_EQUAL note in the second
insn into a REG_EQUIV note (update_equiv_regs) and we'll also have
added a REG_EQUIV note to the first insn.

Neither local-alloc nor global allocation is able to assign a hard
register for (reg:QI 66).

So at reload time we have:


(insn:HI 88 85 89 10 (set (reg:QI 66 [ D.23558 ])
(mem/s/u:QI (const:SI (plus:SI (symbol_ref:SI
("tree_code_length") [flags 0x40] )
(const_int 50 [0x32]))) [0 tree_code_length+50 S1
A8])) 45 {*movqi_1} (nil)
(expr_list:REG_EQUIV (mem/s/u:QI (const:SI (plus:SI (symbol_ref:SI
("tree_code_length") [flags 0x40] )
(const_int 50 [0x32]))) [0 tree_code_length+50 S1
A8])
(nil)))



[ ... ]

(insn:HI 201 177 231 26 (set (reg:QI 66 [ D.23558 ])
(reg:QI 95 [ tree_code_length+50 ])) 45 {*movqi_1} (nil)
(expr_list:REG_DEAD (reg:QI 95 [ tree_code_length+50 ])
(expr_list:REG_EQUIV (mem/s/u:QI (const:SI (plus:SI
(symbol_ref:SI ("tree_code_length") [flags 0x40] )
(const_int 50 [0x32]))) [0 tree_code_length+50
S1 A8])
(nil))))



Insn 88 gets deleted as it sets up the equivalence between (reg:QI 66)
and the memory location.

We record the equivalence between (reg:QI 66) and the memory location
in reg_equiv_mem.

We then proceed to eliminate (reg:QI 66) by replacing it with its
equivalent memory location, resulting in:

(insn:HI 201 177 231 26 (set (mem/s/u:QI (const:SI (plus:SI
(symbol_ref:SI ("tree_code_length") [flags 0x40] )
(const_int 50 [0x32]))) [0 tree_code_length+50 S1
A8])
(reg:QI 1 dx [orig:95 tree_code_length+50 ] [95])) 45 {*movqi_1}
(nil)
(expr_list:REG_DEAD (reg:QI 1 dx [orig:95 tree_code_length+50 ]
[95])
(expr_list:REG_EQUIV (mem/s/u:QI (const:SI (plus:SI
(symbol_ref:SI ("tree_code_length") [flags 0x40] )
(const_int 50 [0x32]))) [0 tree_code_length+50
S1 A8])
(nil))))


Which faults because the memory location is actually read-only memory.


What's not clear to me is how best to fix this.

We could try to delete all assignments to pseudos which are equivalent
to MEMs.

We could avoid recording equivalences when the pseudo is set more than
once.

Other possibilities?
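
To make the second option concrete, here is a toy model in C. It is not
GCC's code: struct equiv_table, note_set and maybe_record_equiv are
hypothetical names, and the real bookkeeping in reload is far more
involved. It only shows the rule "record a pseudo/MEM equivalence only when
the pseudo has exactly one setter".

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PSEUDOS 128   /* arbitrary size for the toy model */

/* Toy bookkeeping, not GCC's data structures.  Callers must keep
   regno below MAX_PSEUDOS. */
struct equiv_table {
    int set_count[MAX_PSEUDOS];           /* insns that set each pseudo */
    const char *equiv_mem[MAX_PSEUDOS];   /* recorded MEM, or NULL */
};

/* Note one more insn that sets pseudo `regno`. */
void note_set(struct equiv_table *t, int regno)
{
    t->set_count[regno]++;
}

/* Record `mem` as equivalent to `regno` only if the pseudo has exactly
   one setter.  Returns true when the equivalence was recorded. */
bool maybe_record_equiv(struct equiv_table *t, int regno, const char *mem)
{
    if (t->set_count[regno] != 1)
        return false;   /* set more than once (or never): refuse */
    t->equiv_mem[regno] = mem;
    return true;
}
```

In the case above, pseudo 66 is set by both insn 88 and insn 201, so under
this rule no equivalence would be recorded for it and the later store into
read-only memory would not be generated.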

jeff



Re: GCC 4.0 RC2 Available

2005-04-18 Thread Eric Botcazou
> For sparc-sun-solaris2.8, I get a failure when building the Java compiler,
> but I may be doing something wrong, as I usually avoid the Java build
> on Solaris (since it takes most of a day to build and test).

Known glitch.  You have to find out why configure thinks you have libiconv 
installed and yet the library is not found.

> I do have a build report that was generated over the weekend for
> sparc-sun-solaris2.8 that does not contain Java, it is at
>
> http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01245.html
>
> with tests for both 32 and 64 bits.  It shows additional failures in
> 64-bit mode that do not appear in 32-bit mode.

Binutils 2.15 bug in 64-bit mode.  Didn't you get my message via Binutils' 
Bugzilla?  The patch is at:
http://sourceware.org/ml/binutils-cvs/2005-01/msg00019.html

SPARC/Solaris is OK:
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01135.html
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01302.html
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01300.html
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01299.html
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01298.html
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01297.html

-- 
Eric Botcazou


Re: Reload Issue -- I can't believe we haven't hit this before

2005-04-18 Thread Eric Botcazou
> So the combination of the TCB merge plus the pending jump threading
> changes apparently has tickled a reload bug which manifests itself with
> the stage1 compiler mis-compiling the stage2 compiler.
>
> [...]
>
> Which faults because the memory location is actually read-only memory.

PR rtl-optimization/15248.

> What's not clear to me is how best to fix this.
>
> We could try to delete all assignments to pseudos which are equivalent
> to MEMs.
>
> We could avoid recording equivalences when the pseudo is set more than
> once.
>
> Other possibilities?

For 3.3 and 3.4, this was "fixed" by not recording memory equivalences that 
have the infamous RTX_UNCHANGING_P flag set.

-- 
Eric Botcazou


Re: GCC 4.0 RC1 Available

2005-04-18 Thread Paolo Bonzini
Kaveh R. Ghazi wrote:
When this patch went into 4.0, Paolo didn't regenerate the top level
configure, although the ChangeLog claims he did:
http://gcc.gnu.org/ml/gcc-cvs/2005-04/msg00842.html
 

You're right.  I was being conservative and typed the "cvs ci" filenames 
manually, but in this case there was no need because I worked off a 
fresh checkout.  Sorry.

The patch should also be applied to mainline, since the "break"
problem exists there too.  I'm not sure why it wasn't, but perhaps
your "OK for 4.0.0" didn't specify mainline and Paolo was being
conservative.  I think we should fix it there also.
 

Yes, I was.  But it looks like the build machinery maintainers are busy 
and toplevel patches go largely unnoticed.

Paolo