Re: RFC: Adding non-PIC executable support to MIPS

2008-07-02 Thread Adam Nemet
Richard Sandiford writes:
> However, IMO, your argument about MTI being the central authority
> is a killer one.  The purpose of the GNU tools should be to follow
> appropriate standards where applicable (and extend them where it
> seems wise).  So from that point of view, I agree that the GNU tools
> should follow the ABI that Nigel and MTI set down.  Consider my
> patch withdrawn.

While I'm not entirely clear how this decision came about I'd like to point
out that it is unfortunate that MTI had not sought wider consensus for this
ABI extension among MIPS implementors and the community.

We would not be in this situation with duplicated efforts and much frustration
if this proposal had been circulated properly ahead of time.

> I've been thinking about that a lot recently, since I heard about
> your implementation.  I kind-of guessed it had been agreed with MTI
> beforehand (although I hadn't realised MTI themselves had written
> the specification).  Having thought it over, I think it would be best
> if I stand down as a MIPS maintainer and if someone with the appropriate
> commercial connections is appointed instead.  I'd recommend any
> combination of yourself, Adam Nemet and David Daney (subject to
> said people being willing, of course).

Richard, while I understand your frustration I really hope that you will
reconsider your decision and remain the MIPS maintainer.  I think there is a
chance that if the community expresses that MTI should seek broader consensus
for such proposals they will do so in the future.

Your expertise as the GCC maintainer has improved the backend tremendously and
you should be given all the information necessary to continue your great
work.

Adam


Re: Feature request - a macro defined for GCC

2008-07-02 Thread Andrew Haley
Jim Wilson wrote:
> If the Intel compiler correctly implements the GNU C language,
> then it shouldn't matter if the code is being compiled by GCC or ICC.
> Unless maybe you ran into a GCC bug, and want to enable a workaround
> only for GCC.

I think you'd want to conditionalize such a test on the GCC version
anyway.

Andrew.
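
A minimal sketch of what such a version-conditional test could look like.
The helper macro GCC_VERSION_AT_LEAST and the 4.3 threshold are
hypothetical, chosen only for illustration; __GNUC__, __GNUC_MINOR__ and
__INTEL_COMPILER are the real predefined macros.  Since ICC also defines
__GNUC__, it is excluded explicitly:

```c
/* Hypothetical helper (not from this thread): true when the compiler
   reports GNU C version major.minor or later. */
#if defined(__GNUC__)
# define GCC_VERSION_AT_LEAST(major, minor) \
    (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
# define GCC_VERSION_AT_LEAST(major, minor) 0
#endif

/* Returns 1 if the workaround branch was compiled in, 0 otherwise.
   The workaround is enabled only for a compiler that claims to be GNU C,
   is not ICC, and reports a version older than the (illustrative) 4.3. */
int uses_gcc_workaround(void)
{
#if defined(__GNUC__) && !defined(__INTEL_COMPILER) && !GCC_VERSION_AT_LEAST(4, 3)
    return 1;
#else
    return 0;
#endif
}
```

Other compilers that define __GNUC__ for compatibility would need their
own exclusion in the same condition.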


Re: RFC: Adding non-PIC executable support to MIPS

2008-07-02 Thread Thiemo Seufer
Richard Sandiford wrote:
> Daniel Jacobowitz <[EMAIL PROTECTED]> writes:
> > We've shipped our version.  Richard's version has presumably also
> > shipped.
> 
> Right.
> 
> > We did negotiate the ABI changes with MTI; this is not quite
> > as good as doing it in full view, but it was the best we could manage
> > and MTI is as close to a central authority for the MIPS psABI as
> > exists today.
> >
> > Richard, what are your thoughts on reconciling the differences?  You
> > can surely guess that I want to avoid changing our ABI now, even for
> > relatively significant technical reasons - I'm all ears if there's a
> > major reason, but in the comparisons I do not see one.
> 
> I suppose I still support the trade-off between the 5-insn MIPS I stubs
> (with extra-long variation for large PLT indices) and the absolute
> .got.plt address I used.  And I still think it's shame we're treating
> STO_MIPS_PLT and STO_MIPS16 as separate; we then only have 1 bit of
> st_other unclaimed.
> 
> However, IMO, your argument about MTI being the central authority
> is a killer one.  The purpose of the GNU tools should be to follow
> appropriate standards where applicable (and extend them where it
> seems wise).  So from that point of view, I agree that the GNU tools
> should follow the ABI that Nigel and MTI set down.  Consider my
> patch withdrawn.
> 
> TBH, the close relationship between CodeSourcery and MTI
> make it difficult for a non-Sourcerer and non-MTI employee
> to continue to be a MIPS maintainer.  I won't be in-the-know
> about this sort of thing.
> 
> I've been thinking about that a lot recently, since I heard about
> your implementation.  I kind-of guessed it had been agreed with MTI
> beforehand (although I hadn't realised MTI themselves had written
> the specification).

The specification is a co-production of MTI and CS. I believe the
reason why it wasn't discussed with a wider audience is that it occurred
to nobody that there could be a parallel effort going on after all those
years!

> Having thought it over, I think it would be best
> if I stand down as a MIPS maintainer and if someone with the appropriate
> commercial connections is appointed instead.  I'd recommend any
> combination of yourself, Adam Nemet and David Daney (subject to
> said people being willing, of course).

FWIW, I believe a person who is _not_ in the midst of the commercial
pressures adds valuable perspective as a maintainer.


Thiemo


Inefficient loop unrolling.

2008-07-02 Thread Bingfeng Mei
Hello,
I am looking at GCC's loop unrolling and find it quite inefficient
compared with a manually unrolled loop, even for a very simple loop.
Below are a simple loop and its manually unrolled version. I didn't
apply any tricks to the manually unrolled one; it is an exact replication
of the original loop body. I expected that with -funroll-loops the first
version would produce code of similar quality to the second one. However,
compiled for the ARM target with mainline GCC, the two functions produce
very different results.

The GCC-unrolled version mainly suffers from two issues. First, the
load/store offsets are registers, so extra ADD instructions are needed to
advance the offset on each iteration. In contrast, the manually unrolled
code uses immediate offsets efficiently and needs only one ADD to
adjust the base register at the end. Second, the alias (dependence) analysis
is overly conservative. The LOAD instruction of the next unrolled iteration
cannot be moved past the previous STORE instruction even though they clearly
do not alias. I suspect the failure of alias analysis is related to the
first issue of handling base and offset addresses. The .sched2 file shows
that the first loop body requires 57 cycles whereas the second one takes
50 cycles for arm9 (56 cycles vs 34 cycles for Xscale).  It becomes even
worse for our VLIW port due to the longer latency of MUL and load
instructions and the inability to fill all slots (120 cycles vs. 20
cycles).

From analyzing the compilation phases, I believe that if loop unrolling
happened at the tree level, or if we had an optimizing pass like "ivopts"
after loop unrolling at the RTL level, GCC could produce far more efficient
loop-unrolled code.  The "ivopts" pass really does a wonderful job of
optimizing induction variables. Strangely, I found some unrolling
functions at the tree level, but there is no independent tree-level loop
unrolling pass except "cunroll", which does complete unrolling.  What
prevents such a tree-level unrolling pass? Or is there any suggestion for
improving the existing RTL-level unrolling? Thanks in advance.

Cheers,
Bingfeng Mei
Broadcom UK


void Unroll(short s, int *restrict b_inout, int *restrict out)
{
    int i;
    for (i = 0; i < 64; i++)
    {
        b_inout[i] = b_inout[i] * s;
    }
}


void ManualUnroll(short s, int *restrict b_inout, int *restrict out)
{
    int i;
    for (i = 0; i < 64;)
    {
        b_inout[i] = b_inout[i] * s;
        i++;
        b_inout[i] = b_inout[i] * s;
        i++;
        b_inout[i] = b_inout[i] * s;
        i++;
        b_inout[i] = b_inout[i] * s;
        i++;
        b_inout[i] = b_inout[i] * s;
        i++;
        b_inout[i] = b_inout[i] * s;
        i++;
        b_inout[i] = b_inout[i] * s;
        i++;
        b_inout[i] = b_inout[i] * s;
        i++;
    }
}


arm-elf-gcc tst2.c -O2  -std=c99 -S  -v -fdump-tree-all  -da  -mcpu=arm9 -funroll-loops
Unroll:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r0, r0, asl #16
stmfd   sp!, {r4, r5, r6}
mov r4, r1
mov r6, r0, asr #16
mov r5, #0
.L2:
ldr r1, [r4, r5]
add ip, r5, #4
mul r0, r6, r1
str r0, [r4, r5]
ldr r3, [r4, ip]
add r0, ip, #4
mul r2, r6, r3
str r2, [r4, ip]
ldr r1, [r4, r0]
add ip, r5, #12
mul r3, r6, r1
str r3, [r4, r0]
ldr r2, [r4, ip]
add r1, r5, #16
mul r3, r6, r2
str r3, [r4, ip]
ldr r0, [r4, r1]
add ip, r5, #20
mul r3, r6, r0
str r3, [r4, r1]
ldr r2, [r4, ip]
add r1, r5, #24
mul r0, r6, r2
str r0, [r4, ip]
ldr r3, [r4, r1]
add ip, r5, #28
mul r0, r6, r3
str r0, [r4, r1]
ldr r2, [r4, ip]
add r5, r5, #32
mul r3, r6, r2
cmp r5, #256
str r3, [r4, ip]
bne .L2
ldmfd   sp!, {r4, r5, r6}
bx  lr
.size   Unroll, .-Unroll

arm-elf-gcc tst2.c -O2  -std=c99 -S  -v -fdump-tree-all  -da  -mcpu=arm9

ManualUnroll:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r0, r0, asl #16
stmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp}
mov sl, r1
mov r9, r0, asr #16
add fp, r1, #256
.L7:
ldr r3, [sl, #0]
ldr r2, [sl, #4]
ldr r1, [sl, #8]
ldr r0, [sl, #12]
ldr ip, [sl, #16]
add r4, sl, #20
ldmia   r4, {r4, r5, r6}@ phole ldm
mul r7, r9, r3
   

Re: Inefficient loop unrolling.

2008-07-02 Thread Richard Guenther
On Wed, Jul 2, 2008 at 1:13 PM, Bingfeng Mei <[EMAIL PROTECTED]> wrote:
> Hello,
> I am looking at GCC's loop unrolling and find it quite inefficient
> compared with manually unrolled loop even for very simple loop. The
> followings are a simple loop and its manually unrolled version. I didn't
> apply any trick on manually unrolled one as it is exact replications of
> original loop body. I have expected by -funroll-loops the first version
> should produce code of similar quality as the second one. However,
> compiled with ARM target of mainline GCC, both functions produce very
> different results.
>
> GCC-unrolled version mainly suffers from two issues. First, the
> load/store offsets are registers. Extra ADD instructions are needed to
> increase offset over iteration. In the contrast, manually unrolled code
> makes use of immediate offset efficiently and only need one ADD to
> adjust base register in the end. Second, the alias (dependence) analysis
> is over conservative. The LOAD instruction of next unrolled iteration
> cannot be moved beyond previous STORE instruction even they are clearly
> not aliased. I suspect the failure of alias analysis is related to the
> first issue of handling base and offset address. The .sched2 file shows
> that the first loop body requires 57 cycles whereas the second one takes
> 50 cycles for arm9 (56 cycles vs 34 cycles for Xscale).  It become even
> worse for our VLIW porting due to longer latency of MUL and Load
> instructions and incapability of filling all slots (120 cycles vs. 20
> cycles)
>
> By analyzing compilation phases, I believe if the loop unrolling happens
> at the tree-level, or if we have an optimizing pass like "ivopts" after
> loop unrolling in RTL level, GCC can produce far more efficient
> loop-unrolled code.  "ivopts" pass really does a wonderful job in
> optimizing induction variables. Strangely, I found some unrolling
> functions at tree-level, but there is no independent tree-level loop
> unrolling pass except "cunroll", which is complete unrolling.  What
> prevents such a tree-level unrolling pass? Or is there any suggestion to
> improve existing RTL level unrolling? Thanks in advance.

On the tree level only complete unrolling is done.  The reason for this
was the difficulty (or our unwillingness) to properly tune this for the
target (loop unrolling is not a generally profitable optimization, unless
complete unrolling for small loops).

I would suggest looking into doing partial unrolling on the tree level as well.

Richard.
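
To make the distinction concrete: complete unrolling removes the loop
entirely when the trip count is a small constant, while partial unrolling
keeps a loop but replicates the body a fixed number of times.  A
source-level sketch of the latter, assuming an illustrative unroll factor
of 4 and a hypothetical function name (this is hand-written C, not what
any GCC pass emits):

```c
#include <stddef.h>

/* Partially unrolled by a factor of 4 (illustrative choice).
   Works for any n, including n not divisible by 4. */
void scale_unroll4(int *restrict b, size_t n, short s)
{
    size_t i = 0;

    /* Main loop: four copies of the original body per iteration. */
    for (; i + 4 <= n; i += 4) {
        b[i]     = b[i]     * s;
        b[i + 1] = b[i + 1] * s;
        b[i + 2] = b[i + 2] * s;
        b[i + 3] = b[i + 3] * s;
    }

    /* Remainder loop: handles the at most three leftover elements. */
    for (; i < n; i++)
        b[i] = b[i] * s;
}
```

Choosing the factor is the target-dependent tuning problem mentioned
above: a larger factor amortizes more loop overhead but increases code
size and register pressure.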


Re: RFC: Adding non-PIC executable support to MIPS

2008-07-02 Thread Daniel Jacobowitz
On Tue, Jul 01, 2008 at 09:43:30PM +0100, Richard Sandiford wrote:
> I suppose I still support the trade-off between the 5-insn MIPS I stubs
> (with extra-long variation for large PLT indices) and the absolute
> .got.plt address I used.  And I still think it's shame we're treating
> STO_MIPS_PLT and STO_MIPS16 as separate; we then only have 1 bit of
> st_other unclaimed.

I'm undecided about the MIPS I issue, but I completely agree about
STO_MIPS16/STO_MIPS_PLT.  I wish we'd thought of that too.  At least
our implementation didn't have STO_MIPS_PIC; so there's one bit left,
and assuming we add support for ld -r (likely) we can do it your way.

For the final merged versions of these patches, even if they implement
"our" version, I hope to draw heavily on your work.  It's always high
quality and the GOT cleanups in particular look very useful.

> TBH, the close relationship between CodeSourcery and MTI
> make it difficult for a non-Sourcerer and non-MTI employee
> to continue to be a MIPS maintainer.  I won't be in-the-know
> about this sort of thing.
> 
> I've been thinking about that a lot recently, since I heard about
> your implementation.  I kind-of guessed it had been agreed with MTI
> beforehand (although I hadn't realised MTI themselves had written
> the specification).  Having thought it over, I think it would be best
> if I stand down as a MIPS maintainer and if someone with the appropriate
> commercial connections is appointed instead.  I'd recommend any
> combination of yourself, Adam Nemet and David Daney (subject to
> said people being willing, of course).

I'm sorry you feel this way.  I believe strongly that corporate
affiliation is not a useful indicator for maintainership; the system
we have set up here relies more on individual knowledge and experience
than affiliation.

We could have done more to keep everyone informed of our progress; I
could write an essay on why we didn't, but I'd rather not.  We're
talking internally about how to avoid this unfortunate coincidence in
the future.  Anyway, there's plenty of blame to go around.

I think you're doing an excellent job as a GCC maintainer, and so does
everyone I spoke to about this at CS.  If you no longer have the time
or incentive to do it, I won't argue with you about stepping down, but
please don't because of this incident.

[In any case, I'd decline; I'm trying to shed some of my existing
maintenance responsibilities so that I can spend more time on the ones
I care most about.  Anyone else want to be binutils release manager?]

-- 
Daniel Jacobowitz
CodeSourcery


RE: Inefficient loop unrolling.

2008-07-02 Thread Bingfeng Mei
Loop unrolling is often a big deal for embedded processors. It makes a
tenfold performance difference for our VLIW processor in many loops.
I will look into partial loop unrolling at the tree level.  If possible, I
would like to make some contributions to GCC. I just learned that my company
(Broadcom) has signed a company-wide FSF copyright assignment.

Cheers,
Bingfeng

-----Original Message-----
From: Richard Guenther [mailto:[EMAIL PROTECTED] 
Sent: 02 July 2008 12:59
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: Inefficient loop unrolling.

On Wed, Jul 2, 2008 at 1:13 PM, Bingfeng Mei <[EMAIL PROTECTED]> wrote:
> Hello,
> I am looking at GCC's loop unrolling and find it quite inefficient
> compared with manually unrolled loop even for very simple loop. The
> followings are a simple loop and its manually unrolled version. I didn't
> apply any trick on manually unrolled one as it is exact replications of
> original loop body. I have expected by -funroll-loops the first version
> should produce code of similar quality as the second one. However,
> compiled with ARM target of mainline GCC, both functions produce very
> different results.
>
> GCC-unrolled version mainly suffers from two issues. First, the
> load/store offsets are registers. Extra ADD instructions are needed to
> increase offset over iteration. In the contrast, manually unrolled code
> makes use of immediate offset efficiently and only need one ADD to
> adjust base register in the end. Second, the alias (dependence) analysis
> is over conservative. The LOAD instruction of next unrolled iteration
> cannot be moved beyond previous STORE instruction even they are clearly
> not aliased. I suspect the failure of alias analysis is related to the
> first issue of handling base and offset address. The .sched2 file shows
> that the first loop body requires 57 cycles whereas the second one takes
> 50 cycles for arm9 (56 cycles vs 34 cycles for Xscale).  It become even
> worse for our VLIW porting due to longer latency of MUL and Load
> instructions and incapability of filling all slots (120 cycles vs. 20
> cycles)
>
> By analyzing compilation phases, I believe if the loop unrolling happens
> at the tree-level, or if we have an optimizing pass like "ivopts" after
> loop unrolling in RTL level, GCC can produce far more efficient
> loop-unrolled code.  "ivopts" pass really does a wonderful job in
> optimizing induction variables. Strangely, I found some unrolling
> functions at tree-level, but there is no independent tree-level loop
> unrolling pass except "cunroll", which is complete unrolling.  What
> prevents such a tree-level unrolling pass? Or is there any suggestion to
> improve existing RTL level unrolling? Thanks in advance.

On the tree level only complete unrolling is done.  The reason for this
was the difficulty (or our unwillingness) to properly tune this for the
target (loop unrolling is not a generally profitable optimization, unless
complete unrolling for small loops).

I would suggest to look into doing also partial unrolling on the tree level.

Richard.




Re: Feature request - a macro defined for GCC

2008-07-02 Thread Vincent Lefevre
On 2008-07-01 11:11:42 -0700, Ian Lance Taylor wrote:
> __GNUC__ is indeed defined by the compiler proper, not by the
> preprocessor.

What do you mean here?

Even when calling the preprocessor directly, __GNUC__ is defined:

vin% gcc -dM -E -xc /dev/null | grep __GNUC__
#define __GNUC__ 4
vin% cpp -dM /dev/null | grep __GNUC__
#define __GNUC__ 4

> But that in turn does not matter, as if any non-gcc compiler *did* use
> the gcc preprocessor, it would do so via gcc -E.  In gcc, the
> preprocessor is not a separate program.

But in any case, there's a separate preprocessor: cpp. And perhaps cpp
shouldn't define __GNUC__.

(BTW, this isn't a compiler, but xrdb uses cpp by default.)

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: Feature request - a macro defined for GCC

2008-07-02 Thread Vincent Lefevre
On 2008-07-02 00:12:33 +, Joseph S. Myers wrote:
> This internal binary no longer exists. Instead, there is a "cpp"
> binary installed in the user binary directory, which calls the "cc1"
> binary to do the same preprocessing as it does when compiling; that
> is, it has the same effect as "gcc -E".

Not exactly:

vin% cpp -dM /dev/null | wc -l
128
vin% gcc -E -dM /dev/null | wc -l
gcc.real: /dev/null: linker input file unused because linking not done
0

Is it a bug of "gcc -E"?

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: Feature request - a macro defined for GCC

2008-07-02 Thread Jack Lloyd
On Wed, Jul 02, 2008 at 03:47:49PM +0200, Vincent Lefevre wrote:
> On 2008-07-02 00:12:33 +, Joseph S. Myers wrote:
> > This internal binary no longer exists. Instead, there is a "cpp"
> > binary installed in the user binary directory, which calls the "cc1"
> > binary to do the same preprocessing as it does when compiling; that
> > is, it has the same effect as "gcc -E".
> 
> Not exactly:
> 
> vin% cpp -dM /dev/null | wc -l
> 128
> vin% gcc -E -dM /dev/null | wc -l
> gcc.real: /dev/null: linker input file unused because linking not done
> 0
> 
> Is it a bug of "gcc -E"?

Not really, it just doesn't understand it needs to treat an empty file as
C... instead you have to tell it so with -x c

(wks9 ~)$ cpp -dM /dev/null | wc -l
86
(wks9 ~)$ gcc -E -dM /dev/null | wc -l
gcc: /dev/null: linker input file unused because linking not done
0
(wks9 ~)$ gcc -E -x c -dM /dev/null | wc -l
86
(wks9 ~)$ gcc -E -x c++ -dM /dev/null | wc -l
92


Re: Feature request - a macro defined for GCC

2008-07-02 Thread Andreas Schwab
Vincent Lefevre <[EMAIL PROTECTED]> writes:

> On 2008-07-02 00:12:33 +, Joseph S. Myers wrote:
>> This internal binary no longer exists. Instead, there is a "cpp"
>> binary installed in the user binary directory, which calls the "cc1"
>> binary to do the same preprocessing as it does when compiling; that
>> is, it has the same effect as "gcc -E".
>
> Not exactly:
>
> vin% cpp -dM /dev/null | wc -l
> 128
> vin% gcc -E -dM /dev/null | wc -l
> gcc.real: /dev/null: linker input file unused because linking not done
> 0
>
> Is it a bug of "gcc -E"?

You need to tell gcc that /dev/null is a C file, since it does not have
a recognized extension.

$ gcc -E -dM -xc /dev/null | wc -l
120

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


How to identify comparison of 8bit operands

2008-07-02 Thread Mohamed Shafi
Hello all,

I am involved in porting GCC 4.1.2 to a 16-bit target.
The target that I am porting to has a minor flaw: comparison of signed
variables goes wrong. So I have to use a different approach to compare
signed operands, which obviously takes more cycles and
instructions. But comparison of sign-extended 8-bit values works
properly. So I can use the normal comparison for char and the modified
one for 16-bit values. So my question is: in the back end, will I be able
to distinguish between comparisons of sign-extended 8-bit operands and
16-bit operands?

Regards,
Shafi
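
For readers wondering what a "different approach" to signed comparison
might look like, here is one well-known possibility, sketched in plain C.
It assumes that the target's unsigned comparison is not affected by the
flaw, which the message above does not state, and the function name is
hypothetical.  It does not answer the back-end question; it only
illustrates why the workaround costs extra instructions:

```c
#include <stdint.h>

/* Signed 16-bit "less than" built from an unsigned comparison.
   Flipping the sign bit maps the signed range [-32768, 32767] onto the
   unsigned range [0, 65535] while preserving order.
   Assumption: unsigned compares are reliable on the target. */
int signed_less_16(int16_t a, int16_t b)
{
    uint16_t ua = (uint16_t)((uint16_t)a ^ 0x8000u);
    uint16_t ub = (uint16_t)((uint16_t)b ^ 0x8000u);
    return ua < ub;
}
```

The two XORs are the additional cost per comparison; when an operand is a
constant, the flip can be folded at compile time.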


Re: Feature request - a macro defined for GCC

2008-07-02 Thread Ian Lance Taylor
Vincent Lefevre <[EMAIL PROTECTED]> writes:

>> But that in turn does not matter, as if any non-gcc compiler *did* use
>> the gcc preprocessor, it would do so via gcc -E.  In gcc, the
>> preprocessor is not a separate program.
>
> But in any case, there's a separate preprocessor: cpp. And perhaps cpp
> shouldn't define __GNUC__.

You're right, there is a program which appears to be a separate
preprocessor.  In actual fact, that program is just gcc under a
different name.

I think it would be reasonable to argue that that program should not
define __GNUC__ by default.  I don't actually know which choice people
would find more surprising.  And unfortunately I also don't know how
to find out.

Ian


Debug build

2008-07-02 Thread John Freeman

Howdy,

This is something I look into periodically, and each time I find a 
solution that's slightly better, but not what I want.  I've looked at 
the wiki (http://gcc.gnu.org/wiki/DebuggingGCC) many times, so no need 
to refer me there.  I am trying to debug the C++ front-end, and I took 
the wiki's recommendation for building a debuggable compiler

$ make CFLAGS="-g3 -O0" all-stage1

but it did not seem to build the C++ front-end, even though I configured
with --enable-languages=c,c++.  There is no $build-dir/gcc/cc1plus and
$build-dir/gcc/cp is empty.  Also, the command

$ make install

failed afterward with

/bin/sh: line 1: cd: ./fixincludes: No such file or directory

Any help?

- John



Re: Feature request - a macro defined for GCC

2008-07-02 Thread Vincent Lefevre
On 2008-07-02 10:10:32 -0400, Jack Lloyd wrote:
> Not really, it just doesn't understand it needs to treat an empty
> file as C... instead you have to tell it so with -x c

But is there any reason why cpp assumes C as a fallback, but not gcc
(at least with the -E option)? IMHO, this is a bit inconsistent, in
particular if cpp is seen as a synonym for "gcc -E".

vin% cpp -dM /dev/null | md5sum
d7760eedc87eba1427f096989c3e2a49  -
vin% cpp -xc -dM /dev/null | md5sum
d7760eedc87eba1427f096989c3e2a49  -
vin% cpp -xc++ -dM /dev/null | md5sum
0ce80933d788e730beec1886af757d44  -
vin% gcc -E -dM /dev/null | md5sum   
gcc.real: /dev/null: linker input file unused because linking not done
d41d8cd98f00b204e9800998ecf8427e  -
vin% gcc -E -xc -dM /dev/null | md5sum
d7760eedc87eba1427f096989c3e2a49  -
vin% gcc -E -xc++ -dM /dev/null | md5sum
0ce80933d788e730beec1886af757d44  -

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: Feature request - a macro defined for GCC

2008-07-02 Thread Richard Guenther
On Wed, Jul 2, 2008 at 5:16 PM, Vincent Lefevre <[EMAIL PROTECTED]> wrote:
> On 2008-07-02 10:10:32 -0400, Jack Lloyd wrote:
>> Not really, it just doesn't understand it needs to treat an empty
>> file as C... instead you have to tell it so with -x c
>
> But is there any reason why cpp assumes C as a fallback, but not gcc
> (at least with the -E option)? IMHO, this is a bit inconsistent, in
> particular if cpp is seen as a synonym for "gcc -E".

Because it's Cpp, the C preprocessor.

Richard.


Re: Feature request - a macro defined for GCC

2008-07-02 Thread rkiesling
Vincent Lefevre:
[ Charset ISO-8859-1 converted... ]
> On 2008-07-01 11:11:42 -0700, Ian Lance Taylor wrote:
> > __GNUC__ is indeed defined by the compiler proper, not by the
> > preprocessor.
> 
> What do you mean here?
> 
> Even when calling the preprocessor directly, __GNUC__ is defined:
> 
> vin% gcc -dM -E -xc /dev/null | grep __GNUC__
> #define __GNUC__ 4
> vin% cpp -dM /dev/null | grep __GNUC__
> #define __GNUC__ 4
> 
> > But that in turn does not matter, as if any non-gcc compiler *did* use
> > the gcc preprocessor, it would do so via gcc -E.  In gcc, the
> > preprocessor is not a separate program.
> 
> But in any case, there's a separate preprocessor: cpp. And perhaps cpp
> shouldn't define __GNUC__.
> 
> (BTW, this isn't a compiler, but xrdb uses cpp by default.)

Try:

$ echo ' ' | cpp -undef -dM -

and determine if there's any output (varies by platform).

The ctpp preprocessor undefines all builtins when -undef is present.  See the 
URL below (plug, I know).

-- 
Ctalk Home Page: http://www.ctalklang.org


Re: Inefficient loop unrolling.

2008-07-02 Thread Steven Bosscher
On Wed, Jul 2, 2008 at 1:13 PM, Bingfeng Mei <[EMAIL PROTECTED]> wrote:
> Hello,
> I am looking at GCC's loop unrolling and find it quite inefficient
> compared with manually unrolled loop even for very simple loop. The
> followings are a simple loop and its manually unrolled version. I didn't
> apply any trick on manually unrolled one as it is exact replications of
> original loop body. I have expected by -funroll-loops the first version
> should produce code of similar quality as the second one. However,
> compiled with ARM target of mainline GCC, both functions produce very
> different results.
>
> GCC-unrolled version mainly suffers from two issues. First, the
> load/store offsets are registers. Extra ADD instructions are needed to
> increase offset over iteration. In the contrast, manually unrolled code
> makes use of immediate offset efficiently and only need one ADD to
> adjust base register in the end. Second, the alias (dependence) analysis
> is over conservative. The LOAD instruction of next unrolled iteration
> cannot be moved beyond previous STORE instruction even they are clearly
> not aliased. I suspect the failure of alias analysis is related to the
> first issue of handling base and offset address. The .sched2 file shows
> that the first loop body requires 57 cycles whereas the second one takes
> 50 cycles for arm9 (56 cycles vs 34 cycles for Xscale).  It become even
> worse for our VLIW porting due to longer latency of MUL and Load
> instructions and incapability of filling all slots (120 cycles vs. 20
> cycles)

Both issues should be solvable for RTL (where unrolling belongs IMHO).
 If you file a PR in bugzilla (with this test case, target, compiler
options, etc), I promise I will analyze why we don't fold away the
ADDs, and why the scheduler doesn't glob the loads (may be due to bad
alias analysis, but maybe something else is not working properly).

Gr.
Steven
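
For reference, the addressing pattern one would hope to see after the ADDs
are folded away can be written by hand as a single base pointer with
constant offsets and one pointer increment per iteration.  This is only a
source-level sketch with a hypothetical function name (the unused "out"
parameter of the test case is dropped); it mirrors the immediate-offset
loads in the ManualUnroll assembly and is not a description of what any
RTL pass does:

```c
/* Same computation as the 64-element test case, unrolled by 8.
   All eight accesses use constant offsets from one base pointer,
   so only a single add of the base is needed per iteration. */
void scale64_base_offset(short s, int *restrict b_inout)
{
    int *p = b_inout;
    int *end = b_inout + 64;

    while (p != end) {
        p[0] = p[0] * s;
        p[1] = p[1] * s;
        p[2] = p[2] * s;
        p[3] = p[3] * s;
        p[4] = p[4] * s;
        p[5] = p[5] * s;
        p[6] = p[6] * s;
        p[7] = p[7] * s;
        p += 8;   /* the one base-register adjustment */
    }
}
```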


Re: Debug build

2008-07-02 Thread Ian Lance Taylor
John Freeman <[EMAIL PROTECTED]> writes:

> This is something I look into periodically, and each time I find a
> solution that's slightly better, but not what I want.  I've looked at
> the wiki (http://gcc.gnu.org/wiki/DebuggingGCC) many times, so no need
> to refer me there.  I am trying to debug the C++ front-end, and I took
> the wiki's recommendation for building a debuggable compiler
> $ make CFLAGS="-g3 -O0" all-stage1
> but it did not seem to build the C++ front-end, even though I
> configured with --enable-languages=c,c++.  There is no
> $build-dir/gcc/cc1plus and $build-dir/gcc/cp is empty.  Also, the
> command
> $ make install
> failed afterward with
> /bin/sh: line 1: cd: ./fixincludes: No such file or directory
>
> Any help?

Use --enable-stage1-languages=c,c++ .

Ian


Re: dumping the tree

2008-07-02 Thread Taras

Jaroslav Sýkora wrote:

> Hello,
> I am working on a research project in which I want to export a whole
> syntax/semantic tree of a c++ program from the compiler. My current
> solution is to use the -fdump-tree-all option and take the *.t00.tu
> files (translation unit dump). I've hacked the gcc/tree-dump.c so the
> exported graph is in a machine-readable xml file.
> This all works quite well in gcc 4.1.0. But I've hit a problem with
> gcc 4.2 and newer - the dump now doesn't contain any function bodies.
> Specifically, in tree-dump.c::dequeue_and_dump() there is
> case FUNCTION_DECL:
> ...
> dump_child ("body", DECL_SAVED_TREE (t));
> where 't' points to the FUNCTION_DECL tree. It seems that
> DECL_SAVED_TREE(t) is always NULL in gcc >= 4.2.
> Practically I am only interested in the gimple cfg and its basic
> blocks, which I used to get via DECL_STRUCT_FUNCTION(t) - but that
> doesn't work now either. Working copy of my patches is available for
> your reference at http://necago.ic.cz/prj/scc/
  

We support this sort of thing in Mozilla's Treehydra gcc plugin. See
http://developer.mozilla.org/en/docs/Treehydra

In addition to cfg you can get much other stuff such as types.

Taras


Re: gcc-in-cxx branch created

2008-07-02 Thread Hendrik Boom
On Wed, 25 Jun 2008 20:11:56 -0700, Ian Lance Taylor wrote:

> Ivan Levashew <[EMAIL PROTECTED]> writes:
> 
>>> Your comment makes little sense in context.  Nobody could claim that
>>> the existing gengtype code is simple.  Not many people understand how
>>> it works at all.  In order to support STL containers holding GC
>>> objects, it will need to be modified.
>>
>> I'm sure there is a library of GC-managed components in C++.
> 
> I'm sure there is too.  In gcc we use the same data structures to
> support both GC and PCH.  Switching to a set of C++ objects is likely to
> be a complex and ultimately unrewarding task.
> 
> 
>>> I don't know what you mean by your reference to the Cyclone variant of
>>> C, unless you are trying to say something about gcc's use of garbage
>>> collection.
>>>
>>>
>> Cyclone has many options for memory management. I don't know for sure
>> if there is a need for GC in GCC at all.
> 
> I would prefer it if gcc didn't use GC, but it does, and undoing that
> decision will be a long hard task which may never get done.
> 
>> Cyclone has a roots not only in C, but also ML. Some techniques like
>> pattern matching, aggregates, variadic arrays, tuples looks to be more
>> appropriate here than their C++'s metaprogrammed template analogues.
> 
> I guess I don't know Cyclone.  If you are suggesting that we use Cyclone
> instead of C++, I think that is a non-starter.  We need to use a
> well-known widely-supported language, and it must be a language which
> gcc itself supports.
> 
> Ian

There are a number of languages that would probably be better suited to
programming gcc than C or C++, on technical grounds alone.  Modula-3
comes to mind.  Cyclone certainly looks like a possibility, and has the
advantage that it would probably be less of a shock to the existing code
base.  But if it is a requirement to use a language that everyone
already knows, we will forever be doomed to C and its insecure
brethren.

-- hendrik






Re: Debug build

2008-07-02 Thread John Freeman

Ian Lance Taylor wrote:
> John Freeman <[EMAIL PROTECTED]> writes:
>
>> This is something I look into periodically, and each time I find a
>> solution that's slightly better, but not what I want.  I've looked at
>> the wiki (http://gcc.gnu.org/wiki/DebuggingGCC) many times, so no need
>> to refer me there.  I am trying to debug the C++ front-end, and I took
>> the wiki's recommendation for building a debuggable compiler
>> $ make CFLAGS="-g3 -O0" all-stage1
>> but it did not seem to build the C++ front-end, even though I
>> configured with --enable-languages=c,c++.  There is no
>> $build-dir/gcc/cc1plus and $build-dir/gcc/cp is empty.  Also, the
>> command
>> $ make install
>> failed afterward with
>> /bin/sh: line 1: cd: ./fixincludes: No such file or directory
>>
>> Any help?
>
> Use --enable-stage1-languages=c,c++ .

I did this and it did build the C++ front-end at stage 1.  However, it
still failed to install.

> You need to configure with --disable-bootstrap if you want the C++
> frontend built in "stage1".  Also try make all-gcc instead (or just make
> to also build the runtime).
>
> Richard.


I think this method is different in ways that will increase the build 
time.  Can anyone confirm?


I feel that with either of these suggestions, the documentation on the 
wiki needs to be changed.  It says


"To build a debuggable compiler, configure the compiler normally and 
then ..."


Also, it says to add
CFLAGS="-g3 -O0"
to the make command.  I've done this, and although I haven't tried 
debugging the compiler, I've watched the make output, and it never 
compiles anything with these CFLAGS.  Is it pointless to include it in 
the make command?  I feel like the DebuggingGCC wiki is useless except 
for the scripts (debug, debugx) that it links.


Opening up this topic a little more, is there anyone out there who 
routinely works on GCC that would like to share their workflow for 
building (as little as possible that encompasses changes), testing, and 
debugging?


- John


Re: Debug build

2008-07-02 Thread Ian Lance Taylor
John Freeman <[EMAIL PROTECTED]> writes:

>> You need to configure with --disable-bootstrap if you want the C++
>> frontend built in "stage1".  Also try make all-gcc instead (or just make
>> to also build the runtime).
>>
>
> I think this method is different in ways that will increase the build
> time.  Can anyone confirm?

It's the other way around; configuring with --disable-bootstrap
decreases build time.

> Opening up this topic a little more, is there anyone out there who
> routinely works on GCC that would like to share their workflow for
> building (as little as possible that encompasses changes), testing,
> and debugging?

I do most work in an object directory configured with
--disable-bootstrap.  When I want a debug build I do "make CFLAGS=-g
all-gcc".  For testing I use a different object directory configured
without --disable-bootstrap.

Ian
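
A rough sketch of the two-directory workflow described above, not a
transcript of anyone's actual setup.  The paths in SRC, DEV_OBJ and
TEST_OBJ are hypothetical, the DRY_RUN switch is only an illustration
device, and "make -k check" (the usual GCC testsuite target) is an
assumption, since the message does not say which test command is used.

```shell
#!/bin/sh
# Sketch only: SRC, DEV_OBJ and TEST_OBJ are hypothetical locations.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
SRC=${SRC:-$HOME/src/gcc}
DEV_OBJ=${DEV_OBJ:-$HOME/obj/gcc-dev}
TEST_OBJ=${TEST_OBJ:-$HOME/obj/gcc-test}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Development tree: configured with --disable-bootstrap, so there is a
# single stage and the C++ front end is built as part of it.
dev_build() {
    run mkdir -p "$DEV_OBJ" &&
    run cd "$DEV_OBJ" &&
    run "$SRC/configure" --disable-bootstrap --enable-languages=c,c++ &&
    run make CFLAGS=-g all-gcc
}

# Testing tree: a separate object directory with the default
# (bootstrap) configuration.  The check target is an assumption.
test_build() {
    run mkdir -p "$TEST_OBJ" &&
    run cd "$TEST_OBJ" &&
    run "$SRC/configure" --enable-languages=c,c++ &&
    run make &&
    run make -k check
}

dev_build
test_build
```

Running it as-is only lists the commands; setting DRY_RUN=0 would
execute them.  Keeping the two object directories separate means the
debug build never disturbs the bootstrapped tree used for testing.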


Re: RFC: Adding non-PIC executable support to MIPS

2008-07-02 Thread Richard Sandiford
Thanks to everyone for their kind messages.  I won't drag this out
for non-MIPS folk by replying publicly to each one.

Daniel Jacobowitz <[EMAIL PROTECTED]> writes:
> the GOT cleanups in particular look very useful.

Thanks.  To be clear: the withdrawal was simply for the patches in this
message.  Although the original motivation for the GOT cleanups was to
reduce the amount of wasted space in mostly-non-PIC executables,
they're really a separate change in their own right.  My hope was that,
even without the non-PIC stuff, the new code might be more maintainable
than what we have now.

> We could have done more to keep everyone informed of our progress; I
> could write an essay on why we didn't, but I'd rather not.  We're
> talking internally about how to avoid this unfortunate coincidence in
> the future.  Anyway, there's plenty of blame to go around.

FWIW, I don't blame MTI or CS at all for this.  Duplicated effort is
part of the risk one runs with the model that both you (CS) and I were
following.  (And for the record, I say "I" because the fault was mine
rather than Specifix's.)

When I was doing the work, I was expecting to use the patches as the
basis for a discussion on this list.  And I honestly expected to have to
change some of the details as a result.  E.g. I wasn't sure what the
reaction would be to requiring MIPS II or above.  So it's no surprise
that my version as posted is not going to be used.

And when I learnt about your alternative implementation, I was expecting
some of that implementation to be chosen instead.  The difficulty was
simply that, as you rightly said, MTI are the authority here.  That made
any discussion on this list moot.

That was just an attempt to clarify things rather than force you
into writing an essay ;)

Anyway, enough of that.  Back to technical stuff.  Would it work if we
had stubs like this:

    lui     t7,%hi(.got.plt entry)
    lw      t9,%lo(.got.plt entry)(t7)
    addiu   t8,t7,%lo(.got.plt entry)
    jr      t9
    ...
    lui     t7,%hi(.got.plt entry + 4)  [next entry]

and a header like this:

    lui     gp,%hi(.got.plt)
    lw      t9,%lo(.got.plt)(gp)
    addiu   gp,gp,%lo(.got.plt)
    subu    t8,t8,gp
    move    t7,ra
    srl     t8,t8,2
    jalr    t9
    subu    t8,t8,2

(Key for my benefit, 'cos I can only think in terms of numerical
registers:

t7 = $15
t8 = $24
t9 = $25)

The size of the header and first 0x1 stubs would be the same.
I think it would also preserve the resolver interface while removing
the need for the extra-large .plts.  The only incompatibility I can
see would be that objdump on older executables would not get the
[EMAIL PROTECTED] symbols right for large indices.

OTOH, perhaps you could argue that the extra complication of the
two PLT entries doesn't count for much given that the code is
already written.  It's just an idea.

Richard
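
The arithmetic in the stub and header above can be modelled with a
small sketch.  It is only an illustration under stated assumptions:
32-bit addresses, 4-byte .got.plt entries, and two reserved words at
the start of .got.plt (inferred from the final "subu t8,t8,2").  The
GOTPLT address and all function names are hypothetical.

```shell
#!/bin/sh
# Sketch: model the address arithmetic of the proposed stub and header.
# Assumptions (not stated in the message): 32-bit addresses, 4-byte
# .got.plt entries, first two .got.plt words reserved.

GOTPLT=${GOTPLT:-$((0x10008000))}   # hypothetical address of .got.plt

# %hi(addr): upper 16 bits, adjusted because %lo is sign-extended.
hi16() { echo $(( (($1 + 0x8000) >> 16) & 0xffff )); }

# %lo(addr): lower 16 bits interpreted as a signed 16-bit value.
lo16() {
    lo=$(( $1 & 0xffff ))
    if [ "$lo" -ge 32768 ]; then lo=$(( lo - 65536 )); fi
    echo "$lo"
}

# Stub: lui t7,%hi(entry); addiu t8,t7,%lo(entry)
# leaves the address of the .got.plt entry in t8.
stub_t8() {
    t7=$(( $(hi16 "$1") << 16 ))
    echo $(( (t7 + $(lo16 "$1")) & 0xffffffff ))
}

# Header: subu t8,t8,gp; srl t8,t8,2; subu t8,t8,2 (jalr delay slot)
# turns that address back into a zero-based index.
header_index() { echo $(( (($1 - GOTPLT) >> 2) - 2 )); }

# Address of the .got.plt entry for index n under the assumptions above.
entry_addr() { echo $(( GOTPLT + 4 * ($1 + 2) )); }

for n in 0 1 65535 70000; do
    t8=$(stub_t8 "$(entry_addr "$n")")
    echo "stub $n -> index $(header_index "$t8")"
done
```

In this model the index is recovered from an address difference rather
than from an immediate operand, so the same stub shape is used for
small and large indices alike.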


Re: RFC: Adding non-PIC executable support to MIPS

2008-07-02 Thread Daniel Jacobowitz
On Wed, Jul 02, 2008 at 08:55:54PM +0100, Richard Sandiford wrote:
> The size of the header and first 0x1 stubs would be the same.
> I think it would also preserve the resolver interface while removing
> the need for the extra-large .plts.  The only incompatibility I can
> see would be that objdump on older executables would not get the
> [EMAIL PROTECTED] symbols right for large indices.
> 
> OTOH, perhaps you could argue that the extra complication of the
> two PLT entries doesn't count for much given that the code is
> already written.  It's just an idea.

Your version looks fine to me, it's ABI-preserving, the PLT entries
still work for MIPS I and still have the same runtime cost when not
resolving.  I like it - thanks!

I'm not worried about making people upgrade objdump, either.

-- 
Daniel Jacobowitz
CodeSourcery


gcc-4.2-20080702 is now available

2008-07-02 Thread gccadmin
Snapshot gcc-4.2-20080702 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20080702/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch 
revision 137396

You'll find:

gcc-4.2-20080702.tar.bz2            Complete GCC (includes all of below)

gcc-core-4.2-20080702.tar.bz2       C front end and core compiler

gcc-ada-4.2-20080702.tar.bz2        Ada front end and runtime

gcc-fortran-4.2-20080702.tar.bz2    Fortran front end and runtime

gcc-g++-4.2-20080702.tar.bz2        C++ front end and runtime

gcc-java-4.2-20080702.tar.bz2       Java front end and runtime

gcc-objc-4.2-20080702.tar.bz2       Objective-C front end and runtime

gcc-testsuite-4.2-20080702.tar.bz2  The GCC testsuite

Diffs from 4.2-20080625 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: gcc-in-cxx branch created

2008-07-02 Thread Daniel Berlin
On Wed, Jul 2, 2008 at 2:30 PM, Hendrik Boom <[EMAIL PROTECTED]> wrote:
> On Wed, 25 Jun 2008 20:11:56 -0700, Ian Lance Taylor wrote:
>
>> Ivan Levashew <[EMAIL PROTECTED]> writes:
>>
>>>> Your comment makes little sense in context.  Nobody could claim that
>>>> the existing gengtype code is simple.  Not many people understand how
>>>> it works at all.  In order to support STL containers holding GC
>>>> objects, it will need to be modified.
>>>
>>> I'm sure there is a library of GC-managed components in C++.
>>
>> I'm sure there is too.  In gcc we use the same data structures to
>> support both GC and PCH.  Switching to a set of C++ objects is likely to
>> be a complex and ultimately unrewarding task.
>>
>>
>>>> I don't know what you mean by your reference to the Cyclone variant of
>>>> C, unless you are trying to say something about gcc's use of garbage
>>>> collection.
>>>>
>>>>
>>> Cyclone has many options for memory management. I don't know for sure
>>> if there is a need for GC in GCC at all.
>>
>> I would prefer it if gcc didn't use GC, but it does, and undoing that
>> decision will be a long hard task which may never get done.
>>
>>> Cyclone has roots not only in C, but also in ML. Some techniques like
>>> pattern matching, aggregates, variadic arrays, tuples look to be more
>>> appropriate here than their C++ metaprogrammed template analogues.
>>
>> I guess I don't know Cyclone.  If you are suggesting that we use Cyclone
>> instead of C++, I think that is a non-starter.  We need to use a
>> well-known widely-supported language, and it must be a language which
>> gcc itself supports.
>>
>> Ian
>
> There are a number of languages that would probably be better-suited to
> programming gcc than C or C++, on technical grounds alone.


That's great.
We have more than just technical concerns.

>   But if it is a requirement to use a language that everyone
> already knows, we will forever be doomed to C and its insecure
> brethren.
>
This has never been listed as a requirement.
It is certainly a consideration.
The main requirement for communities like GCC for something like
changing languages is consensus or at least a large set of active
developers willing to do something and the rest of them willing to not
commit suicide if it happens.
There are secondary requirements like "not stalling for years while
moving languages", "not losing serious performance", etc.

You are free to propose whatever language you like. It is unlikely you
will get support from any of the active contributors simply by saying we
should use X because Y.
The best way to show us the advantages of using some other languages
is to convert some part of GCC to use it and show how much better it
is.

This is a big job, of course.  Then again, tree-ssa was started by
Diego as a side project, and gained supporters and helpers as others
decided to spend their time on it.
You may find the same thing, in which case you may find it is not hard
to convince people to move to some other language.
You may find nobody agrees with you, even after seeing parts of gcc in
this new language.
I can guarantee you that you will find nobody agrees with you if you sit
on the sidelines and do nothing but complain.

--Dan