date:20050503

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Mike Stump

On Apr 29, 2005, at 6:03 PM, Joe Buck wrote:
> I've seen claims that Darwin's linker is much more efficient than the
> GNU linker, though I haven't confirmed this.

:-)  I have a vague recollection this is true (32-bit only).
If someone wants to post linux numbers and the command, I'll redo on  
my box.

make all && rm */libjava/? && time make all-target-libjava
type of thing.  Better would be someone that has both Yellow dog and  
Darwin on the same hardware, but...  don't know if we'll find that  
person.

[ pause ]  Ok, here is what I get for mainline 20050428:
$ rm powerpc-apple-darwin8.0.0/libjava/.libs/libgcj.la
$ rm powerpc-apple-darwin8.0.0/libjava/libgcj.la
$ time make -j2 all-target-libjava
real    3m44.677s
user    1m15.000s
sys     1m19.868s
G5 DP 2.5 GHz 512MB

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Mike Stump

On Apr 29, 2005, at 7:41 AM, Daniel Jacobowitz wrote:
> On Fri, Apr 29, 2005 at 12:49:37PM +0200, Lars Segerlund wrote:
>> If we do a reasonable comparison of compile times against the Intel
>> compiler or the Portland Group or something similar we consistently
>> find that gcc is slower by a couple of times, 1x - 3x (this is only
>> my impression, not backed up by hard data, but should be in the
>> ballpark).
>
> Please don't add additional speculation to this already messy subject.
> Feel free to come back with data.

Finder_FE_v3 on 1GHz G4 PowerBook, 512MB RAM:
10.4 Tiger 8A428 using gcc4.0 (build results window closed), PCH,  
native target, debug on, no optimization, indexing off.
   full build: 574 seconds = 2.25 WarMarks

10.3.4 with Metrowerks CodeWarrior 9.0 with debug on, no optimization
   full build: 255 seconds
but please, don't let our data get in the way...  mind you, this is  
better than 36x slower, which is where it used to be, but, still  
horrible.  To get a feel for it, drive 30 on the freeway in the fast  
lane.  :-)

For source, try QT, we believe it is representative.

gcc-3_3-branch frozen

2005-05-03 Thread Gabriel Dos Reis


Consider gcc-3_3-branch as frozen. Release script is running.

Thanks,

-- Gaby

Re: PPC 64bit library status?

2005-05-03 Thread Andreas Schwab

Mike Stump <[EMAIL PROTECTED]> writes:

>> 2. libgfortran.h line 63 defines int8_t.
>
> Ick!  Sounds like the configure mechanism went haywire.

This can happen when you reconfigure.  See
 for a possible
patch.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

volatile semantics

2005-05-03 Thread Mike Stump

int avail;
int main() {
  while (*(volatile int *)&avail == 0)
    continue;
  return 0;
}

Ok, so, the question is, should gcc produce code that infinitely  
loops, or should it be obligated to actually fetch from memory?   
Hint, 3.3 fetched.

I get:
L6:
        b L6
on mainline and 4.0.
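
For comparison, here is a minimal sketch of the variant in which the object itself is declared volatile, which is the case the standard's wording covers directly. The names `ready_flag` and `wait_for_flag` and the spin bound are illustrative assumptions, not part of the test case above; the bound exists only so the loop terminates when nothing else ever sets the flag.

```c
/* Hypothetical flag: the object itself is volatile-qualified, so each
   read below is an access to a volatile object. */
static volatile int ready_flag;

/* Poll *flag at most max_spins times.  Returns the number of reads that
   saw zero before a non-zero value appeared, or -1 if none appeared. */
int wait_for_flag(volatile int *flag, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (*flag != 0)
            return i;
    }
    return -1;
}
```

With the qualifier on the declaration, the question of whether a cast alone is enough does not arise.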

Re: volatile semantics

2005-05-03 Thread Paolo Bonzini

> Ok, so, the question is, should gcc produce code that infinitely loops,
> or should it be obligated to actually fetch from memory?  Hint, 3.3
> fetched.

IANA(Language)L, but I think it should definitely fetch from memory.
Paolo

Re: volatile semantics

2005-05-03 Thread Giovanni Bajo

Mike Stump <[EMAIL PROTECTED]> wrote:

> int avail;
> int main() {
>while (*(volatile int *)&avail == 0)
>  continue;
>return 0;
> }
>
>
> Ok, so, the question is, should gcc produce code that infinitely
> loops, or should it be obligated to actually fetch from memory?
> Hint, 3.3 fetched.

I agree it should fetch. Did you try -fno-strict-aliasing? Open a bug report,
I'd say.

Giovanni Bajo

Re: big slowdown gcc 3.4.3 vs gcc 3.3.4 (64 bit)

2005-05-03 Thread Giovanni Bajo

Kenneth P. Massey wrote:

> The code below runs significantly slower when compiled in 64 bit with
> 3.4.3 than
> it does in 3.3.4, and both are significantly slower than a 32 bit
> compile.

Thanks for the report. Would you please open a bug report in Bugzilla?
-- 
Giovanni Bajo

Re: Backporting to 4_0 the latest friend bits

2005-05-03 Thread Michael Matz

Hi,

On Mon, 2 May 2005, Mark Mitchell wrote:

> At the same time, if the code in question doesn't mean what the person
> who wrote it wants it to mean (e.g., if it implicitly declares classes
> in the scope of the friendly class, rather than nominating other classes
> as friends), then that code should still be fixed.

No disagreement from me here.

> It's certainly in the long-term interest of KDE not to have spurious
> friend declarations around, and I'd expect that as a KDE distributor you
> would want to encourage them to use the syntax that means what they
> want, even in parallel to possibly fixing the compiler.

Yep.  /us fighting in many places ;-)


Ciao,
Michael.

PLEASE HELP!

2005-05-03 Thread R Lokesh babu

Hello,
My application (COMMERCIAL SOFTWARE) links to
libstdc++, By default Solaris OS
does not install the libstdc++ package
(libgcc-3.3-sol9-sparc-local.gz).
Can I re-distribute the lib-gcc package and install it
along with my application
installation. (Is it a violation of GPL?) :)
Or should my Installation validate and ask the user to
install lib-gcc himself 
and re-run the installation.? :(
Please clarify.
Thanks


Re: fold_indirect_ref bogous

2005-05-03 Thread Jeffrey A Law

On Wed, 2005-04-27 at 16:19 +0200, Richard Guenther wrote:
> fold_indirect_ref, called from the gimplifier happily converts
> 
>  const char *a;
> 
> ...
> 
>  *(char *)&a[x] = 0;
> 
> to
> 
>  a[x] = 0;
> 
> confusing alias1 and ICEing in verify_ssa:
> 
> /net/alwazn/home/rguenth/src/gcc/cvs/gcc-4.1/gcc/testsuite/gcc.c-torture/execute/20031215-1.c:11:
> error: Statement makes a memory store, but has no V_MAY_DEFS nor
> V_MUST_DEFS
> #   VUSE ;
> ao.ch[D.1242_5] = 0;
> /net/alwazn/home/rguenth/src/gcc/cvs/gcc-4.1/gcc/testsuite/gcc.c-torture/execute/20031215-1.c:11:
> internal compiler error: verify_ssa failed.
> 
> happens only for patched gcc where C frontend and fold happen to
> produce .02.original:
> 
> ;; Function test1 (test1)
> ;; enabled by -tree-original
> 
> 
> {
>   if (ao.ch[ao.l] != 0)
> {
>   *(char *) &ao.ch[(unsigned int) ao.l] = 0;
> }
> }
> 
> then, generic is already wrong:
> 
> test1 ()
> {
>   int D.1240;
>   char D.1241;
>   unsigned int D.1242;
> 
>   D.1240 = ao.l;
>   D.1241 = ao.ch[D.1240];
>   if (D.1241 != 0)
> {
>   D.1240 = ao.l;
>   D.1242 = (unsigned int) D.1240;
>   ao.ch[D.1242] = 0;
> }
> 
> (note the missing cast).
> 
> 
> something like the following patch fixes this.
Right.  Given that I'd seen similar problems with some unrelated
changes of mine, I went ahead and put your patch through the usual
bootstrap and regression tests on my i686 box.

Installed.




* tree-ssa-ccp.c (maybe_fold_stmt_indirect): Use STRIP_TYPE_NOPS
rather than STRIP_NOPS.

Index: tree-ssa-ccp.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-ccp.c,v
retrieving revision 2.68
diff -c -p -r2.68 tree-ssa-ccp.c
*** tree-ssa-ccp.c  3 May 2005 12:19:42 -   2.68
--- tree-ssa-ccp.c  3 May 2005 14:11:10 -
*** maybe_fold_stmt_indirect (tree expr, tre
*** 1585,1591 
   substitutions.  Fold that down to one.  Remove NON_LVALUE_EXPRs that
   are sometimes added.  */
base = fold (base);
!   STRIP_NOPS (base);
TREE_OPERAND (expr, 0) = base;
  
/* One possibility is that the address reduces to a string constant.  */
--- 1585,1591 
   substitutions.  Fold that down to one.  Remove NON_LVALUE_EXPRs that
   are sometimes added.  */
base = fold (base);
!   STRIP_TYPE_NOPS (base);
TREE_OPERAND (expr, 0) = base;
  
/* One possibility is that the address reduces to a string constant.  */
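
The construct that triggered the problem can be sketched in isolation as follows. This is not the testsuite file named above; `truncate_at` is a hypothetical helper that only shows a store through a pointer whose const qualifier has been cast away, which is the cast the folder must not drop.

```c
#include <stddef.h>
#include <string.h>

/* Store a NUL through a pointer whose const qualifier has been cast
   away.  This is only valid when the underlying object is writable. */
void truncate_at(const char *a, size_t x)
{
    *(char *)&a[x] = 0;
}
```

If the cast were folded away, the statement would become a store to a const-qualified lvalue, which is the mismatch the verifier complained about.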

Re: Q about Ada and value ranges in types

2005-05-03 Thread Diego Novillo

On Mon, May 02, 2005 at 09:46:59PM -0400, Richard Kenner wrote:

> You're not showing where this comes from, so it's hard to say.  However
> D.1480 is created by the gimplifier, not the Ada front end.  There could
> easily be a typing problem in the tree there (e.g., that of the subtraction),
> but I can't tell for sure.
> 
Yeah, I didn't show all of it, sorry.  My patch to address this
problem includes a more detailed description
(http://gcc.gnu.org/ml/gcc-patches/2005-05/msg00127.html).

Florian Weimer suggested that instead of marking the range as
varying, we could check the super-type to see if it has a wider
range.  That is true in this case; the parent type is
types__Tname_idB which has range [-2147483648, 2147483647].  But
I'm not sure if that would be true in general.

> If the Ada language allows that kind of runtime check, then my
> fix to VRP will be different. 
> 
> I don't see it as a language issue: I'd argue that the tree in statement 2
> is invalid given the typing.  That should be true for any language.
> 
Dunno.  All the operands in the snippet I showed are of the exact
same type (types__name_id___XDLU_3__3).  I'm not
really sure where this type is coming from, but it's relatively
easy to reproduce.

Configure a compiler for target i386-pc-linux-gnu (or any other
i386 variant, not sure if it occurs elsewhere) and compile
ada/sem_intr.adb with:

$ ./xgcc -B./ -c -g -O2 -gnatpg -gnata -I- -I. -Iada -I/gcc/ada 
/gcc/ada/sem_intr.adb -o ada/sem_intr.o -v -save-temps

Launch gdb and set a bkpt at tree-vrp.c:552 (extract_range_from_assert).
You should get to this ASSERT_EXPR:

ASSERT_EXPR 

which is in the following context:

-
;; basic block 44, loop depth 0, count 0
;; prev block 43, next block 84
;; pred:   43 (true,exec)
;; succ:   84 (true,exec) 45 (false,exec)
:;
D.1478_28 = sinfo__etype (e_5);
nam_30 = sinfo__chars (e_5);
D.1480_32 = nam_30 - 30361;
if (D.1480_32 <= 1) goto ; else goto ;

;; basic block 84, loop depth 0, count 0
;; prev block 44, next block 45
;; pred:   44 (true,exec)
;; succ:   50 [100.0%]  (fallthru)
:;
D.1480_94 = ASSERT_EXPR ;
goto  ();
-

So, after calling sinfo__chars() and subtracting 30361, the
FE is emitting that range check.  AFAICT, the call to
sinfo__chars(e_5) comes from ada/sem_intr.adb:148

 Nam : constant Name_Id   := Chars (E);

and 'if (D.1480_32 <= 1)' is at line 155:
 
 if Nam = Name_Op_Add


Thanks.  Diego.

Re: PLEASE HELP!

2005-05-03 Thread Ian Lance Taylor

R Lokesh babu <[EMAIL PROTECTED]> writes:

> My application (COMMERCIAL SOFTWARE) links to
> libstdc++, By default Solaris OS
> does not install the libstdc++ package
> (libgcc-3.3-sol9-sparc-local.gz).
> Can I re-distribute the lib-gcc package and install it
> along with my application
> installation. (Is it a violation of GPL?) :)
> Or should my Installation validate and ask the user to
> install lib-gcc himself 
> and re-run the installation.? :(

There is plenty of COMMERCIAL SOFTWARE which is distributed under the
GPL.  I assume that you mean that your application is proprietary, in
that you are distributing a binary without distributing the source
code.

First, I'll note that libstdc++-v3 is under a GPL+exception license.
Building objects using libstdc++-v3 header files, and linking them
into an executable with the libstdc++-v3 library, does not by itself
cause the binary to be covered by the GPL.

Now, to answer your question, you may distribute libstdc++-v3 itself.
However, that distribution is covered by the GPL, and if you
distribute it in binary form, you must also distribute the source
code, or otherwise make it available as described in the GPL.

Finally, I'll note that while I believe that the above is correct,
relying on mailing lists for legal advice is foolish.

Ian

Re: Code generation clarification (Submodels)

2005-05-03 Thread Andrew Walrond

On Monday 02 May 2005 13:01, Andrew Walrond wrote:
> Simple question, but I'm not entirely clear from reading the documentation

Anyone? Pretty please? (a ghastly pleading phrase, which is used by some 
really strange people in an attempt to have another do something they are 
reluctant to undertake)

Andrew Walrond

RE: volatile semantics

2005-05-03 Thread Dave Korn

Original Message
>From: Mike Stump
>Sent: 03 May 2005 09:42

> int avail;
> int main() {
>while (*(volatile int *)&avail == 0)
>  continue;
>return 0;
> }
> 
> 
> Ok, so, the question is, should gcc produce code that infinitely
> loops, or should it be obligated to actually fetch from memory?
> Hint, 3.3 fetched.
> 
> 
> I get:
> 
> L6:
>  b L6
> 
> on mainline and 4.0.

  Any difference if you change the function name from 'main' to something
else?

  And what happens if you precede the while line with

extern void foo (int *);

foo (&avail);

or indeed what happens if you replace the definition of avail with an extern
declaration?

extern int avail;

(you'd need to link with another module that actually instantiates it of
course).

  Theoretically gcc could know that even though avail is volatile, its
address has never been taken (since here we are right at the beginning of
main), and since it's a compiler-allocated variable rather than referencing
some external location, there is no other way that it could be written into
or read from except through a pointer, and thus it really really really
isn't going to change.  In terms of language-lawyerlyness, you could argue
that the fact that the address is never taken means that any volatile
changes would not come  under the "externally visible" criterion...

  I agree that it should probably be prevented from losing the load, even if
it is a deliberate optimisation rather than a bug, but as far as I can see,
the only time this would crop up would be when you were dealing with flag
variables of the sort that are only meant to be set by running the
application under a debugger and manually storing values into them.
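
The first experiment suggested above can be sketched like this. As an assumption made to keep the example self-contained, `foo` is defined in the same file and stores the pointer in a hypothetical volatile sink; in the real experiment it would be an external function the compiler cannot see into.

```c
int avail;                      /* same object as in the test case */

/* Hypothetical sink: storing the pointer lets the address escape. */
static int *volatile escaped_addr;

void foo(int *p)
{
    escaped_addr = p;
}

/* One iteration of the polling test, preceded by the escaping call. */
int poll_once(void)
{
    foo(&avail);
    return *(volatile int *)&avail;
}
```

Once the address has escaped, the compiler can no longer assume that nothing else writes to `avail`.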


cheers,
  DaveK
-- 
Can't think of a witty .sigline today

Re: Code generation clarification (Submodels)

2005-05-03 Thread Ian Lance Taylor

Andrew Walrond <[EMAIL PROTECTED]> writes:

> If I have a gcc configured for i686-* target system and I use that compiler 
> to 
> build a package without any -m submodel options , is the generated code
>  1) only suitable for i686 and better, or
>  2) tuned for i686 and better but still OK for i386

The latter.  Configuring for i686-* is equivalent to configuring for
i386-* with the option --with-cpu=i686.  This causes the compiler to
be built as though you specified -mtune=i686 for each compile which
does not explicitly use -mtune, -march, or -mcpu.

> Whatever the answer, is it a generic rule that holds true for submodels of 
> all 
> architectures?

It should always be true, although of course there may be bugs.

> What about 32bit code generated with x86_64 targeted gcc (with -m32)?

If you configure for x86-64, I believe the default architecture will
be MMX, SSE, and SSE2.  I believe that will apply even if you use
-m32.  But I haven't really tested it.
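
One way to see which mode a given compiler is in is to look at its predefined macros. The sketch below assumes that GCC's i386 back end predefines `__i686__` when the architecture is i686 and `__tune_i686__` when only tuning for i686; that is worth confirming against the output of `gcc -dM -E` on the compiler in question. The function name `i686_mode` is hypothetical.

```c
/* Report which i686-related predefined macro the compiler set. */
const char *i686_mode(void)
{
#if defined(__i686__)
    return "arch";      /* code may use i686-only instructions */
#elif defined(__tune_i686__)
    return "tune";      /* scheduled for i686, still runs on older CPUs */
#else
    return "none";
#endif
}
```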

Ian

Re: volatile semantics

2005-05-03 Thread Nathan Sidwell

Mike Stump wrote:
> int avail;
> int main() {
>   while (*(volatile int *)&avail == 0)
>     continue;
>   return 0;
> }
> Ok, so, the question is, should gcc produce code that infinitely loops,
> or should it be obligated to actually fetch from memory?  Hint, 3.3
> fetched.

I believe the compiler is so licensed. [5.1.2.3/2] talks about accessing
a volatile object.  If the compiler can determine the actual object
being accessed through a series of pointer and volatile cast conversions,
then I see nothing in the std saying it must behave as-if the object
were volatile when it is not.

This, of course, might not be useful to users :)
nathan
--
Nathan Sidwell::   http://www.codesourcery.com   :: CodeSourcery LLC
[EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk

No link to benchmarks page ?

2005-05-03 Thread hartmut . schirmer

Hi,

is there any link to gcc.gnu.org/benchmarks on the web pages ?

Hartmut


Re: Mainline Bootstrap failure on x86-64-linux-gnu

2005-05-03 Thread Andreas Jaeger

Diego Novillo <[EMAIL PROTECTED]> writes:

> On Sun, Apr 24, 2005 at 07:35:43PM +0200, Andreas Jaeger wrote:
>
>> I configure with:
>> 
>> /cvs/gcc/configure --prefix=/opt/gcc/4.1-devel 
>> --enable-checking=misc,tree,gc,rtl,rtlflag,assert --enable-threads=posix 
>> --enable-clocale=gnu --enable-__cxa_atexit --enable-shared 
>> --enable-languages=c,c++,treelang,java,f95,objc  --with-system-zlib 
>> x86_64-suse-linux-gnu
>> 
>> and run make bootstrap.  This is on a SUSE Linux 9.3 with GCC 3.3.5
>> (hammer-branch),
>> 
> I can't reproduce this failure anymore.  Both x86 and x86_64 get
> past this point now.  Andreas, can you still reproduce this?

Works now for me as well,

Andreas
-- 
 Andreas Jaeger, [EMAIL PROTECTED], http://www.suse.de/~aj
  SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126



problems with -fdump-tree options (gcc 4.0.0)

2005-05-03 Thread Nico Moser

Hi,
I tried the following:
gcc -fdump-tree-all-all -c -o bla.o bla.c
and I got these:
bla.c.t02.original
bla.c.t03.generic
bla.c.t06.vcg
bla.c.t08.gimple
bla.c.t09.useless
bla.c.t11.lower
bla.c.t12.eh
bla.c.t13.cfg
bla.c.t14.oplower
I have two questions:
Where is the bla.c.t**.optimized file?
What is the bla.c.t03.generic file? The same as the
bla.c.t02.original? I can't find it in the online documentation.

Thanks a lot.
Nico

Re: problems with -fdump-tree options (gcc 4.0.0)

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 05:57:12PM +0200, Nico Moser wrote:

> Where is the bla.c.t**.optimized file?
>
You didn't use -O.  None of the optimizers run without it.

> What is the bla.c.t03.generic file? The same as the  
>
That's the IL that all FEs generate while parsing.  In some cases
.original and .generic are the same.

> bla.c.t02.original? I can't find it in the online documentation.
>
Really?  Thanks.  Could you file a documentation bug?  They
should probably be mentioned.
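
As a minimal sketch, a file like the following gives the optimizers something to do. Its contents are hypothetical, since the original bla.c was not posted. Compiling it with an optimization level, for example `gcc -O2 -fdump-tree-all -c bla.c`, should produce the optimizer dumps as well.

```c
/* bla.c (hypothetical contents): a loop the optimizers can simplify. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```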


Diego.

Re: volatile semantics

2005-05-03 Thread Dale Johannesen

On May 3, 2005, at 7:41 AM, Nathan Sidwell wrote:
> Mike Stump wrote:
>> int avail;
>> int main() {
>>   while (*(volatile int *)&avail == 0)
>>     continue;
>>   return 0;
>> }
>> Ok, so, the question is, should gcc produce code that infinitely
>> loops, or should it be obligated to actually fetch from memory?
>> Hint, 3.3 fetched.
>
> I believe the compiler is so licensed. [5.1.2.3/2] talks about accessing
> a volatile object.  If the compiler can determine the actual object
> being accessed through a series of pointer and volatile cast conversions,
> then I see nothing in the std saying it must behave as-if the object
> were volatile when it is not.

This is correct; the standard consistently talks about the type of the
object, not the type of the lvalue, when describing volatile.

However, as a QOI issue, I believe the compiler should treat the 
reference as
volatile if either the object or the lvalue is volatile.  That is 
obviously the
user's intent.
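
The idiom that depends on this quality-of-implementation behaviour is usually wrapped in a macro. The sketch below uses the hypothetical names `VOLATILE_READ_INT` and `plain_counter`; it reads a non-volatile object through a volatile-qualified lvalue, which is the case where the standard's guarantee is in dispute.

```c
/* Hypothetical helper: read an int through a volatile-qualified lvalue
   even though the object itself is not declared volatile. */
#define VOLATILE_READ_INT(obj) (*(volatile int *)&(obj))

int plain_counter;      /* note: not volatile */

int read_counter_twice(void)
{
    int first  = VOLATILE_READ_INT(plain_counter);
    int second = VOLATILE_READ_INT(plain_counter);
    return first + second;
}
```

Under the proposed rule, both reads would be performed separately instead of being merged into one.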

Re: Code generation clarification (Submodels)

2005-05-03 Thread Andrew Walrond

Thanks Ian; much appreciated.

Andrew Walrond

Re: volatile semantics

2005-05-03 Thread Nathan Sidwell

Dale Johannesen wrote:
> However, as a QOI issue, I believe the compiler should treat the
> reference as volatile if either the object or the lvalue is volatile.
> That is obviously the user's intent.

I'm not disagreeing with you, but I wonder at gcc's ability to make
good on such a promise.  A cast introducing a volatile qualifier
will be a NOP_EXPR, and gcc tends to strip those at every opportunity.

Also, I wonder about the following example

int const avail = 
int main() {
  while (*(int *)&avail == Foo ())
    do_something();
  return 0;
}

Seeing through the const-stripping cast is a useful optimization. We'd
have to have one rule for adding volatile and a different one for removing
const.

A further pathological case would be,

int main() {
  while (*(int *)(volatile int *)&avail)
    do_something ();
  return 0;
}

What should this do, treat the volatile qualifier as sticky?
nathan
--
Nathan Sidwell::   http://www.codesourcery.com   :: CodeSourcery LLC
[EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk

Re: volatile semantics

2005-05-03 Thread Paul Koning

> "Nathan" == Nathan Sidwell <[EMAIL PROTECTED]> writes:

 Nathan> Dale Johannesen wrote:
 >> However, as a QOI issue, I believe the compiler should treat the
 >> reference as volatile if either the object or the lvalue is
 >> volatile.  That is obviously the user's intent.

 Nathan> I'm not disagreeing with you, but I wonder at gcc's ability
 Nathan> to make good on such a promise.  A cast introducing a
 Nathan> volatile qualifier will be a NOP_EXPR, and gcc tends to strip
 Nathan> those at every opportunity.

Is that true only for casts that add the volatile qualifier but
otherwise do nothing?  A quick test suggests yes.

This change bothers me a lot.  It seems likely that this will break
existing code possibly in subtle ways.  At least it doesn't do this
when the cast is from a constant integer into a volatile int * --
which is common syntax for CSR references.
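
For reference, the CSR idiom mentioned here usually looks something like the sketch below. The helper names `csr_read` and `csr_write` are hypothetical, and a real driver would pass a fixed device address taken from the hardware manual.

```c
#include <stdint.h>

/* Read a device register at a numeric address. */
static inline int csr_read(uintptr_t addr)
{
    return *(volatile int *)addr;
}

/* Write a device register at a numeric address. */
static inline void csr_write(uintptr_t addr, int value)
{
    *(volatile int *)addr = value;
}
```

Here there is no declared object for the compiler to see through, only an address and a volatile-qualified lvalue.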

Still, never mind what the C spec appears to say, optimizing away the
cast cannot possibly be what the user intended.  If the standard implies
that this is right, I'd argue that's a standards bug.  In any case,
such a transformation should at least generate a warning.  Preferably
it should revert to the old semantics.

 paul

Re: volatile semantics

2005-05-03 Thread Dale Johannesen

On May 3, 2005, at 11:03 AM, Nathan Sidwell wrote:
> Dale Johannesen wrote:
>> However, as a QOI issue, I believe the compiler should treat the
>> reference as volatile if either the object or the lvalue is volatile.
>> That is obviously the user's intent.
>
> I'm not disagreeing with you, but I wonder at gcc's ability to make
> good on such a promise.  A cast introducing a volatile qualifier
> will be a NOP_EXPR, and gcc tends to strip those at every opportunity.

You may well be right, I haven't tried to implement it (and am not
planning to).

> Also, I wonder about the following example
>
> int const avail = 
> int main() {
>   while (*(int *)&avail == Foo ())
>     do_something();
>   return 0;
> }
>
> Seeing through the const-stripping cast is a useful optimization.

It is?  Why would somebody write that?

> A further pathological case would be,
>
> int main() {
>   while (*(int *)(volatile int *)&avail)
>     do_something ();
>   return 0;
> }
>
> What should this do, treat the volatile qualifier as sticky?

IMO, no, but surely we don't have to worry about this one.  Either way
is standard conformant and the user's intent is far from clear, so
whatever we do should be OK.

Re: volatile semantics

2005-05-03 Thread Dale Johannesen

On May 3, 2005, at 11:21 AM, Paul Koning wrote:
> This change bothers me a lot.  It seems likely that this will break
> existing code possibly in subtle ways.

It did, that is why Mike is asking about it. :)

Re: building gcc 4.0.0 on Solaris

2005-05-03 Thread James E Wilson

Dimitri Papadopoulos-Orfanos wrote:
> As far as I can understand, it's not possible to build gcc 4.0.0 and gcc
> 3.4.* using GNU binutils with current release 2.15 of GNU binutils. One
> has to use the CVS sources or at least one file.

FYI binutils-2.16 has just been released.  You might want to try that.
Not clear if there will be a binutils-2.15.1 release, but if you check 
out the binutils-2_15-release tree from sourceware.org cvs, then it 
includes the patch that you need.  It was added to the release branch 
after the binutils 2.15 release was made.

Bug ID: 4910101
Synopsis: fbe needs a way to reference section labels
Category: compiler
Subcategory: assembler-x86
Apparently this problem only shows up for x86 when using Sun tools, but 
when using GNU tools, it also shows up for sparc.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

Re: volatile semantics

2005-05-03 Thread Nathan Sidwell

Dale Johannesen wrote:
> On May 3, 2005, at 11:03 AM, Nathan Sidwell wrote:
>
>> Seeing through the const-stripping cast is a useful optimization.
>
> It is?  Why would somebody write that?

perhaps a function, which returned a non-const reference that
happened to be bound to a constant, has been inlined.

> IMO, no, but surely we don't have to worry about this one.  Either way
> is standard conformant and the user's intent is far from clear, so whatever
> we do should be OK.

If we guarantee one to work and not the other, we need to have a clear
specification of how they differ.  What if intermediate variables -- either
explicit in the program, or implicitly during the optimization -- get
introduced?

My guess is that the wording of the standard might be the best that
could be achieved in this regard.  It would be nice to have some clear
wording indicating that Mike's example will work, but some other, possibly
closely related, example will not.
nathan
--
Nathan Sidwell::   http://www.codesourcery.com   :: CodeSourcery LLC
[EMAIL PROTECTED]:: http://www.planetfall.pwp.blueyonder.co.uk

parse bug in 4.0.0?

2005-05-03 Thread Paul Koning

This test program:

struct bar;

template  struct bar *foo (T *p)
{
return p->t;
}

produces an error in 4.0.0:

test.cc:3: error: ‘bar’ is not a template type

Without the keyword "struct" it compiles fine.  Earlier versions
(3.3.2, and I'm pretty sure 3.4.1 as well) don't complain.

paul

Re: volatile semantics

2005-05-03 Thread Dale Johannesen

On May 3, 2005, at 11:52 AM, Nathan Sidwell wrote:
> Dale Johannesen wrote:
>> On May 3, 2005, at 11:03 AM, Nathan Sidwell wrote:
>>
>>> Seeing through the const-stripping cast is a useful optimization.
>>
>> It is?  Why would somebody write that?
>
> perhaps a function, which returned a non-const reference that
> happened to be bound to a constant, has been inlined.

OK, I agree.

>> IMO, no, but surely we don't have to worry about this one.  Either way
>> is standard conformant and the user's intent is far from clear, so
>> whatever we do should be OK.
>
> If we guarantee one to work and not the other, we need to have a clear
> specification of how they differ.  What if intermediate variables --
> either explicit in the program, or implicitly during the optimization --
> get introduced?
>
> My guess is that the wording of the standard might be the best that
> could be achieved in this regard.  It would be nice to have some clear
> wording indicating that Mike's example will work, but some other,
> possibly closely related, example will not.

It's not that bad; the type of an lvalue is already well defined (it is
"int" in your last example, and "volatile int" in Mike's).  We just take
this type into account in determining whether a reference is to be
treated as volatile.  (Which means we need to keep track of, or at least
be able to find, both the type of the lvalue and the type of the
underlying object.  As you say, gcc may have some implementation issues
with this.)

And we don't have to document the behavior at all; it is not documented
now.

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Alexandre Oliva

On Apr 28, 2005, David Edelsohn <[EMAIL PROTECTED]> wrote:

>> Joe Buck writes:
Joe> Is there a reason why we aren't using a recent libtool?

>   Porting and testing effort to upgrade. 

FWIW, I'd love to switch to a newer version of libtool, but I don't
have easy access to as many OSs as I used to several years ago, so
whatever testing I could offer would be quite limited.

The other issue is that I'm aware of some changes that we've adopted
in GCC libtool that are in libtool CVS mainline (very unstable), but
not in the libtool 1.5 branch (stable releases come out of it) nor in
the 2.0 branch (where the next major stable release is hopefully soon
coming from).

As much as I'd rather avoid switching from one random CVS snapshot of
libtool, now heavily patched, to another random CVS snapshot, it's
either that or waiting a long time until 2.0 is released, then
backport whatever features from libtool mainline we happen to be
relying on.  Or even wait for 2.2.

At this point, it doesn't feel like switching to 1.5.16 is worth the
effort.  2.0 should be far more maintainable, and hopefully
significantly more efficient on hosts where the use of shell functions
optimized for properties of the build machine and/or the host
machine can bring us such improvement.

Thoughts?

-- 
Alexandre Oliva http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Alexandre Oliva

On Apr 29, 2005, Jakub Jelinek <[EMAIL PROTECTED]> wrote:

> On Fri, Apr 29, 2005 at 10:47:06AM +0100, Andrew Haley wrote:
>> Ian Lance Taylor writes:
>> > 
>> > And, yes, we clearly need to do something about the libjava build.
>> 
>> OK, I know nothing about libtool so this might not be possible, but
>> IMO the easiest way of making a dramatic difference is to cease to
>> compile every file twice, once with PIC and once without.  There would
>> be a small performance regression for statically linked Java apps, but
>> in practice Java is very hard to use with static linkage.

> Definitely.  For -static you either have the choice of linking the
> binary with -Wl,--whole-archive for libgcj.a (and likely other Java libs),
> or spend a lot of time adding more and more objects that are really
> needed, but linker doesn't pick them up.

> For the distribution, we simply remove all Java *.a libraries, but it would
> be a build time win if we don't have to compile them altogether.

We had a patch that did exactly this at some point, but RTH said it
broke GNU/Linux/alpha and never gave me the details on what the
failure mode was, and I wasn't able to trigger the error myself.  I
still have the patch in my tree, and it does indeed save lots of
cycles.

-- 
Alexandre Oliva http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Alexandre Oliva

On Apr 29, 2005, Richard Henderson <[EMAIL PROTECTED]> wrote:

> On Fri, Apr 29, 2005 at 01:30:13PM -0400, Ian Lance Taylor wrote:
>> I don't know of a way to tell libtool to not do duplicate compiles.
>> You can use -prefer-pic, but at least from looking at the script it
>> will still compile twice, albeit with -fPIC both times.

> Incidentally, libtool does not compile things twice when you use
> convenience libraries.

Yes, it does, because when you compile libtool still doesn't know
you're going to only use the object file for convenience libraries.
Besides, the fact that only the PIC version of object files is used
for convenience libraries is effectively a limitation of libtool, that
should eventually be addressed.

We should try to reinstate that --tag disable-static patch and get
detailed information on what broke for you, and fix that.

-- 
Alexandre Oliva http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}

Re: parse bug in 4.0.0?

2005-05-03 Thread Gabriel Dos Reis

Paul Koning <[EMAIL PROTECTED]> writes:

| This test program:
| 
| struct bar;
| 
| template  struct bar *foo (T *p)
| {
| return p->t;
| }
| 
| produces an error in 4.0.0:

yes, a parser bug.
good candidate for bugzilla PR.

-- Gaby

[gomp] OpenMP IL design notes

2005-05-03 Thread Diego Novillo

I have started working on connecting Dmitry's OpenMP parser to
the middle-end so that we can start generating the basic runtime
calls, which Richard should be posting soon.  With any luck, we
should have some basic functionality in a few weeks.

Initially, we will be outlining parallel sections into their own
functions.  This is mostly for implementation convenience.
However, long term we are better off incorporating parallel
markers into the IL so that we can do a better job analyzing and
optimizing.

It may be marginally quicker to be able to launch threads that
execute the same body of code because it avoids the argument
passing overhead for shared stuff and the memory indirection in
the launched functions.  But mostly, I'm interested in the IL
elements for optimization and analysis.  Launching multiple
threads on the same function body may give us more headaches than
it's worth ATM.
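
As a rough illustration of the outlining described above, the sketch
below shows by hand what moving a parallel body into its own function
could look like.  The names omp_data, parallel_body and launch_team are
hypothetical, and launch_team is only a sequential stand-in for whatever
the runtime library ends up providing; the point is the argument passing
and the extra memory indirection for shared variables.

```c
#include <stddef.h>

/* Hypothetical record holding the addresses of the shared variables
   referenced by the parallel body.  */
struct omp_data
{
  int *sum;        /* shared: accessed through a pointer */
  const int *vec;  /* shared input array */
  size_t n;        /* number of elements in vec */
};

/* The outlined body.  Every access to a shared variable goes through
   the data pointer, which is the indirection mentioned above.  */
static void
parallel_body (void *arg, int thread_num, int num_threads)
{
  struct omp_data *data = arg;
  size_t i;

  /* Each team member handles the elements congruent to its number.  */
  for (i = (size_t) thread_num; i < data->n; i += (size_t) num_threads)
    *data->sum += data->vec[i];
}

/* Sequential stand-in for a runtime entry point (an assumption, not
   the real library interface): runs the body once per team member.  */
static void
launch_team (void (*fn) (void *, int, int), void *data, int num_threads)
{
  int t;

  for (t = 0; t < num_threads; t++)
    fn (data, t, num_threads);
}

/* What the original function is left with after outlining.  */
int
sum_vector (const int *vec, size_t n, int num_threads)
{
  int sum = 0;
  struct omp_data data = { &sum, vec, n };

  launch_team (parallel_body, &data, num_threads);
  return sum;
}
```

A real runtime would run the team members concurrently, so the update
of the shared sum would also need protection or a reduction.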

Essentially, we will have an IL expression for every OpenMP
pragma.  These expressions are GENERIC and the gimplifier work is
mostly in the bodies.  With few exceptions, the controlling
predicates and clauses are required to be in more or less GIMPLE
form by the standard already.

The lowering will, for now, just create a new function and
replace the block of code along the lines of tree-nested.c.
However, in the future, the parallel sections will be
single-entry single-exit regions in the CFG with the controlling
GOMP_PARALLEL_... expression as the entry block and a latch-like
exit block.  The parallel region building can be modeled after
the loop structure, but there isn't as much nesting, so it
shouldn't be too complex.  As an aside, we do need CFG region
building and the ability to have the optimizers work on
sub-regions (currently being worked on, as I understand).

In fact, even if we don't end up launching threads on the same
function body, we can keep the parallel region inside the
function throughout the optimizers and outline it at a later
point (before RTL, perhaps).

Some runtime library calls (synchronization mostly), ought to be
recognizable as such by the optimizers.  I am not sure whether to
define them as builtins, provide an attribute or make them IL
expressions.  Any suggestions/ideas?

The IL constructs mostly mirror their #pragma counterparts.  Take
these as a design draft, I have only started working on the
implementation, so I expect the design to evolve as I implement
things.  There may also be several hidden assumptions that I
expect to become embarrassingly obvious in a few weeks.  Names
prefixed with "g_" below mean "the gimplified form of ...".


Parallel regions


#pragma omp parallel [clause1 ... clauseN]
--

  GENERIC
GOMP_PARALLEL 
  
  GIMPLE
GOMP_PARALLEL 
L1:
  g_body
L2:


#pragma omp for [clause1 ... clauseN]
-

  GENERIC
GOMP_FOR 

  GIMPLE
GOMP_FOR 
L1:
  g_body
L2:

Both INIT-EXPR and INCR-EXPR are required to be in GIMPLE
form by the standard already, so there's little that needs
to be done there.  Keeping them in the header itself
makes it easy to reference later when we're generating
code.


#pragma omp sections [clause1 ... clauseN]
--

  GENERIC
GOMP_SECTIONS 

  GIMPLE
GOMP_SECTIONS 
L1:
  g_body
L2:



#pragma omp section
---

  GENERIC
GOMP_SECTION 

  GIMPLE
GOMP_SECTION 
L1:
  g_body
L2:



#pragma omp single [clause1 ... clauseN]


  GENERIC
GOMP_SINGLE 

  GIMPLE
GOMP_SINGLE 
L1:
  g_body
L2:



#pragma omp master
--

  GENERIC
GOMP_MASTER 

  GIMPLE
GOMP_MASTER 
L1:
  g_body
L2:


#pragma omp critical [name]
---

  GENERIC
GOMP_CRITICAL 

  GIMPLE
GOMP_CRITICAL 
L1:
  g_body
L2:

  Here, NAME is something the runtime needs to recognize.  It will
  essentially be the name of the lock to use when emitting the
  appropriate lock call.
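
  A minimal sketch of that idea, assuming the lock is an ordinary POSIX
  mutex whose identifier is derived from NAME.  The identifiers
  lock_update_total and add_to_total are hypothetical; the actual
  runtime calls are not specified here.

```c
#include <pthread.h>

/* Hypothetical lock that a construct such as
   "#pragma omp critical (update_total)" could be keyed on.  */
static pthread_mutex_t lock_update_total = PTHREAD_MUTEX_INITIALIZER;

static long total;

/* Hand-written equivalent of
     #pragma omp critical (update_total)
       { total += amount; result = total; }
   The body is bracketed by a lock/unlock pair on the named lock.  */
long
add_to_total (long amount)
{
  long result;

  pthread_mutex_lock (&lock_update_total);
  total += amount;
  result = total;
  pthread_mutex_unlock (&lock_update_total);
  return result;
}
```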


#pragma omp barrier
---

  GENERIC
  GIMPLE
GOMP_BARRIER


#pragma omp atomic
---

  GENERIC
  GIMPLE
GOMP_ATOMIC 

  The standard is sufficiently strict that we don't need additional
  gimplification here.  EXPRESSION-STATEMENT can only be of the form
  'VAR binop= EXPR', where EXPR must be of scalar type.  ATM, it's not
  absolutely clear to me if EXPR needs to be a GIMPLE RHS already or
  if it could be more complex.  It certainly can't reference VAR.
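
  For illustration only, one way such a statement could be expressed by
  hand is with GCC's __sync builtins.  The names hits and record_hits
  are hypothetical, and this is a sketch of the 'VAR binop= EXPR' shape
  rather than a statement of what the lowering will emit.

```c
/* Shared counter updated by several threads.  */
static int hits;

/* Hand-written equivalent of
     #pragma omp atomic
       hits += amount;
   AMOUNT is evaluated before the update and does not reference the
   variable being updated.  Returns the value after the addition.  */
int
record_hits (int amount)
{
  return __sync_add_and_fetch (&hits, amount);
}
```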


#pragma omp flush (var-list)


  GENERIC
  GIMPLE
GOMP_FLUSH 


#pragma omp ordered
---

  GENERIC
GOMP_ORDERED 

  GIMPLE
GOMP_O

Re: GCC 4.0, Fast Math, and Acovea

2005-05-03 Thread Scott Robert Ladd

tbp wrote:
> On 4/29/05, Uros Bizjak <[EMAIL PROTECTED]> wrote:
> > Hello Scott!
>
> Hello Scott & Uros,
>
> > > Specifically, the -funsafe-math-optimizations flag doesn't work
> > > correctly on AMD64 because the default on that platform is
> > > -mfpmath=sse. Without specifying -mfpmath=387,
> > > -funsafe-math-optimizations does not generate inline processor
> > > instructions for most floating-point functions.
> >
> > [snip]
> >
> > It was found that moving data from SSE registers to X87 registers (and
> > back) only to call an x87 builtin degrades performance. Because of this,
> > x87 builtins are disabled for -mfpmath=sse and a normal libcall is
> > issued for sin(), etc functions. If someone wants to use x87 builtins,
> > then _all_ math operations should be done in x87 registers to avoid
> > costly SSE->x87 moves.
>
> Shameless plug with my own performance analysis regarding SSE on x86-64.
> I've ported my coherent raytracer which mostly uses intrinsics in the
> hot path (and no transcendentals).
> While gcc4.x compiled binaries are ~5% slower than those compiled with
> icc8.1 on ia32 (best case), it's the other way around on x86-64 if not
> more (on my opteron with icc8.1 and beta 9.0).
> Obviously there's much less pressure on the (cough weak cough)
> register allocator and in the end the generated code is way leaner.
> My only gripe with fast-math is that it's the only way to enable some
> optimizations while making NaNs verboten; couple that with the lack
> of cross unit IPO and you're stuck with a kind of nasty "global"
> switch (unless you have room for some function calls).


Granted, POV-Ray may not be state-of-the-art, but then, I know quite a 
few people who say that (even legitimately) about just about every 
software product in existence.

If you have a suggestion for better benchmarks, I'm listening. Is your 
ray tracer available?

..Scott

Re: GCC 4.0, Fast Math, and Acovea

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 04:45:55PM -0400, Scott Robert Ladd wrote:

> If you have a suggestion for better benchmarks, I'm listening. Is your 
> ray tracer available?
> 
I recently heard of Openbench, a project to create an open
version of the SPEC benchmarks http://www.exactcode.de/oss/openbench/

Like them or hate them, SPEC has become the standard and their
CPU tests are not altogether bad.  But benchmarking is such an
iffy endeavour, that you will have a very hard time trying to
satisfy everybody.

Diego.

Access to benchmark page from our front page

2005-05-03 Thread Diego Novillo

ISTR a link from GCC's home page into http://gcc.gnu.org/benchmarks/
but it doesn't seem to be there anymore.  Shouldn't it be on the
index on the left at least?


Thanks.  Diego.

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Lars Segerlund


 Okie, 

  I will try to look it over, right now I am very busy, and I don't know when
 I can get back.
  I have two remarks so far: the first is that we have to extend the gfortran
 internal representation also, and the second is that perhaps we don't have to
 have a 1 to 1 mapping of OMP to IL ( think of variables and such, I might be
 wrong but I think we perhaps can do the same thing a bit easier ).

 / regards, Lars Segerlund.


On Tue, 3 May 2005 16:42:47 -0400
Diego Novillo <[EMAIL PROTECTED]> wrote:

> I have started working on connecting Dmitry's OpenMP parser to
> the middle-end so that we can start generating the basic runtime
> calls, which Richard should be posting soon.  With any luck, we
> should have some basic functionality in a few weeks.
> 
> Initially, we will be outlining parallel sections into their own
> functions.  This is mostly for implementation convenience.
> However, long term we are better off incorporating parallel
> markers into the IL so that we can do a better job analyzing and
> optimizing.
> 
> It may be marginally quicker to be able to launch threads that
> execute the same body of code because it avoids the argument
> passing overhead for shared stuff and the memory indirection in
> the launched functions.  But mostly, I'm interested in the IL
> elements for optimization and analysis.  Launching multiple
> threads on the same function body may give us more headaches than
> it's worth ATM.
> 
> Essentially, we will have an IL expression for every OpenMP
> pragma.  These expressions are GENERIC and the gimplifier work is
> mostly in the bodies.  With few exceptions, the controlling
> predicates and clauses are required to be in more or less GIMPLE
> form by the standard already.
> 
> The lowering will, for now, just create a new function and
> replace the block of code along the lines of tree-nested.c.
> However, in the future, the parallel sections will be
> single-entry single-exit regions in the CFG with the controlling
> GOMP_PARALLEL_... expression as the entry block and a latch-like
> exit block.  The parallel region building can be modeled after
> the loop structure, but there isn't as much nesting, so it
> shouldn't be too complex.  As an aside, we do need CFG region
> building and the ability to have the optimizers work on
> sub-regions (currently being worked on, as I understand).
> 
> In fact, even if we don't end up launching threads on the same
> function body, we can keep the parallel region inside the
> function throughout the optimizers and outline it at a later
> point (before RTL, perhaps).
> 
> Some runtime library calls (synchronization mostly), ought to be
> recognizable as such by the optimizers.  I am not sure whether to
> define them as builtins, provide an attribute or make them IL
> expressions.  Any suggestions/ideas?
> 
> The IL constructs mostly mirror their #pragma counterparts.  Take
> these as a design draft, I have only started working on the
> implementation, so I expect the design to evolve as I implement
> things.  There may also be several hidden assumptions that I
> expect to become embarrassingly obvious in a few weeks.  Names
> prefixed with "g_" below mean "the gimplified form of ...".
> 
> 
> Parallel regions
> 
> 
> #pragma omp parallel [clause1 ... clauseN]
> --
> 
>   GENERIC
>   GOMP_PARALLEL 
>   
>   GIMPLE
>   GOMP_PARALLEL 
>   L1:
> g_body
>   L2:
> 
> 
> #pragma omp for [clause1 ... clauseN]
> -
> 
>   GENERIC
>   GOMP_FOR  body>
> 
>   GIMPLE
>   GOMP_FOR  incr-expr, L1, L2>
>   L1:
> g_body
>   L2:
> 
>   Both INIT-EXPR and INCR-EXPR are required to be in GIMPLE
>   form by the standard already, so there's little that needs
>   to be done there.  Keeping them in the header itself
>   makes it easy to reference later when we're generating
>   code.
> 
> 
> #pragma omp sections [clause1 ... clauseN]
> --
> 
>   GENERIC
>   GOMP_SECTIONS 
> 
>   GIMPLE
>   GOMP_SECTIONS 
>   L1:
> g_body
>   L2:
> 
> 
> 
> #pragma omp section
> ---
> 
>   GENERIC
>   GOMP_SECTION 
> 
>   GIMPLE
>   GOMP_SECTION 
>   L1:
> g_body
>   L2:
> 
> 
> 
> #pragma omp single [clause1 ... clauseN]
> 
> 
>   GENERIC
>   GOMP_SINGLE 
> 
>   GIMPLE
>   GOMP_SINGLE 
>   L1:
> g_body
>   L2:
> 
> 
> 
> #pragma omp master
> --
> 
>   GENERIC
>   GOMP_MASTER 
> 
>   GIMPLE
>   GOMP_MASTER 
>   L1:
> g_body
>   L2:
> 
> 
> #pragma omp critical [name]
> ---
> 
>   GENERIC
>   GOMP_CRITICAL 
> 
>   GIMPLE
>   GOMP_CRITICAL 
>   L1:
> g_body
>   L2:
> 
>   Here, NAME is something the runtime needs to recognize.  It will
>   essentially be the na

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Richard Henderson

On Tue, May 03, 2005 at 04:42:47PM -0400, Diego Novillo wrote:
>   GENERIC
>   GIMPLE
>   GOMP_ATOMIC 

Do we gain anything over expanding this to the appropriate __sync_foo
builtin in the front end?

>   GENERIC
>   GIMPLE
>   GOMP_FLUSH 

Likewise.

> #pragma omp threadprivate
> -
> 
>   This will just set an attribute in each affected _DECL.
>   Accessible with GOMP_THREADPRIVATE.

My intention is to use TLS for this, and to NOT support this feature
on any system that doesn't support TLS.  Thus this bit is synonymous
with DECL_THREAD_LOCAL.
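
A small sketch of what that means at the source level, assuming a
target with TLS support and GCC's __thread storage class.  The names
calls and count_call are hypothetical.

```c
/* A variable that would be named in "#pragma omp threadprivate (calls)".
   With TLS, the __thread storage class gives every thread its own
   copy, so no locking is needed to update it.  */
static __thread int calls;

/* Counts the calls made by the current thread only.  */
int
count_call (void)
{
  return ++calls;
}
```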

>   GIMPLE  Same, with EXPR in GIMPLE form as per FE rules.
>   If missing, it defaults to INTEGER_ONE_NODE for
>   GOMP_SCHED_DYNAMIC and GOMP_SCHED_GUIDED.  It
>   defaults to iteration-space / num-threads for
>   GOMP_SCHED_STATIC and it emits getenv reads from
>   environment for GOMP_SCHED_RUNTIME.

The getenv is done in the library.  You can leave the kind field
NULL, or set it to INTEGER_ZERO_NODE as you choose for the tree
level.
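
For reference, the static default quoted above (iteration space divided
by the number of threads) could be computed along these lines.  This is
only a sketch: struct chunk and static_chunk are hypothetical names, and
rounding the block size up is an assumption made here so that no
iteration is left over.

```c
/* Iteration range [begin, end) assigned to one thread.  */
struct chunk
{
  long begin;
  long end;
};

/* Split n_iter iterations into num_threads contiguous blocks and
   return the block for thread_num.  Trailing threads may get a short
   or empty block when n_iter is not a multiple of num_threads.  */
struct chunk
static_chunk (long n_iter, int num_threads, int thread_num)
{
  long size = (n_iter + num_threads - 1) / num_threads;
  struct chunk c;

  c.begin = size * thread_num;
  if (c.begin > n_iter)
    c.begin = n_iter;
  c.end = c.begin + size;
  if (c.end > n_iter)
    c.end = n_iter;
  return c;
}
```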

> data_clauses
> 
> 
> * CLAUSE  private (variable_list)
>   copyprivate (variable_list)
>   firstprivate (variable_list)
>   lastprivate (variable_list)
>   shared (variable_list)
>   copyin (variable_list)
>   
>   GENERIC These are fields in the GOMP_PARALLEL expression.
>   Accessed with:
> 
>   GOMP_PRIVATE
>   GOMP_FIRSTPRIVATE
>   GOMP_SHARED
>   GOMP_COPYIN
> 
>   GIMPLE  Same, with variable_list gimplified as per FE
>   rules.

These shouldn't need gimplification.  We should only have decls in
this list.

> * CLAUSE  default (shared | none)
> 
>   GENERIC This is a boolean field in the GOMP_PARALLEL
>   expression.
> 
>   GIMPLE  Same.

IMO this shouldn't escape the front end.  We have different requirements
for Fortran and C.  We should require that front ends do all symbol
resolution and provide GENERIC with a complete list of decls.  What 
reaches GENERIC should be equivalent to default(none) -- that is, all
variables are either (1) declared inside BIND_EXPRs inside the body of
the block, or (2) mentioned in one of the relevant variable lists.
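
As a source-level illustration of that invariant (the function
dot_product is a made-up example), every variable used in the region
below is either declared inside the body or named in a clause:

```c
/* Every variable used inside the region is either declared inside the
   body or named in a data clause, which is what default(none)
   requires.  Without OpenMP support the pragma is ignored and the
   loop simply runs sequentially.  */
int
dot_product (const int *a, const int *b, int n)
{
  int result = 0;
  int i;

#pragma omp parallel for default(none) shared(a, b, n) private(i) \
        reduction(+: result)
  for (i = 0; i < n; i++)
    {
      int term = a[i] * b[i];   /* declared inside the body */
      result += term;
    }
  return result;
}
```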

> * CLAUSE  reduction (operator : variable_list)
> 
>   GENERIC A structure inside GOMP_PARALLEL with two fields
> 
>   enum tree_code operator -> PLUS_EXPR,
>  MULT_EXPR,
>  MINUS_EXPR,
>  BIT_AND_EXPR,
>  BIT_XOR_EXPR,
>  BIT_IOR_EXPR,
>  AND_EXPR,
>  OR_EXPR
>   tree variable_list
> 
>   GIMPLE  Same, with variable_list gimplified as per FE
>   rules.

This isn't generic enough.  The reduction clause can be specified multiple
times.  Thus we can see

#pragma omp for reduction(+: a, b) reduction(*: c, d)

I assume the best option would be a list or vector of operator/variable
pairs.
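
Something along these lines, perhaps.  This is a standalone sketch, not
compiler code: the enum stands in for the real tree codes, variables are
plain strings instead of decls, and all of the type and function names
are hypothetical.

```c
#include <stddef.h>

/* Stand-ins for the real tree codes; only the names come from the
   list quoted above.  */
enum reduction_op { PLUS_EXPR, MULT_EXPR, MINUS_EXPR };

/* One reduction clause: an operator plus the variables it applies to.  */
struct reduction_clause
{
  enum reduction_op op;
  const char *const *vars;
  size_t n_vars;
};

/* Representation of
     #pragma omp for reduction(+: a, b) reduction(*: c, d)
   as an array of operator/variable-list pairs.  */
static const char *const plus_vars[] = { "a", "b" };
static const char *const mult_vars[] = { "c", "d" };

static const struct reduction_clause example_reductions[] = {
  { PLUS_EXPR, plus_vars, 2 },
  { MULT_EXPR, mult_vars, 2 },
};

/* Count the variables covered by n clauses.  */
size_t
count_reduction_vars (const struct reduction_clause *clauses, size_t n)
{
  size_t i, total = 0;

  for (i = 0; i < n; i++)
    total += clauses[i].n_vars;
  return total;
}
```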

Also note that reduction is also legal on for constructs, and that the
firstprivate, lastprivate, and copyprivate clauses are legal on other
work sharing constructs.

r~

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 11:05:05PM +0200, Lars Segerlund wrote:

>   I will try to look it over, right now I am very busy, and I
>   don't know when I can get back. I have two remarks so far, the
>   first is that we have to extend the gfortran internal
>   representation also, and the second is that perhaps we don't
>
Yes, initially most of the effort will be in C/C++ since that's
the only parser we have so far.

>   have to have a 1 to 1 mapping of OMP to IL, ( think of
>   variables and such, I might be wrong but I think we perhaps
>   can do the same thing a bit easier ).
> 
Hmm?  I'm not quite following you here.


Thanks.  Diego.

Re: GCC 4.0, Fast Math, and Acovea

2005-05-03 Thread Alexander Strange

On May 3, 2005, at 4:54 PM, Diego Novillo wrote:
> On Tue, May 03, 2005 at 04:45:55PM -0400, Scott Robert Ladd wrote:
>
>> If you have a suggestion for better benchmarks, I'm listening. Is your
>> ray tracer available?
>
> I recently heard of Openbench, a project to create an open
> version of the SPEC benchmarks http://www.exactcode.de/oss/openbench/

There's also this benchmark project, although it's nowhere near
complete yet: http://arsware.org/cms/showpage.php?cid=104

Re: volatile semantics

2005-05-03 Thread Thorsten Glaser

Mike Stump dixit:

> int avail;
> int main() {
>   while (*(volatile int *)&avail == 0)
>     continue;
>   return 0;
> }

3.4.4 fetches too. I get:

.L2:
        mov     %eax, DWORD PTR avail
        test    %eax, %eax
        je      .L2

This is at -O99, other levels produce similar results.

//mirabile
-- 
> Hi, does anyone sell openbsd stickers by themselves and not packaged
> with other products?
No, the only way I've seen them sold is for $40 with a free OpenBSD CD.
-- Haroon Khalid and Steve Shockley in gmane.os.openbsd.misc

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 02:16:35PM -0700, Richard Henderson wrote:
> On Tue, May 03, 2005 at 04:42:47PM -0400, Diego Novillo wrote:
> >   GENERIC
> >   GIMPLE
> > GOMP_ATOMIC 
> 
> Do we gain anything over expanding this to the appropriate __sync_foo
> builtin in the front end?
> 
Can the optimizers tell that this is an atomic builtin?  If so,
then no, they're not necessary.

> My intention is to use TLS for this, and to NOT support this feature
> on any system that doesn't support TLS.  Thus this bit is synonymous
> with DECL_THREAD_LOCAL.
> 
OK, good.

> These shouldn't need gimplification.  We should only have decls in
> this list.
> 
That's what I thought at first, but the standard threw me into a
loop when it mentioned "id-expression" instead of just
"identifier" in the C++ case.  If they're essentially the same,
then great.

> > * CLAUSEdefault (shared | none)
> > 
> >   GENERIC   This is a boolean field in the GOMP_PARALLEL
> > expression.
> > 
> >   GIMPLESame.
> 
> IMO this shouldn't escape the front end.  We have different requirements
> for Fortran and C.  We should require that front ends do all symbol
> resolution and provide GENERIC with a complete list of decls.  What 
> reaches GENERIC should be equivalent to default(none) -- that is, all
> variables are either (1) declared inside BIND_EXPRs inside the body of
> the block, or (2) mentioned in one of the relevant variable lists.
> 
OK, that's certainly simpler.


>   #pragma omp for reduction(+: a, b) reduction(*: c, d)
> 
> I assume the best option would be a list or vector of operator/variable
> pairs.
> 
Yes.  I was only referring to a single instance of reduction.
We'd have to have a vector of those.

> Also note that reduction is also legal on for constructs, and that the
> firstprivate, lastprivate, and copyprivate clauses are legal on other
> work sharing constructs.
> 
Yes, I tried to express that by putting common clauses in
data_clauses and have the various constructs reference it.


Diego.

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Joe Buck

On Tue, May 03, 2005 at 04:57:10PM -0300, Alexandre Oliva wrote:
> At this point, it doesn't feel like switching to 1.5.16 is worth the
> effort.  2.0 should be far more maintainable, and hopefully
> significantly more efficient on hosts where the use of shell functions
> optimized for properties of the build machine and/or the host
> machine can bring us such improvement.

> Thoughts?

Richard Henderson showed that the libjava build spends 2/3 of its time
in libtool, and that his hand-hacked (but not portable) modification to
invoke the appropriate binutils commands directly gave a huge speedup.
To me, 300% overhead means major breakage, so we need a better libtool.
However, this better libtool might not yet exist.

Re: Q about Ada and value ranges in types

2005-05-03 Thread Richard Kenner

 Yeah, I didn't show all of it, sorry.  My patch to address this
 problem includes a more detailed description
 (http://gcc.gnu.org/ml/gcc-patches/2005-05/msg00127.html).

As of right now, I don't think this is a VRP problem, but something wrong
with the tree Ada produces.

 Configure a compiler for target i386-pc-linux-gnu (or any other
 i386 variant, not sure if it occurs elsewhere) and compile
 ada/sem_intr.adb with:

I'm out of town until tomorrow and will do this then.

[wwwdocs] patch for Re: Access to benchmark page from our front page

2005-05-03 Thread Gerald Pfeifer

On Tue, 3 May 2005, Diego Novillo wrote:
> ISTR a link from GCC's home page into http://gcc.gnu.org/benchmarks/
> but it doesn't seem to be there anymore.  Shouldn't it be on the
> index on the left at least?

You mean, like the following?  Good idea.

Installed.

Gerald

Index: style.mhtml
===
RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v
retrieving revision 1.74
diff -u -3 -p -r1.74 style.mhtml
--- style.mhtml 17 Nov 2004 05:57:51 -  1.74
+++ style.mhtml 3 May 2005 22:33:35 -
@@ -241,6 +241,7 @@
   Front ends
   Back ends
   Extensions
+  Benchmarks

Re: No link to benchmarks page ?

2005-05-03 Thread Gerald Pfeifer

On Tue, 3 May 2005 [EMAIL PROTECTED] wrote:
> is there any link to gcc.gnu.org/benchmarks on the web pages ?

Yes, but _rather_ well hidden, to be honest.

Thanks for the hint! I just added a pointer to that page from the 
navigation bar on our main page.

Gerald

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Richard Henderson

On Tue, May 03, 2005 at 05:27:26PM -0400, Diego Novillo wrote:
> > Do we gain anything over expanding this to the appropriate __sync_foo
> > builtin in the front end?
> > 
> Can the optimizers tell that this is an atomic builtin?  If so,
> then no, they're not necessary.

Sure, in the same way we know what "strlen" is.

> That's what I thought at first, but the standard threw me into a
> loop when it mentioned "id-expression" instead of just
> "identifier" in the C++ case.  If they're essentially the same,
> then great.

id-expression is a non-terminal from the iso c++ grammar.  ;-)

It means someone can write e.g. Class::static_member.  As far
as you are concerned it means DECL.

> Yes.  I was only referring to a single instance of reduction.
...
> Yes, I tried to express that by putting common clauses in
> data_clauses and have the various constructs reference it.

Ah, you confused me.


r~

Re: Global Objects initialization Problem.......

2005-05-03 Thread James E Wilson

Satendra Pratap wrote:
> I am using a cross compiler "sparclet-aout-gcc". I have written my own
> main function and do not link to libgcc's main function while
> linking is done. I am not able to initialize the global objects. The
> generated executable format is "a.out".

You have so much modified stuff here that it is unlikely a volunteer
will be willing or able to help you.  You will have to debug this
yourself.  Step through your initialization functions in gdb to see why
the global constructor isn't being called.  Try using objdump to look at
the program, and make sure the appropriate sections have appropriate data.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

Re: Dirac, GCC-4.0.0 and SIMD optimisations on x86 architecture

2005-05-03 Thread James E Wilson

Anuradha Suraparaju wrote:
> My question is how do I report this as a bug? What information do I
> need to provide in the bug report? Did anybody else face similar
> problems with GCC-4.0.0 and MMX-enabled programs?

See
http://gcc.gnu.org/bugs.html
for info on reporting bugs.

If you can narrow this down to a small testcase, then you are more
likely to get a solution from us.  If you want us to compile the entire
Dirac project and take a look, we probably won't bother.

There have been changes to the MMX support in gcc, but without specific 
details about your testcase, it is hard to say anything definite.  For 
instance, we don't know what the Dirac --enable-mmx option does.  Which 
specific gcc options does it enable?

What about SSE?  The SSE support is generally preferred over the older 
MMX support.  Does Dirac make any use of this?  If not, perhaps it should.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

RE: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Dave Korn

Original Message
>From: Joe Buck
>Sent: 03 May 2005 23:04

> On Tue, May 03, 2005 at 04:57:10PM -0300, Alexandre Oliva wrote:
>> At this point, it doesn't feel like switching to 1.5.16 is worth the
>> effort.  2.0 should be far more maintainable, and hopefully
>> significantly more efficient on hosts where the use of shell functions
>> optimized for properties of the build machine and/or the host
>> machine can bring us such improvement.
> 
>> Thoughts?
> 
> Richard Henderson showed that the libjava build spends 2/3 of its time
> in libtool, and that his hand-hacked (but not portable) modification to
> invoke the appropriate binutils commands directly gave a huge speedup.
> To me, 300% overhead means major breakage, so we need a better libtool.
> However, this better libtool might not yet exist.

  Ok, here's a really *nasty* kludge:  libtool is basically a big script
that generates command lines for the other tools based on passed-in args and
local configure settings, yeh?  And a lot of the time it's used for lots and
lots and lots of library files one after another in exactly the same way,
yes?

  So couldn't quite a lot of its uses be replaced in the makefile machinery
with something that invokes it just once (for a given batch of library
source files in a single build object subdir), and records the command line
that it generates, and just uses sed to duplicate all the different source
and object file names into the right places in many repetitions of it?

  I did a little experiment.  I took a recent build directory of HEAD.  I
keep all the output from the build process in a script file, so I fetched
all the lines that invoke libtool with

grep -B0 -A1 libtool build.log | grep -v -- ^-- > libt.txt

  This hopefully gets the libtool invocation and the command line that
libtool generates.  It also gets a few false positives, such as mentions in
configure output, but they're a small number.  The output contains 761
lines, 744 of which are unique. 

[EMAIL PROTECTED] /gnu/obj-HEAD> grep -B0 -A1 libtool build.log | grep -v -- ^-- > libt.txt
[EMAIL PROTECTED] /gnu/obj-HEAD> wc -l libt.txt
761 libt.txt
[EMAIL PROTECTED] /gnu/obj-HEAD> sort < libt.txt | uniq | wc -l
744
[EMAIL PROTECTED] /gnu/obj-HEAD>

  Then I use sed to replace all the names of source and object files I can
find with dummy text:

[EMAIL PROTECTED] /gnu/obj-HEAD> sed < libt.txt \
    -e 's/[-a-zA-Z0-9_]*\.cc/SRC/g' \
    -e 's/[-a-zA-Z0-9_]*\.o/OBJ/g' \
    -e 's/[-a-zA-Z0-9_]*\..o/OBJ/g' \
    -e 's/[-a-zA-Z0-9_]*\.c/SRC/g' \
    -e 's/[-a-zA-Z0-9_]*\.f90/SRC/g' \
    -e 's/[-a-zA-Z0-9_]*\.m/SRC/g' \
    > libt2.txt

  Now look what that does!

[EMAIL PROTECTED] /gnu/obj-HEAD> wc -l libt2.txt
761 libt2.txt
[EMAIL PROTECTED] /gnu/obj-HEAD> sort < libt2.txt | uniq | wc -l
63
[EMAIL PROTECTED] /gnu/obj-HEAD>

  If I manually edit out the false positives from that, there are only 41
lines.

  Now, libjava wasn't included in this build because it seems to be disabled
on cygwin target, but it's probably much the same.  So ISTM that vast
swathes of libtool invocations could be replaced by a far simpler generation
of command lines from templates, with a bit of help from libtool to generate
the templates.  Give libtool an option where it only generates the command
line instead of invoking it, pass it args that look like "-c SRC" and "-o
OBJ", and then postprocess it to substitute real names in.
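
  To make the postprocessing step concrete, here is a minimal sketch of
the substitution, assuming the cached template really does use the
literal tokens SRC and OBJ.  The function expand_template is a
hypothetical helper, not anything libtool provides, and it replaces the
tokens wherever they occur, so a real version would need a placeholder
spelling that cannot appear in other arguments.

```c
#include <string.h>

/* Expand a cached command template into out (at most out_size bytes,
   always NUL-terminated on success).  Each "SRC" token is replaced
   with src and each "OBJ" token with obj.  Returns 0 on success and
   -1 if the result does not fit.  */
int
expand_template (const char *tmpl, const char *src, const char *obj,
                 char *out, size_t out_size)
{
  size_t used = 0;

  while (*tmpl != '\0')
    {
      const char *piece = NULL;
      size_t len;

      if (strncmp (tmpl, "SRC", 3) == 0)
        piece = src;
      else if (strncmp (tmpl, "OBJ", 3) == 0)
        piece = obj;

      if (piece != NULL)
        {
          /* Placeholder: copy the real file name instead.  */
          len = strlen (piece);
          tmpl += 3;
        }
      else
        {
          /* Ordinary character: copy it through unchanged.  */
          piece = tmpl;
          len = 1;
          tmpl += 1;
        }

      /* Leave room for the terminating NUL.  */
      if (used + len >= out_size)
        return -1;
      memcpy (out + used, piece, len);
      used += len;
    }

  if (out_size == 0)
    return -1;
  out[used] = '\0';
  return 0;
}
```

  With a template such as "gcc -c SRC -o OBJ", one libtool run per batch
would be enough, and each further file costs only a string substitution.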

  Is there some terrible gotcha that I don't understand libtool well enough
to see here?

cheers,
  DaveK
-- 
Can't think of a witty .sigline today

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Peter O'Gorman

Dave Korn wrote:
|   Ok, here's a really *nasty* kludge:  libtool is basically a big script
| that generates command lines for the other tools based on passed-in args and
| local configure settings, yeh?  And a lot of the time it's used for lots and
| lots and lots of library files one after another in exactly the same way,
| yes?

Peter
-- 
Peter O'Gorman - http://www.pogma.com

Re: C54x port: some general technical questions

2005-05-03 Thread James E Wilson

Jonathan Bastien-Filiatrault wrote:
> * We have defined BITS_PER_WORD to 16 and UNITS_PER_WORD to 1. On this
> DSP, there are two 40-bit accumulators. How do we make GCC take
> advantage of this and which machine mode do we use ?

GCC has little support for non-power-of-2 sized accumulators.
Traditionally this would be done by using PSImode (because your target
has a 64-bit SImode), which you can enable by using the undocumented
PARTIAL_INT_MODE(SI) macro in your target modes.def file.  See other
targets for existing examples of this.

Also, gcc has little support for targets with BITS_PER_UNIT != 8.  I 
think the (ti)c4x is the only one currently supported, and it is being 
obsoleted due to lack of maintenance.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

RE: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Dave Korn

Original Message
>From: Peter O'Gorman
>Sent: 04 May 2005 00:52

> Dave Korn wrote:
>>   Ok, here's a really *nasty* kludge:  libtool is basically a big script
>> that generates command lines for the other tools based on passed-in args
>> and local configure settings, yeh?  And a lot of the time it's used for
>> lots and lots and lots of library files one after another in exactly the
>> same way, yes?
> 
> 
> 

"  Caching compile commands
gives nice speedups even the first run as the source and target file name are
replaced by a generic placeholder before caching (the 2.6 branch of GTK+
requires only 9 different compile commands), ...  "

  Cool!  I'll give it a try and see if I can get some figures!

cheers,
  DaveK
-- 
Can't think of a witty .sigline today

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Peter O'Gorman

Joe Buck wrote:
| Richard Henderson showed that the libjava build spends 2/3 of its time
| in libtool, and that his hand-hacked (but not portable) modification to
| invoke the appropriate binutils commands directly gave a huge speedup.
| To me, 300% overhead means major breakage, so we need a better libtool.
| However, this better libtool might not yet exist.

Probably doesn't. Ralf has done lots of work on libtool HEAD, making it 20%
faster, but that will not be in a libtool release anytime soon.

Part of the problem here is the use of a convenience library to hold several
thousand object files and then making a shared library with the convenience
library. On many platforms, those without a --whole-archive flag, libtool
will extract the convenience archive all over again. Linking the shared
library all in one go would be faster.

Peter
-- 
Peter O'Gorman - http://www.pogma.com

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Ian Lance Taylor

Diego Novillo <[EMAIL PROTECTED]> writes:

>   GENERIC
>   GOMP_PARALLEL 
>   
>   GIMPLE
>   GOMP_PARALLEL 
>   L1:
> g_body
>   L2:

I personally find it kind of baffling to have the same tree code act
differently in GENERIC and GIMPLE, a la SWITCH_EXPR.  It seems to add
confusion for minimal benefit.  If you are suggesting that the single
tree code GOMP_PARALLEL have different operands in GENERIC and GIMPLE,
can I suggest that you instead use two different tree codes?

Ian

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 08:23:59PM -0400, Ian Lance Taylor wrote:
> Diego Novillo <[EMAIL PROTECTED]> writes:
> 
> >   GENERIC
> > GOMP_PARALLEL 
> >   
> >   GIMPLE
> > GOMP_PARALLEL 
> > L1:
> >   g_body
> > L2:
> 
> I personally find it kind of baffling to have the same tree code act
> differently in GENERIC and GIMPLE, a la SWITCH_EXPR.  It seems to add
> confusion for minimal benefit.  If you are suggesting that the single
> tree code GOMP_PARALLEL have different operands in GENERIC and GIMPLE,
> can I suggest that you instead use two different tree codes?
> 
That is a fundamental feature of both GENERIC and GIMPLE. GIMPLE
is a non-strict subset of GENERIC.  Every program in GIMPLE form
is also in GENERIC form.  The reverse, however, is not true.

If we did that we would need different codes for every tree code
(MODIFY_EXPR, PLUS_EXPR, etc).  Similarly, we would need
different codes when we move from High-GIMPLE into Low-GIMPLE.
I'm not sure that's worth the effort.  From the point of view of
analysis and optimizations the differences between the different
ILs are mostly in the grammar, not their syntax.

We could perhaps incorporate tokens to tell which IL we are
dealing with.  Currently, that is not really necessary because
the IL is given implicitly by the phase of compilation that you
are in.  In the future, we may need to make the distinction if we
become capable of starting the compilation process from an
arbitrary IL dump file.

Diego.

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 03:59:24PM -0700, Richard Henderson wrote:

> Sure, in the same way we know what "strlen" is.
> 
Excellent.  I'll get rid of them then.

> > That's what I thought at first, but the standard threw me into a
> > loop when it mentioned "id-expression" instead of just
> > "identifier" in the C++ case.  If they're essentially the same,
> > then great.
> 
> id-expression is a non-terminal from the iso c++ grammar.  ;-)
> 
Ah, OK.  Standardese puts me to sleep in no time ;)


Diego.

Re: [wwwdocs] patch for Re: Access to benchmark page from our front page

2005-05-03 Thread Diego Novillo

On Wed, May 04, 2005 at 12:48:27AM +0200, Gerald Pfeifer wrote:

> You mean, like the following?  Good idea.
> 
> Installed.
> 
Cool.  Thanks.


Diego.

Re: Q about Ada and value ranges in types

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 06:21:11PM -0400, Richard Kenner wrote:

> As of right now, I don't think this is a VRP problem, but something wrong
> with the tree Ada produces.
> 
That'd be good.  If that's the case, we can make VRP assert that
the range derived from such types agrees with the type's range.



Thanks.  Diego.

Re: Incomplete instantiation of virtual registers

2005-05-03 Thread James E Wilson

Martin Koegler wrote:
> I notice that your last change in function.c forgets virtual
> registers in the RTL in some conditions. In older versions (the last I
> used was 20050412), this has not happened.

Patches should go to gcc-patches instead of the gcc list.
If you want us to continue accepting patches from you, then you need to 
fill out a copyright assignment form.

Recursively calling instantiate_virtual_regs_in_insn does not look 
right.  We need a better explanation of what is going wrong here.  Since 
we don't have a copy of your port, you will have to explain in detail 
what happens as virtual register instantiation is happening, in order to 
explain how this occurs.  Normally, we would ask for a testcase, but 
that won't work in this case.

It is also possible that this is a bug in your port, if you are 
accidentally emitting the virtual register yourself somehow during or 
after virtual register instantiation.

I know of one PR that has since been filed for a problem with the new 
virtual register instantiation code.  That is PR 21328.  This doesn't 
appear to be related.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Ian Lance Taylor

Diego Novillo <[EMAIL PROTECTED]> writes:

> > I personally find it kind of baffling to have the same tree code act
> > differently in GENERIC and GIMPLE, a la SWITCH_EXPR.  It seems to add
> > confusion for minimal benefit.  If you are suggesting that the single
> > tree code GOMP_PARALLEL have different operands in GENERIC and GIMPLE,
> > can I suggest that you instead use two different tree codes?
> > 
> That is a fundamental feature of both GENERIC and GIMPLE. GIMPLE
> is a non-strict subset of GENERIC.  Every program in GIMPLE form
> is also in GENERIC form.  The reverse, however, is not true.

Well, sure.

But it seems to me that there is a difference.  PLUS_EXPR always takes
two operands and adds them together.  Given a PLUS_EXPR, you always know
where to find the operands, and more or less what they mean.  I don't
find that confusing, although I agree that the operands themselves may
be different in GENERIC and GIMPLE.

SWITCH_EXPR is different.  Sometimes you use SWITCH_BODY, and
sometimes you use SWITCH_LABELS.  How do you know which one to use?
It depends on whether you have GENERIC or GIMPLE.

In particular, this matters for functions like block_may_fallthru,
which are called with both GENERIC and GIMPLE.

> If we did that we would need different codes for every tree code
> (MODIFY_EXPR, PLUS_EXPR, etc).  Similarly, we would need
> different codes when we move from High-GIMPLE into Low-GIMPLE.
> I'm not sure that's worth the effort.  From the point of view of
> analysis and optimizations the differences between the different
> ILs are mostly in the grammar, not their syntax.

If I understand what you are saying, I am complaining about the
specific cases where the difference is in the syntax.  If a tree takes
a different set of operands in GENERIC and GIMPLE, or if the operands
have significantly different meanings, then I think we should use a
different tree code.  If the operands are more or less the same, then
I think using the same tree code is fine.
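
A rough sketch of the two alternatives, under stated assumptions: the node type and every toy_* name below are hypothetical and are not GCC's tree API. With one shared code the caller must supply the IL out of band; with two codes the node itself says which operand is meaningful.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not GCC's real tree codes. */
enum toy_code {
    TOY_SWITCH,          /* one code; meaning depends on the current IL */
    TOY_SWITCH_GENERIC,  /* split codes: operand layout fixed per code */
    TOY_SWITCH_GIMPLE
};

enum toy_il { TOY_IL_GENERIC, TOY_IL_GIMPLE };

struct toy_node {
    enum toy_code code;
    const char *cond;    /* controlling expression */
    const char *body;    /* used by the GENERIC-like form */
    const char *labels;  /* used by the GIMPLE-like form */
};

/* Shared code: the caller has to pass the IL, since the node cannot say. */
const char *toy_payload_shared(const struct toy_node *n, enum toy_il il)
{
    assert(n != NULL && n->code == TOY_SWITCH);
    return il == TOY_IL_GENERIC ? n->body : n->labels;
}

/* Split codes: the node alone determines which operand to read. */
const char *toy_payload_split(const struct toy_node *n)
{
    assert(n != NULL);
    switch (n->code) {
    case TOY_SWITCH_GENERIC: return n->body;
    case TOY_SWITCH_GIMPLE:  return n->labels;
    default:                 return NULL;  /* not a split switch node */
    }
}
```

A helper that is called from both ILs would need the extra toy_il argument in the first form, but could use the second form unchanged.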

This is obviously all just my opinion, as somebody who came late to
this stuff and is trying to understand it.

Ian

Re: GCC 4.1: Buildable on GHz machines only?

2005-05-03 Thread Richard Henderson

On Wed, May 04, 2005 at 09:06:14AM +0900, Peter O'Gorman wrote:
> Part of the problem here is the use of a convenience library to hold several
> thousand object files and then making a shared library with the convenience
> library. On many platforms, those without a --whole-archive flag, libtool
> will extract the convenience archive all over again. Linking the shared
> library all in one go would be faster.

Frankly, this is only a very small part of the problem.  MOST of the
time is wasted in the thousands of libtool invocations that precede
the final link.


r~

Re: FW: GCC Cross Compiler for cygwin

2005-05-03 Thread James E Wilson

Kai Ruottu wrote:
> GCC configure.  But there are long-standing bugs in the GCC sources and 
> workarounds/fixes are required.

Since you seem to have an understanding of the problems here, perhaps 
you could file some bugzilla bug reports to document them.

> then not... As told the "eabi" is not and one must
> use a wacky command like :
> powerpc-eabi-gcc  -mads -O2  -o hello_ppc-eabi.x hello.c
> powerpc-eabi-gcc  -myellowknife -O2  -o hello_ppc-eabi.x hello.c

I'd recommend the -msim option here.  It is probably easier to get 
working on the simulator first, before trying to use hardware.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com

Re: Incomplete instantiation of virtual registers

2005-05-03 Thread Richard Henderson

On Tue, May 03, 2005 at 05:44:47PM -0700, James E Wilson wrote:
> Recursively calling instantiate_virtual_regs_in_insn does not look 
> right.

Indeed it is not.

I'd like to see the define_insn for {addhi3}.  I'm a bit confused as
to how I could have missed iterating over what appears like it ought
to be match_operand 0.

> I know of one PR that has since been filed for a problem with the new 
> virtual register instantiation code.  That is PR 21328.

Actually, 21318, but yes this is unrelated.  The symptom in that case
is an ICE in simplify_subreg.

r~

Re: [gomp] OpenMP IL design notes

2005-05-03 Thread Diego Novillo

On Tue, May 03, 2005 at 08:48:20PM -0400, Ian Lance Taylor wrote:

> If I understand what you are saying, I am complaining about the
> specific cases where the difference is in the syntax.
>
Drat, trapped in my own web of logic and definitions ;)  Yes,
that's exactly what I was saying and now I see the inconsistency
you were pointing out.

Hmm, I'm not quite sure how to go about this.  There is another
case where we make these magic passes, COND_EXPR.  In GENERIC
each arm is a BIND_EXPR, in GIMPLE each arm is just a label.  I
was essentially trying to pull the same stunt.

I kind of like the idea of taking an operand code and twisting its
operands slightly when we lower the IL.  But then, I'm not the
kind of person you'd want in a language standards committee.

So, there would be 2 such inconsistencies with our current IL
levels: SWITCH_EXPR and COND_EXPR.  What would you change?
Perhaps it could be a nice cleanup of the abstraction.

Regarding the GOMP_* codes, perhaps it would suffice to do:

GENERIC GOMP_PARALLEL 

GIMPLE  GOMP_PARALLEL 
L1:
  g_body

(we can then either put a GOMP_PARALLEL_END marker, or just
figure out the region using std region building techniques).

Notice that we will be very tempted to change that 'GOTO_EXPR L1'
to just 'L1'.  No point emitting the GOTO_EXPR when we "know"
that it is always a GOTO_EXPR in GIMPLE.  That's what led to the
current COND_EXPR structure.

> This is obviously all just my opinion, as somebody who came late to
> this stuff and is trying to understand it.
> 
Which is exactly the kind of POV that can usually spot
inconsistencies such as this one :)

Thanks.  Diego.

Re: GCC 4.0, Fast Math, and Acovea

2005-05-03 Thread tbp

On 5/3/05, Scott Robert Ladd <[EMAIL PROTECTED]> wrote:
> tbp wrote:
> Granted, POV-Ray may not be state-of-the-art, but then, I know quite a
> few people who say that (even legitimately) about just about every
> software product in existence.
True. Still, POV has evolved from dkbtrace and it shows sometimes.

> If you have a suggestion for better benchmarks, I'm listening. Is your
> ray tracer available?
It's way too rough for general consumption yet, and quite specialized
anyway (very large geometry).

With specific kludges for each compiler, here's the hierarchy for the
hand vectorized rendering:
ia32:   icc8.1, gcc4.1 (-5% at least), msvc2k3 (-20%)
x86-64: gcc4.1, icc9.0 (-7% at least)
It varies a bit, depending on features being hammered by specific
scenes, but the order is unchanged (note that the x86-64 version has
only been tested on k8 so far).

GCC shows an edge in the SAH kdtree compiler part (branchy code) on
x86-64, with a >40% improvement over the ia32 versions (and icc9.1
which definitely gets lost).
That's more than welcome, given the time it takes to produce those
freaking trees :)

Anecdotally, gcc is the only one to get the parsing of large
memory-mapped files right (or, put another way, the idiom used), being
2x faster than every other compiler on every platform.

Packing booleans?

2005-05-03 Thread Sam Lauber

Would it be possible to have a -fpack-bools option that packs booleans into
the smallest form possible (8 booleans -> 1 8-bit reg, etc.) into a register
(or memory, as the case may be)?
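
For reference, a minimal sketch of what such packing looks like when written by hand in C today. The -fpack-bools option above is only a proposal, and the helper names packed_set and packed_get are hypothetical; the sketch shows 8 flags sharing one 8-bit value, at the cost of a mask and shift on every access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return `bits` with flag number `bit` (0..7) set to `value`. */
static uint8_t packed_set(uint8_t bits, unsigned bit, bool value)
{
    assert(bit < 8);
    uint8_t mask = (uint8_t)(1u << bit);
    return value ? (uint8_t)(bits | mask)
                 : (uint8_t)(bits & (uint8_t)~mask);
}

/* Read flag number `bit` (0..7) out of the packed byte. */
static bool packed_get(uint8_t bits, unsigned bit)
{
    assert(bit < 8);
    return ((bits >> bit) & 1u) != 0;
}
```

Each read or write of a packed flag needs extra bit operations, and a packed flag has no address of its own, so any automatic version would have to weigh the space saved against those costs.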

Samuel Lauber