Re: Graphite build fails if PPL configured with --disable-shared
On Mon, 2009-05-11 at 13:07 -0700, Ian Lance Taylor wrote:
> Another Graphite build issue: it appears that I must not use
> --disable-shared when I configure PPL. If I do use --disable-shared, I
> get this:
>
> /home/iant/gnu/ppl-0.10.2-install/lib/libppl_c.a(ppl_c_implementation_common.o):
> In function `finalize':
> /home/iant/gnu/ppl-0.10.2/interfaces/C/../../src/ppl.hh:1842: undefined
> reference to `operator delete(void*)'
>
> followed by thousands of similar errors. This is unfortunate, as it
> means that I must manually set LD_LIBRARY_PATH to the directory where
> the PPL library is installed. This also makes it harder for anybody
> else to run the compiler that I build. This needs to be fixed.
>
>
> Also, a minor issue: cloog "make clean" fails:
>
> rm /version.h
> rm: cannot remove `/version.h': No such file or directory

Last time I tried, I was able to build a GCC with gmp/mpfr/ppl/cloog linked statically and the system libstdc++ linked dynamically:

http://gcc.gnu.org/ml/gcc/2009-03/msg00856.html

As shown in that discussion, the last option must be:

--with-host-libstdc++=/usr/lib/libstdc++.so.6

Laurent
Re: Code generation problem with optimizations enabled
- Original Message > From: Jamie Prescott > To: gcc@gcc.gnu.org > Sent: Monday, May 11, 2009 11:59:23 PM > Subject: Code generation problem with optimizations enabled > If I disable the optimizations, everything is fine and the 'fcmp' is there. > Even with optimizations enabled, the RTL dump shows the missing 'cmpdf' > present > and correctly recognized. It being: What I noticed is that if I CC_STATUS_INIT (in xxx_notice_update_cc()) even for insn that do not require it (that are almost all of them - being only cmp/fcmp/test that modify cc0), cmpdf gets emitted regularly. Normally all the insn but cmp/fcmp/test set "none" in their cc attribute, and xxx_notice_update_cc() does nothing in that case. While cmp and fcmp (that set the cc attribute to "compare") do CC_STATUS_INIT and records DEST and SRC operands. Am I doing it wrong? - Jamie
Re: cout Issue
2009/5/12 Arthur Schwarz: > > Program and particulars below. > > When line 27 is commented out, line 26 is output. When line 27 is not > commented, line 26 is not output except that if x.file contains a line feed > the null line line 26 & line 27 are output. If x.file does not contain a line > feed, only line 27 is output. > > Does the line feed have an effect on the 'cout <<' of line 26 of the program? > > Note. The code is awful and this is an example. Hi Arthur, This question is off-topic on this list as it has nothing to do with development of gcc, the gcc-help list or a C++ forum would be better. I would guess that your file has DOS-style line-endings so a carriage-return is output after line 26, and line 27 is overwriting the output of line 26. Jonathan
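If the guess about DOS-style line endings is right, each line read from x.file still ends in a carriage return once the line feed has been consumed, and printing it moves the cursor back to column 0. The original program is not shown here, so the following is only a minimal sketch in C (not the C++ iostream code from the question), with strip_trailing_cr being a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: remove one trailing '\r' that remains when a
   CRLF-terminated line has been read with its '\n' already stripped.
   Returns 1 if a carriage return was removed, 0 otherwise. */
int strip_trailing_cr(char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\r') {
        line[len - 1] = '\0';
        return 1;
    }
    return 0;
}
```

Calling such a helper on every line before printing it would keep later output from overwriting earlier output on a terminal, assuming the carriage return really is the cause.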
incomplete tree dump with flag -fdump-tree-all
Hi, I found that the tree dump (xxx.c.t00.tu file) with -fdump-tree-all flag is incomplete in gcc-4.1.2. And the tree-dump.c is not modified for this bug up to 4.4.0. When a function body contains a for-statement or if-statement, the stmt-list will break, and the rest of the function body is lost. I did some search and found an early discussion thread about -fdump-tree-original-raw: http://gcc.gnu.org/ml/gcc-bugs/2008-07/msg00695.html I tried that patch and it didn't work for the "tu" dump. And I found that the output with -fdump-tree-all-slim flag is the same as with -fdump-tree-all flag. How can I find the rest of the statements from the broken statement-list? Thx. -Fengzhe
Re: Code generation problem with optimizations enabled
> What I noticed is that if I CC_STATUS_INIT (in xxx_notice_update_cc()) even > for insn that > do not require it (that are almost all of them - being only cmp/fcmp/test > that modify cc0), > cmpdf gets emitted regularly. If so, you should not be using cc0, but a CCmode register instead. See for example how the fr30 port implements compare-and-jump. I'm currently committing a merge that changes a bit how the compare-and-jumps are realized; if you wait a few hours, you'll get a more up-to-date example. Paolo
plugins callbacks and data
Hello All,

In the current plugin API, the function register_callback is used to register callback routines (e.g. for PLUGIN_FINISH_UNIT), in which case the callback is expected to be a routine. But the same function register_callback is also used to register data with plugins, without any callback function, e.g. for PLUGIN_PASS_MANAGER_SETUP.

Perhaps we could have two different functions:

1. register_callback, as before, for true callbacks

and

2. register_data for registering data, as for PLUGIN_PASS_MANAGER_SETUP or PLUGIN_INFO, declared as

    void register_data (const char *plugin_name,
                        enum plugin_event event,
                        void *user_data);

What do you think?

BTW, the current gcc/doc/plugins.texi doesn't mention PLUGIN_INFO, unless I am mistaken. And the enum plugin_event there is not the same as in gcc-plugins.h.

Regards.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
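To make the proposed split concrete, here is a rough, self-contained sketch in C. It is not GCC's plugin API: the enum values are a small stand-in for the real enum plugin_event, the slot table and helper names are hypothetical, and both functions return int (0 on success, -1 on misuse) rather than void as in the proposal, purely so that misuse can be observed.

```c
#include <stddef.h>

/* Stand-in for the real enum; only three events are modelled here. */
enum plugin_event {
    PLUGIN_FINISH_UNIT,
    PLUGIN_PASS_MANAGER_SETUP,
    PLUGIN_INFO,
    PLUGIN_EVENT_COUNT
};

typedef void (*plugin_callback_func)(void *gcc_data, void *user_data);

/* Hypothetical storage: one registration per event. */
struct event_slot {
    const char *plugin_name;
    plugin_callback_func callback;   /* NULL for data-only events */
    void *user_data;
};

static struct event_slot slots[PLUGIN_EVENT_COUNT];

static int is_valid_event(enum plugin_event event)
{
    return (int)event >= 0 && (int)event < PLUGIN_EVENT_COUNT;
}

/* Events that carry data only and never invoke a routine. */
static int is_data_event(enum plugin_event event)
{
    return event == PLUGIN_PASS_MANAGER_SETUP || event == PLUGIN_INFO;
}

/* For true callbacks only; rejects data-only events and NULL routines. */
int register_callback(const char *plugin_name, enum plugin_event event,
                      plugin_callback_func callback, void *user_data)
{
    if (!is_valid_event(event) || is_data_event(event) || callback == NULL)
        return -1;
    slots[event].plugin_name = plugin_name;
    slots[event].callback = callback;
    slots[event].user_data = user_data;
    return 0;
}

/* For data-only events; no routine is involved at all. */
int register_data(const char *plugin_name, enum plugin_event event,
                  void *user_data)
{
    if (!is_valid_event(event) || !is_data_event(event))
        return -1;
    slots[event].plugin_name = plugin_name;
    slots[event].callback = NULL;
    slots[event].user_data = user_data;
    return 0;
}

/* Example routine a plugin might register for PLUGIN_FINISH_UNIT. */
void count_unit(void *gcc_data, void *user_data)
{
    (void)gcc_data;
    if (user_data != NULL)
        ++*(int *)user_data;
}
```

In this sketch the split lets each function reject the other kind of event up front, at the cost of two entry points instead of one.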
Re: Trouble building Graphite
On Mon, May 11, 2009 at 9:18 PM, Ian Lance Taylor wrote:
> I'm having some trouble building the Graphite support.
>
> Using ftp://gcc.gnu.org/pub/gcc/infrastructure/ppl-0.10.2.tar.gz :
>
> * Unlike gcc, does not support a --with-gmp option.
> + Does support a --with-libgmpxx-prefix option.
> * If GMP was not built with C++ support, fails at build time.
> * If GMP was not built with exception support, complains at configure
> time, and recommends using CPPFLAGS=-fexceptions when building GMP.
> + CPPFLAGS is for preprocessor flags, and -fexceptions is not a
> preprocessor flag. However, I admit that setting CFLAGS does not
> work correctly, as GMP seems to have special requirements for it.
> + I think they mean -funwind-tables anyhow.

I think they mean -fexceptions. At least, ppl's configure stops complaining only if GMP is built with that flag. And setting CFLAGS works for me.

Richard.

> Using ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.3.tar.gz :
>
> * The --with-ppl configure option does not work.
> + The configure script refers to ${ppl_prefix} without ever setting
> it.
>
> Could the Graphite maintainers please look into these issues? Thanks.
>
> Ian
Re: plugins callbacks and data
2009/5/12 Basile STARYNKEVITCH:
> Hello All
>
> In the current plugin API, the function register_callback is used to
> register callback routines (eg PLUGIN_FINISH_UNIT) in which case the
> callback is expected to be a routine. But this same function
> register_callback is used also to register some data to plugins, without any
> call back functions, eg for PLUGIN_PASS_MANAGER_SETUP.
>
> Perhaps we could have two different functions:
>
> 1. register_callback like before for true callbacks
>
> and
>
> 2. register_data for registering data, like for PLUGIN_PASS_MANAGER_SETUP
> or PLUGIN_INFO, declared as
> void register_data (const char *plugin_name,
> enum plugin_event event,
> void* user_data);
> ?
>
> What do you think?

No strong preference one way or the other. One small advantage of the current model is that we have only one switch statement over all "enum plugin_event" values.

The function already existed when I joined; maybe Le-Chu has some opinion about your proposal.

> Regards.

Cheers,
--
Rafael Avila de Espindola
Google | Gordon House | Barrow Street | Dublin 4 | Ireland
Registered in Dublin, Ireland | Registration Number: 368047
Re: Trunk unfrozen, cond-optab branch merged
Paolo Bonzini wrote: > Subject says it all, I guess. And so it does now. wwwdocs was also updated. Paolo
Intermediate representation
Hello.

First, I apologize for my English; I'm French and sometimes make mistakes.

I have to create a file containing some information by processing C++ source code for work. But why write a whole lexical analyzer when I can use GCC's intermediate representation tree? To do this, I'm going to introduce a function into GCC which analyzes the intermediate representation tree used by GCC and creates a file containing all of the information I need.

I have read all the information I could find about the tree, and now my problem is that I don't know how to introduce this function into GCC (I haven't found anything about it on the net). What is the best way to do it? Create a new "back-end"? I read about it and I don't think it could work. Can I put it into the source code instead? If so, where can I put it?

I work on egcs1.1, but I know that the tree is implemented in it, so it can work.

I'm sorry to bother you with this, but I searched for a long time on the net and didn't find any clue about how to do it.

Thank you very much.

Nicolas COLLIN
Re: [plugins] Name for pass_all_optimizations
On Thu, Apr 23, 2009 at 22:58, Justin Seyster wrote: > Unless that's not a good place to put plug-in passes, I propose > giving the pass_all_optimizations pass the name "all_optimizations." > I believe that there are a handful of other unnamed passes that might > also be useful for plug-in developers. Sounds like a good idea. You can send patches against mainline now, since all the plugin functionality has already been merged and we are still in stage 1. You will need a copyright assignment, if you don't already have one. Diego.
Re: Object file for Module is too large
Hi Alison,

This issue is not specific to Fortran, but it is specific to Darwin (you say that "the large object files have been observed on many other platforms", but could you give a list of such platforms?):

$ cat a.c
int x[999] = { 0 };
$ gcc -c a.c && ls -lh a.o
-rw-r--r-- 1 fx wheel 38M May 12 13:43 a.o
$ size a.o
__TEXT __DATA __OBJC others dec hex
0 39960 0 399626259fc

while on x86_64-linux, I get:

$ cat a.c
int x[999] = { 0 };
$ gcc -c a.c && ls -lh a.o
-rw-r--r-- 1 fx fx 959 May 12 13:44 a.o
$ size a.o
text data bss dec hex filename
0 0 3996399626259fc a.o

The difference between the two is that the array goes into .bss on x86_64-linux and into .data on darwin. I don't know enough about Mach-O to tell if it's a bug or a feature :)

FX
Re: Intermediate representation
Nicolas COLLIN wrote:
> Hello.

Hi again Nicolas,

> I have to create a file containing some informations by processing C++
> code source for work.
> But why create a whole lexical analyzer while I can use the intermediate
> representation tree of GCC ?
> To do it, I'm going to introduce a fonction in GCC which analyze the
> intermediate representation tree used by GCC and create a file
> containing all of the informations I need.
> I read all the informations I found about the tree and now my problem is
> that I don't know how to introduce this very function into GCC (I
> haven't found anything about it over the net).
> What is the best way to do it ? Create a new "back-end" ? I read about
> it and I don't think it could work. Can I put it into the source code
> instead ? If so, where can I put it ?
> I work on egcs1.1, but I know that the tree is implemented in, so it can
> work.

It is a shame you are stuck using such an old version of the compiler; in modern GCC we have just added a plugin feature which is ideal for your purpose. In such an old GCC, there is nothing like that.

A back-end is not the way to do this. The back-ends only get to see the parts of the semantic info that the mid-end presents to them to drive instruction selection. I think what you probably want to do is call your code from somewhere around the top of rest_of_compilation() in gcc/toplev.c, and it will get a chance to process the trees for all the functions and data items declared in the program.

Note that you'll have to cope with seeing each item one by one, on separate calls to your function. If that's a problem, you'll need to figure out a way to maintain state between the consecutive calls, which won't be difficult. But just in case you were expecting otherwise, you should know that there is no single time at which the compiler keeps the entire tree representation of all functions and declarations in memory.

(I don't know exactly when the -funit-at-a-time option was introduced into GCC, but I'm fairly sure it wasn't in EGCS.)

cheers,
DaveK
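The point about maintaining state between consecutive calls can be sketched roughly as follows. This is plain C with no GCC internals: note_declaration is a hypothetical hook that a function such as rest_of_compilation() would call once per item, and it takes a simple flag where a real hook would inspect a tree node.

```c
#include <stdio.h>

/* Hypothetical running totals, kept in file-scope storage so that they
   survive from one call to the next. */
struct dump_state {
    unsigned functions;
    unsigned variables;
};

static struct dump_state state;

/* Called once per declaration; a real hook would receive a tree node
   and decide for itself what kind of item it is looking at. */
void note_declaration(int is_function)
{
    if (is_function)
        state.functions++;
    else
        state.variables++;
}

/* Total number of items seen across all calls so far. */
unsigned declarations_seen(void)
{
    return state.functions + state.variables;
}

/* Write the accumulated totals once, for example at the end of the
   translation unit. Returns the fprintf result. */
int write_summary(FILE *out)
{
    return fprintf(out, "functions=%u variables=%u\n",
                   state.functions, state.variables);
}
```

Because the compiler never holds everything at once, anything the final report needs has to be copied into such accumulated state as each item goes by.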
New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
A few people asked me to do a new comparison of GCC releases and LLVM as the new GCC release and LLVM were out recently. You can find the comparison on http://vmakarov.fedorapeople.org/spec/ The comparison for x86 (32-bit mode) was done on Pentium4 and for x86_64 (64-bit mode) on Core I7. Some changes in the performance were big since GCC 3.2 and it is sometimes hard to see small changes on the posted graphs. Therefore I put original tables used to generate the graphs.
Re: Code generation problem with optimizations enabled
Thank you Paolo, I'll take a look at it. Is there a reason why the fcmp insn was dropped with such implementation? - Jamie - Original Message > From: Paolo Bonzini > To: Jamie Prescott > Cc: gcc@gcc.gnu.org > Sent: Tuesday, May 12, 2009 1:31:53 AM > Subject: Re: Code generation problem with optimizations enabled > > > What I noticed is that if I CC_STATUS_INIT (in xxx_notice_update_cc()) even > for insn that > > do not require it (that are almost all of them - being only cmp/fcmp/test > > that > modify cc0), > > cmpdf gets emitted regularly. > > If so, you should not be using cc0, but a CCmode register instead. > > See for example how the fr30 port implements compare-and-jump. I'm > currently committing a merge that changes a bit how the > compare-and-jumps are realized; if you wait a few hours, you'll get a > more up-to-date example. > > Paolo
Re: incomplete tree dump with flag -fdump-tree-all
Fengzhe Zhang writes: > I found that the tree dump (xxx.c.t00.tu file) with -fdump-tree-all > flag is incomplete in gcc-4.1.2. And the tree-dump.c is not modified > for this bug up to 4.4.0. > > When a function body contains a for-statement or if-statement, the > stmt-list will break, and the rest of the function body is lost. > > I did some search and found an early discussion thread about > -fdump-tree-original-raw: > http://gcc.gnu.org/ml/gcc-bugs/2008-07/msg00695.html > > I tried that patch and it didn't work for the "tu" dump. And I found > that the output with -fdump-tree-all-slim flag is the same as with > -fdump-tree-all flag. > > How can I find the rest of the statements from the broken statement-list? Thx. I think you will need to add code to gcc/cp/cxx-pretty-print.c and/or gcc/cp/dump.c to handle IF_STMT and FOR_STMT (and WHILE_STMT and DO_STMT too, I expect). I haven't looked into it in detail, though. Ian
Re: Trouble building Graphite
Ian Lance Taylor wrote:
> I'm having some trouble building the Graphite support.
>
> Using ftp://gcc.gnu.org/pub/gcc/infrastructure/ppl-0.10.2.tar.gz :
>
> * Unlike gcc, does not support a --with-gmp option.
> + Does support a --with-libgmpxx-prefix option.

What is the trouble with this? I mean, is it a matter of syntax (you prefer the option to be called --with-gmp) or semantics (the --with-libgmpxx-prefix does not do the right thing)?

> * If GMP was not built with C++ support, fails at build time.

Yes, the C++ interface of GMP is required. On the other hand, the core of PPL is also written in C++. In which sense is requiring the C++ interface of GMP a trouble?

> * If GMP was not built with exception support, complains at configure
> time, and recommends using CPPFLAGS=-fexceptions when building GMP.

Well, "complain" is not the right word. The PPL configuration script simply warns about the fact that the bounded memory capabilities of the PPL are not available. Which is not a problem for GCC, since these capabilities are not used by CLooG. The message was designed not to alarm people unnecessarily. It says: "This is OK, if you do not plan to use the bounded memory capabilities offered by the PPL." Do you think a different wording could help?

> + CPPFLAGS is for preprocessor flags, and -fexceptions is not a
> preprocessor flag. However, I admit that setting CFLAGS does not
> work correctly, as GMP seems to have special requirements for it.

In fact, our use of CPPFLAGS is motivated by the fact that using CFLAGS for that purpose was not working, once upon a time. See:

http://www.cs.unipr.it/pipermail/ppl-devel/2001-October/000639.html
http://www.cs.unipr.it/pipermail/ppl-devel/2001-October/000663.html

Perhaps it works now: we will check again and, in case it works, we will amend the configuration script, documentation and web site.

> + I think they mean -funwind-tables anyhow.

We do that because:

    -funwind-tables
    Similar to -fexceptions, except that it will just generate any
    needed static data, but will not affect the generated code in any
    other way. You will normally not enable this option; instead, a
    language processor that needs this handling would enable it on your
    behalf.

Please let us know if we are mistaken on this point.

Generally speaking, we are 100% willing to improve the PPL as much as possible: any suggestion is welcome in this respect. Please mail to ppl-de...@cs.unipr.it

All the best,

Roberto
--
Prof. Roberto Bagnara
Computer Science Group
Department of Mathematics, University of Parma, Italy
http://www.cs.unipr.it/~bagnara/
mailto:bagn...@cs.unipr.it
Re: Graphite build fails if PPL configured with --disable-shared
Janis Johnson wrote:
> On Mon, 2009-05-11 at 13:07 -0700, Ian Lance Taylor wrote:
>> Another Graphite build issue: it appears that I must not use
>> --disable-shared when I configure PPL. If I do use --disable-shared, I
>> get this:
>>
>> /home/iant/gnu/ppl-0.10.2-install/lib/libppl_c.a(ppl_c_implementation_common.o):
>> In function `finalize':
>> /home/iant/gnu/ppl-0.10.2/interfaces/C/../../src/ppl.hh:1842: undefined
>> reference to `operator delete(void*)'
>>
>> followed by thousands of similar errors. This is unfortunate, as it
>> means that I must manually set LD_LIBRARY_PATH to the directory where
>> the PPL library is installed. This also makes it harder for anybody
>> else to run the compiler that I build. This needs to be fixed.
>
> I get around this by setting LDFLAGS for the ppl configure:
>
> LDFLAGS="-static" \
> ./configure \
> --prefix=$PREFIX \
> --build=powerpc-linux \
> --with-gnu-ld \
> --with-libgmp-prefix=$PREFIX \
> --with-libgmpxx-prefix=$PREFIX \
> --disable-shared

I am not sure I understand: we trust that Libtool, which provides us with the --disable-shared option, will do the right thing. And it seems it does here: the static library is built and passes its checks.

Perhaps you want something different from what --disable-shared promises, that is, not to build any shared libraries?

> I copy libstdc++.a into the directory with the other GCC host
> libraries (gmp/mpfr/ppl/cloog/mpc).
>
> Building these libraries is indeed quite painful.

Any suggestion about how to improve the PPL is welcome. This, of course, applies also to the build machinery.

All the best,

Roberto
--
Prof. Roberto Bagnara
Computer Science Group
Department of Mathematics, University of Parma, Italy
http://www.cs.unipr.it/~bagnara/
mailto:bagn...@cs.unipr.it
Re: Code generation problem with optimizations enabled
- Original Message > From: Paolo Bonzini > To: Jamie Prescott > Cc: gcc@gcc.gnu.org > Sent: Tuesday, May 12, 2009 1:31:53 AM > Subject: Re: Code generation problem with optimizations enabled > > > What I noticed is that if I CC_STATUS_INIT (in xxx_notice_update_cc()) even > for insn that > > do not require it (that are almost all of them - being only cmp/fcmp/test > > that > modify cc0), > > cmpdf gets emitted regularly. > > If so, you should not be using cc0, but a CCmode register instead. > > See for example how the fr30 port implements compare-and-jump. I'm > currently committing a merge that changes a bit how the > compare-and-jumps are realized; if you wait a few hours, you'll get a > more up-to-date example. Thanks Paolo, that worked great and simplified things quite a bit. - Jamie
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
On May 12, 2009, at 6:56 AM, Vladimir Makarov wrote:
> A few people asked me to do a new comparison of GCC releases and LLVM
> as the new GCC release and LLVM were out recently.
>
> You can find the comparison on http://vmakarov.fedorapeople.org/spec/
>
> The comparison for x86 (32-bit mode) was done on Pentium4 and for
> x86_64 (64-bit mode) on Core I7.
>
> Some changes in the performance were big since GCC 3.2 and it is
> sometimes hard to see small changes on the posted graphs. Therefore I
> put original tables used to generate the graphs.

Looking at the llvm 2.5 vs gcc 4.4 comparison is very interesting, thank you for putting this together Vladimir! I find these numbers particularly interesting because you're comparing simple options like -O2 and -O3 instead of the crazy spec tuning mix :). This is much more likely to be representative of what real users will get on their apps.

Some random thoughts:

1. I have a hard time understanding the code size numbers. Does 10% mean that GCC is generating 10% bigger or 10% smaller code than llvm?

2. You change two variables in your configurations: micro architecture and pointer size. Would you be willing to run x86-32 Core i7 numbers as well? LLVM in particular is completely untuned for the (really old and quirky) "netburst" architecture, but I'm interested to see how it runs for you on more modern Core i7 or Core2 processors in 32-bit mode.

3. Your SPEC FP benchmarks tell me two things: GCC 4.4's fortran support is dramatically better than 4.2's (which llvm 2.5 uses), and your art/mgrid hacks apparently do great stuff :).

4. Your SPEC INT numbers are more interesting to me. It looks like you guys have some significant wins in 175.vpr, 197.crafty, and other benchmarks. At some point, I'll have to see what you guys are doing :)

Thanks for the info, great stuff!

-Chris
Re: Graphite build fails if PPL configured with --disable-shared
Roberto Bagnara writes: > Janis Johnson wrote: >> On Mon, 2009-05-11 at 13:07 -0700, Ian Lance Taylor wrote: >>> Another Graphite build issue: it appears that I must not use >>> --disable-shared when I configure PPL. If I do use --disable-shared, I >>> get this: >>> >>> /home/iant/gnu/ppl-0.10.2-install/lib/libppl_c.a(ppl_c_implementation_common.o): >>> In function `finalize': >>> /home/iant/gnu/ppl-0.10.2/interfaces/C/../../src/ppl.hh:1842: undefined >>> reference to `operator delete(void*)' >>> >>> followed by thousands of similar errors. This is unfortunate, as it >>> means that I must manually set LD_LIBRARY_PATH to the directory where >>> the PPL library is installed. This also makes it harder for anybody >>> else to run the compiler that I build. This needs to be fixed. >> >> I get around this by setting LDFLAGS for the ppl configure: >> >> LDFLAGS="-static" \ >> ./configure \ >> --prefix=$PREFIX \ >> --build=powerpc-linux \ >> --with-gnu-ld \ >> --with-libgmp-prefix=$PREFIX \ >> --with-libgmpxx-prefix=$PREFIX \ >> --disable-shared > > I am not sure I understand: we trust that Libtool, which provides us > with the --disable-shared option, will do the right thing. And it > seems it does here: the static library is built and passes its checks. > > Perhaps you want something different from what --disable-shared promises, > that is, not to build any shared libraries? > >> I copy libstdc++.a into the directory with the other GCC host >> libraries (gmp/mpfr/ppl/cloog/mpc). >> >> Building these libraries is indeed quite painful. > > Any suggestion about how to improve the PPL is welcome. This, of course, > applies also to the build machinery. I don't think this is a problem with PPL. The problem is that PPL uses libstdc++ and gcc does not. Thus, linking against PPL configured with --disable-shared requires also linking against libstdc++. That is the part which needs to be improved when using gcc with PPL. 
We have ways to do it, but they are not good ways, and they are not documented on the Graphite_build wiki page. Ian
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
Chris Lattner wrote:
> On May 12, 2009, at 6:56 AM, Vladimir Makarov wrote:
>> A few people asked me to do a new comparison of GCC releases and LLVM
>> as the new GCC release and LLVM were out recently.
>>
>> You can find the comparison on http://vmakarov.fedorapeople.org/spec/
>>
>> The comparison for x86 (32-bit mode) was done on Pentium4 and for
>> x86_64 (64-bit mode) on Core I7.
>>
>> Some changes in the performance were big since GCC 3.2 and it is
>> sometimes hard to see small changes on the posted graphs. Therefore I
>> put original tables used to generate the graphs.
>
> Looking at the llvm 2.5 vs gcc 4.4 comparison is very interesting,
> thank you for putting this together Vladimir! I find these numbers
> particularly interesting because you're comparing simple options like
> -O2 and -O3 instead of the crazy spec tuning mix :). This is much more
> likely to be representative of what real users will get on their apps.
>
> Some random thoughts:
>
> 1. I have a hard time understanding the code size numbers. Does 10%
> mean that GCC is generating 10% bigger or 10% smaller code than llvm?

The change is reported relative to LLVM. So 10% means that GCC generates 10% bigger code than LLVM, and -10% means that GCC generates 10% smaller code.

> 2. You change two variables in your configurations: micro architecture
> and pointer size. Would you be willing to run x86-32 Core i7 numbers
> as well? LLVM in particular is completely untuned for the (really old
> and quirky) "netburst" architecture, but I'm interested to see how it
> runs for you on more modern Core i7 or Core2 processors in 32-bit mode.

I used the same processor (P4) and options for x86 as for the GCC release comparison. I did not know that LLVM is badly tuned for P4, sorry. I could do the same comparison for x86 on Core i7 without specific tuning (there is no tuning for i7 yet), but it takes a lot of time. Maybe it will be ready next week.

> 3. Your SPEC FP benchmarks tell me two things: GCC 4.4's fortran
> support is dramatically better than 4.2's (which llvm 2.5 uses), and
> your art/mgrid hacks apparently do great stuff :).
>
> 4. Your SPEC INT numbers are more interesting to me. It looks like you
> guys have some significant wins in 175.vpr, 197.crafty, and other
> benchmarks. At some point, I'll have to see what you guys are doing :)
>
> Thanks for the info, great stuff!
>
> -Chris
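The sign convention explained in the reply above (change reported relative to LLVM) can be written down as a one-line formula. This is only an illustrative sketch: size_change_percent is a hypothetical name, and it is not taken from the scripts that produced the posted tables.

```c
/* Relative code size change of GCC with respect to LLVM, in percent.
   Positive: GCC's code is bigger. Negative: GCC's code is smaller.
   Both sizes must use the same unit, and llvm_size must be nonzero. */
double size_change_percent(double gcc_size, double llvm_size)
{
    return 100.0 * (gcc_size - llvm_size) / llvm_size;
}
```

With made-up sizes of 110 for GCC and 100 for LLVM this yields 10, read as "GCC generates 10% bigger code"; with 90 and 100 it yields -10.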
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
On Tue, 12 May 2009, Chris Lattner wrote: > 1. I have a hard time understanding the code size numbers. Does 10% mean that > GCC is generating 10% bigger or 10% smaller code than llvm? I have a different comment on the code size numbers: could we have comparisons of code size for -Os rather than (or in addition to) -O2 and -O3? If someone is particularly concerned with code size, -Os is what they are expected to use. -- Joseph S. Myers jos...@codesourcery.com
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
Joseph S. Myers wrote:
> On Tue, 12 May 2009, Chris Lattner wrote:
>> 1. I have a hard time understanding the code size numbers. Does 10%
>> mean that GCC is generating 10% bigger or 10% smaller code than llvm?
>
> I have a different comment on the code size numbers: could we have
> comparisons of code size for -Os rather than (or in addition to) -O2
> and -O3? If someone is particularly concerned with code size, -Os is
> what they are expected to use.

Thanks for pointing this out, Joseph. Yes, it would be interesting to see how GCC code size changes with -Os (and the performance too). But it is probably even more interesting for embedded processors. When I am less busy, I'll try to do it.
Re: Trouble building Graphite
Roberto Bagnara writes: > Ian Lance Taylor wrote: >> I'm having some trouble building the Graphite support. >> >> Using ftp://gcc.gnu.org/pub/gcc/infrastructure/ppl-0.10.2.tar.gz : >> >> * Unlike gcc, does not support a --with-gmp option. >> + Does support a --with-libgmpxx-prefix option. > > What is the trouble with this? I mean, is it a matter of syntax > (you prefer the option to be called --with-gmp) or semantics > (the --with-libgmpxx-prefix does not do the right thing)? Let me start by saying that my message was aimed at the gcc developers who have brought PPL and CLooG into the gcc build. My message was not aimed at the PPL developers. When MPFR and GMP were brought into the build, Kaveh spent quite a bit of time getting everything working smoothly. I think that the graphite developers need to spend a similar amount of time getting the PPL and CLooG builds working smoothly. --with-gmp vs. --with-libgmpxx-prefix is a matter of syntax. Since all gcc developers have to build these packages, it's inconvenient to have to remember different configure options for different packages. >> * If GMP was not built with C++ support, fails at build time. > > Yes, the C++ interface of GMP is required. On the other hand, > also the core of PPL is written in C++. In whhich sense requiring > the C++ interface of GMP is a trouble? This is not a problem with PPL. It's a problem with the existing build instructions for gcc developers. >> * If GMP was not built with exception support, complains at configure >> time, and recommends using CPPFLAGS=-fexceptions when building GMP. > > Well, "complain" is not the right word. The PPL configuration script > simply warns about the fact that the bounded memory capabilities of > the PPL are not available. Which is not a problem for GCC, since these > capabilities are not used by CLooG. The message was designed not > to alarm people unnecessarily. 
It says: "This is OK, if you do not > plan to use the bounded memory capabilities offered by the PPL." > Do you think a different wording could help? Since I don't actually know anything about PPL, the message didn't mean anything to me. I didn't know whether GCC used those features or not. So this is a problem with the existing build instructions: they need to document this message and state clearly that it may be ignored for purposes of using PPL with GCC. >> + CPPFLAGS is for preprocessor flags, and -fexceptions is not a >> preprocessor flag. However, I admit that setting CFLAGS does not >> work correctly, as GMP seems to have special requirements for it. > > In facto, our use of CPPFLAGS is motivated by the fact that using CFLAGS > for that purpose was not working, once upon a time. See: > > http://www.cs.unipr.it/pipermail/ppl-devel/2001-October/000639.html > http://www.cs.unipr.it/pipermail/ppl-devel/2001-October/000663.html > > Perhaps it works now: we will check again and, in case it works, > we will amend the configuration script, documentation and web site. It will still fail as these messages describe. When the user sets CFLAGS, it overrides the default CFLAGS setting. The best way to make this work may be to work with the GMP developers. Again, this is not a responsibility of the PPL developers, and in fact I have no idea whether this matters for the ways in which GCC uses PPL. >> + I think they mean -funwind-tables anyhow. > > We do that because: > >-funwind-tables >Similar to -fexceptions, except that it will just generate any >needed static data, but will not affect the generated code in any >other way. You will normally not enable this option; instead, a >language processor that needs this handling would enable it on your As far as I know, enabling -funwind-tables for C code is sufficient to throw exceptions from C++ code to C++ code across that C code. The documentation is somewhat misleading. 
You never need to specify this option when your program is written entirely in one language. Things are different in multi-language programs. Ian
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
On May 12, 2009, at 11:05 AM, Vladimir Makarov wrote:
> Chris Lattner wrote:
>> On May 12, 2009, at 6:56 AM, Vladimir Makarov wrote:
>>> A few people asked me to do a new comparison of GCC releases and LLVM as the new GCC release and LLVM were out recently.
>>>
>>> You can find the comparison on http://vmakarov.fedorapeople.org/spec/
>>>
>>> The comparison for x86 (32-bit mode) was done on Pentium4 and for x86_64 (64-bit mode) on Core i7.
>>>
>>> Some changes in the performance were big since GCC 3.2 and it is sometimes hard to see small changes on the posted graphs. Therefore I put original tables used to generate the graphs.
>>
>> Looking at the llvm 2.5 vs gcc 4.4 comparison is very interesting, thank you for putting this together Vladimir! I find these numbers particularly interesting because you're comparing simple options like -O2 and -O3 instead of the crazy spec tuning mix :). This is much more likely to be representative of what real users will get on their apps.
>>
>> Some random thoughts:
>>
>> 1. I have a hard time understanding the code size numbers. Does 10% mean that GCC is generating 10% bigger or 10% smaller code than llvm?
>
> The change is reported relative to LLVM. So 10% means that GCC generates 10% bigger code than LLVM and -10% means that GCC generates 10% smaller code.

Ok! It is interesting that GCC seems to generate consistently larger code at both -O2 and -O3 in x86-64 mode (over 20% larger in -O3). Perhaps that is impacting the compile time numbers as well.

>> 2. You change two variables in your configurations: micro architecture and pointer size. Would you be willing to run x86-32 Core i7 numbers as well? LLVM in particular is completely untuned for the (really old and quirky) "netburst" architecture, but I'm interested to see how it runs for you on more modern Core i7 or Core2 processors in 32-bit mode.
>
> I used the same processor (P4) and options for x86 as for the GCC release comparison. I did not know that LLVM is badly tuned for P4, sorry.
>
> I could do the same comparison for x86 on Core i7 without specific tuning (there is no tuning for i7 yet) but it takes a lot of time. Maybe it will be ready next week.

No problem at all, I appreciate you running the numbers!

It would also be very interesting to include LLVM's LTO support, which gives a pretty dramatic win on SPEC. However, I don't know how difficult it is to use on linux (on the mac, you just pass -O4 at compile time, and everything works). I've heard that Gold has a new plugin to make LTO transparent on linux as well, but I have no experience with it, and it is probably more trouble than you want to take. Does gcc 4.4 include the LTO branch yet?

-Chris

>> 3. Your SPEC FP benchmarks tell me two things: GCC 4.4's fortran support is dramatically better than 4.2's (which llvm 2.5 uses), and your art/mgrid hacks apparently do great stuff :).
>>
>> 4. Your SPEC INT numbers are more interesting to me. It looks like you guys have some significant wins in 175.vpr, 197.crafty, and other benchmarks. At some point, I'll have to see what you guys are doing :)
>>
>> Thanks for the info, great stuff!
>>
>> -Chris
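Editorial note: the sign convention Vladimir describes for the code size numbers (change reported relative to LLVM) can be sketched as a small calculation. The helper name size_change_percent and the byte counts below are hypothetical, chosen only to show the signs; they are not taken from the posted tables.

```shell
#!/bin/sh
# Relative code size change of GCC versus LLVM, in percent.
# Positive: GCC code is bigger; negative: GCC code is smaller.
# Integer arithmetic only, so the result is truncated toward zero.
size_change_percent() {
    gcc_size=$1
    llvm_size=$2
    echo $(( (gcc_size - llvm_size) * 100 / llvm_size ))
}

# Hypothetical sizes in bytes, chosen only to show the sign convention.
size_change_percent 1100 1000    # GCC code 10% bigger than LLVM
size_change_percent 900 1000     # GCC code 10% smaller than LLVM
```

With this reading, the "over 20% larger in -O3" remark corresponds to a value above 20 for the x86-64 -O3 column.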
Re: Graphite build fails if PPL configured with --disable-shared
On Tue, 2009-05-12 at 18:46 +0200, Roberto Bagnara wrote:
> Any suggestion about how to improve the PPL is welcome. This, of course,
> applies also to the build machinery.

Hi Roberto,

I added some instructions on how to build to the GCC wiki (end of page):

http://gcc.gnu.org/wiki/Graphite_Build

They worked with ppl-0.10 and cloog-ppl-0.15; however, they now fail with ppl-0.10.2 and cloog-ppl-0.15.3, in the cloog-ppl-0.15.3 configure:

...
checking for ppl_c.h... no
configure: error: Can't find PPL headers.

Looking at config.log:

configure:20698: gcc -c -g -O2 -I/include -I/n/17/guerby/install-ppl2/gmp-4.2.4/include conftest.c >&5

configure is not adding the -I for ppl, hence the failure. I checked and the wanted ppl_c.h was correctly installed, so I don't think ppl-0.10.2 is the issue.

Looking more at cloog-ppl/configure I find stuff like:

<<
# Check whether --with-ppl or --without-ppl was given.
if test "${with_ppl+set}" = set; then
  withval="$with_ppl"

fi;

# Check whether --with-polylib_prefix or --without-polylib_prefix was given.
if test "${with_polylib_prefix+set}" = set; then
  withval="$with_polylib_prefix"

fi;

# Check whether --with-polylib_exec_prefix or --without-polylib_exec_prefix was given.
if test "${with_polylib_exec_prefix+set}" = set; then
  withval="$with_polylib_exec_prefix"

fi;

# Check whether --with-polylib_builddir or --without-polylib_builddir was given.
if test "${with_polylib_builddir+set}" = set; then
  withval="$with_polylib_builddir"

fi;
>>

Which is obviously broken, since all the tests are setting the same variable $withval and so --with-ppl just doesn't work. I looked at the cloog-ppl-0.15 configure and it was ok.

I don't know how to fix configury stuff, but maybe someone can help here.

Also, it would be nice if the top-level directory of cloog-ppl-0.15.3.tar.gz was named with the version, "cloog-ppl-0.15.3", instead of the current version-less "cloog-ppl".

Thanks for your help,

Laurent
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
Vladimir Makarov wrote:
> Chris Lattner wrote:
>> 2. You change two variables in your configurations: micro architecture and pointer size. Would you be willing to run x86-32 Core i7 numbers as well? LLVM in particular is completely untuned for the (really old and quirky) "netburst" architecture, but I'm interested to see how it runs for you on more modern Core i7 or Core2 processors in 32-bit mode.
>
> I used the same processor (P4) and options for x86 as for the GCC release comparison.

I was wrong here: the GCC-LLVM comparison does not use -mtune=pentium4 as the GCC release comparison did. So the default x86 tunings of the compilers were used for the 32-bit comparison. Still, the results might look different on Core i7.

Sorry, I forgot to mention that I used an additional option, -mpc64, for 32-bit GCC4.4. It is not possible to generate the expected SPECFP2000 results with GCC4.4 without this option. LLVM does not support this option, and it can significantly improve performance. So the 32-bit comparison of SPECFP2000 should be taken with a grain of salt.

I've just corrected the page http://vmakarov.fedorapeople.org/spec/llvmgcc32.html by adding these comments.
Re: Graphite build fails if PPL configured with --disable-shared
Laurent GUERBY writes:

> Looking more at cloog-ppl/configure I find stuff like:
>
> <<
> # Check whether --with-ppl or --without-ppl was given.
> if test "${with_ppl+set}" = set; then
>   withval="$with_ppl"
>
> fi;
>
> # Check whether --with-polylib_prefix or --without-polylib_prefix was given.
> if test "${with_polylib_prefix+set}" = set; then
>   withval="$with_polylib_prefix"
>
> fi;
>
> # Check whether --with-polylib_exec_prefix or --without-polylib_exec_prefix
> was given.
> if test "${with_polylib_exec_prefix+set}" = set; then
>   withval="$with_polylib_exec_prefix"
>
> fi;
>
> # Check whether --with-polylib_builddir or --without-polylib_builddir was
> given.
> if test "${with_polylib_builddir+set}" = set; then
>   withval="$with_polylib_builddir"
>
> fi;
> >>
>
> Which is obviously broken since all the tests are setting the same
> variable $withval and so --with-ppl just doesn't work. I looked at
> cloog-ppl-0.15 configure and it was ok.

The variable withval is only for use in the third argument (ACTION-IF-GIVEN) of AC_ARG_WITH. In all other places the variable with_PACKAGE should be used.

Andreas.

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
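Editorial note: the distinction Andreas draws can be pictured with a minimal shell sketch of what the generated checks do. It is not taken from the cloog-ppl sources: the option values and the check_with helper are hypothetical, and only the withval / with_PACKAGE naming follows the Autoconf convention he describes.

```shell
#!/bin/sh
# Hypothetical values, as if the user had passed
#   --with-ppl=/opt/ppl --with-polylib_prefix=/opt/polylib
with_ppl=/opt/ppl
with_polylib_prefix=/opt/polylib

# Mimics the generated check for one option: withval is a scratch
# variable that is overwritten by every later check.
check_with() {
    # $1 is the name of a with_PACKAGE variable
    eval "is_set=\${$1+set}"
    if test "$is_set" = set; then
        eval "withval=\$$1"
    fi
}

check_with with_ppl
check_with with_polylib_prefix

# withval now holds the value of the last option that was checked ...
echo "withval  = $withval"
# ... while with_ppl still holds the value given for --with-ppl.
echo "with_ppl = $with_ppl"
```

Under these assumptions, code that runs after all the checks can rely on $with_ppl but not on $withval, which is why the repeated assignments to withval are harmless on their own.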
Re: Graphite build fails if PPL configured with --disable-shared
On Tue, 2009-05-12 at 21:31 +0200, Andreas Schwab wrote:
> Laurent GUERBY writes:
>
> > Looking more at cloog-ppl/configure I find stuff like:
> >
> > <<
> > # Check whether --with-ppl or --without-ppl was given.
> > if test "${with_ppl+set}" = set; then
> >   withval="$with_ppl"
> >
> > fi;
> >
> > # Check whether --with-polylib_prefix or --without-polylib_prefix was given.
> > if test "${with_polylib_prefix+set}" = set; then
> >   withval="$with_polylib_prefix"
> >
> > fi;
> >
> > # Check whether --with-polylib_exec_prefix or --without-polylib_exec_prefix
> > was given.
> > if test "${with_polylib_exec_prefix+set}" = set; then
> >   withval="$with_polylib_exec_prefix"
> >
> > fi;
> >
> > # Check whether --with-polylib_builddir or --without-polylib_builddir was
> > given.
> > if test "${with_polylib_builddir+set}" = set; then
> >   withval="$with_polylib_builddir"
> >
> > fi;
> > >>
> >
> > Which is obviously broken since all the tests are setting the same
> > variable $withval and so --with-ppl just doesn't work. I looked at
> > cloog-ppl-0.15 configure and it was ok.
>
> The variable withval is only for use in the third argument
> (ACTION-IF-GIVEN) of AC_ARG_WITH. In all other places the variable
> with_PACKAGE should be used.

When I search for with_ppl in configure I get, in order:

...
# Check whether --with-ppl or --without-ppl was given.
if test "${with_ppl+set}" = set; then
  withval="$with_ppl"

fi;
...
echo "$as_me:$LINENO: checking for Parma Polyhedral Library (PPL)" >&5
echo $ECHO_N "checking for Parma Polyhedral Library (PPL)... $ECHO_C" >&6
if test "x$with_ppl" != "x" -a "x$with_ppl" != "xno"; then
  if test "x$with_polylib_prefix" != "x" -o "x$with_polylib_exec_prefix" != "x" -o "x$with_polylib_builddir" != "x"; then
    { { echo "$as_me:$LINENO: error: --with-polylib and --with-ppl are mutually exclusive" >&5
echo "$as_me: error: --with-polylib and --with-ppl are mutually exclusive" >&2;}
   { (exit 1); exit 1; }; }
  fi
  if test "x$with_ppl" != "xyes" ; then
...

So the 0.15.3 configure does not set the $with_ppl variable at all.

Laurent
Re: Graphite build fails if PPL configured with --disable-shared
Laurent GUERBY writes: > So 0.15.3 configure does not set $with_ppl variable at all. Sure it does. Look at the argument parsing loop. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
Re: Graphite build fails if PPL configured with --disable-shared
On Tue, 2009-05-12 at 21:49 +0200, Andreas Schwab wrote:
> Laurent GUERBY writes:
>
> > So 0.15.3 configure does not set $with_ppl variable at all.
>
> Sure it does. Look at the argument parsing loop.

I added a dump and $with_ppl is indeed set correctly, but $ppl_prefix (which is used for -I, if I follow correctly) is empty:

echo "$as_me:$LINENO: checking for Parma Polyhedral Library (PPL)" >&5
echo $ECHO_N "checking for Parma Polyhedral Library (PPL)... $ECHO_C" >&6
echo WITH_PPL "$with_ppl" > zconf
echo PPL_PREFIX "$ppl_prefix" >> zconf
if test "x$with_ppl" != "x" -a "x$with_ppl" != "xno"; then
  if test "x$with_polylib_prefix" != "x" -o "x$with_polylib_exec_prefix" != "x" -o "x$with_polylib_builddir" != "x"; then
    { { echo "$as_me:$LINENO: error: --with-polylib and --with-ppl are mutually exclusive" >&5
echo "$as_me: error: --with-polylib and --with-ppl are mutually exclusive" >&2;}
   { (exit 1); exit 1; }; }
  fi
  if test "x$with_ppl" != "xyes" ; then
    echo "$as_me:$LINENO: result: installed in $ppl_prefix" >&5
echo "${ECHO_T}installed in $ppl_prefix" >&6
    CPPFLAGS="-I$ppl_prefix/include $CPPFLAGS"
...

$ cat zconf
WITH_PPL /n/17/guerby/install-ppl2/ppl-0.10.2
PPL_PREFIX
$

Do you know what part of configure is supposed to set $ppl_prefix?

In the 0.15 configure the code was using $with_ppl directly:

<<
echo "$as_me:$LINENO: checking for Parma Polyhedral Library (PPL)" >&5
echo $ECHO_N "checking for Parma Polyhedral Library (PPL)... $ECHO_C" >&6; }
if test "x$with_ppl" != "x"; then
  if test "x$with_polylib_prefix" != "x" -o "x$with_polylib_exec_prefix" != "x" -o "x$with_polylib_builddir" != "x"; then
    { { echo "$as_me:$LINENO: error: --with-polylib and --with-ppl are mutually exclusive" >&5
echo "$as_me: error: --with-polylib and --with-ppl are mutually exclusive" >&2;}
   { (exit 1); exit 1; }; }
  fi
  { echo "$as_me:$LINENO: result: installed in $with_ppl" >&5
echo "${ECHO_T}installed in $with_ppl" >&6; }
  POLYHEDRAL_BACKEND=ppl
  CPPFLAGS="-I$with_ppl/include -DCLOOG_PPL_BACKEND $CPPFLAGS"
>>

Or maybe just replace $ppl_prefix with $with_ppl in the 0.15.3 configure.in and configure?

Thanks for your help,

Laurent
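Editorial note: the replacement Laurent suggests can be sketched in shell as follows. This is only an illustration of the idea of deriving the include flag from the --with-ppl value, as the 0.15 configure did; the ppl_include_flag helper is hypothetical and this is not the actual cloog-ppl fix.

```shell
#!/bin/sh
# Hypothetical helper, not part of the cloog-ppl configure script:
# prints the include flag derived from the value of --with-ppl.
ppl_include_flag() {
    case "$1" in
        ""|no|yes)
            # No explicit prefix was given: add no -I flag and rely
            # on the compiler's default search path.
            echo ""
            ;;
        *)
            # Use the option value itself as the prefix.
            echo "-I$1/include"
            ;;
    esac
}

# Path taken from the zconf dump above.
ppl_include_flag /n/17/guerby/install-ppl2/ppl-0.10.2
```

With an empty prefix variable, the 0.15.3 code instead expands "-I$ppl_prefix/include" to "-I/include", which matches the flag seen in the config.log line quoted earlier in the thread.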
Re: Graphite build fails if PPL configured with --disable-shared
On Tue, 2009-05-12 at 18:46 +0200, Roberto Bagnara wrote:
> Janis Johnson wrote:
> > On Mon, 2009-05-11 at 13:07 -0700, Ian Lance Taylor wrote:
> >> Another Graphite build issue: it appears that I must not use
> >> --disable-shared when I configure PPL. If I do use --disable-shared, I
> >> get this:
> >>
> >> /home/iant/gnu/ppl-0.10.2-install/lib/libppl_c.a(ppl_c_implementation_common.o):
> >> In function `finalize':
> >> /home/iant/gnu/ppl-0.10.2/interfaces/C/../../src/ppl.hh:1842: undefined
> >> reference to `operator delete(void*)'
> >>
> >> followed by thousands of similar errors. This is unfortunate, as it
> >> means that I must manually set LD_LIBRARY_PATH to the directory where
> >> the PPL library is installed. This also makes it harder for anybody
> >> else to run the compiler that I build. This needs to be fixed.
> >
> > I get around this by setting LDFLAGS for the ppl configure:

I was wrong, I use these flags for other reasons.

> > LDFLAGS="-static" \
> > ./configure \
> >   --prefix=$PREFIX \
> >   --build=powerpc-linux \
> >   --with-gnu-ld \
> >   --with-libgmp-prefix=$PREFIX \
> >   --with-libgmpxx-prefix=$PREFIX \
> >   --disable-shared
>
> I am not sure I understand: we trust that Libtool, which provides us
> with the --disable-shared option, will do the right thing. And it
> seems it does here: the static library is built and passes its checks.

The --disable-shared option worked as expected. What I had problems with was finding my static versions of libgmp and libgmpxx; configure kept finding the default shared versions, which were too old, until I added LDFLAGS="-static" before ./configure, and passed LDFLAGS="-all-static" to make and make check. I had assumed that by specifying their locations, configure and make would be able to use those particular libraries.

> Perhaps you want something different from what --disable-shared promises,
> that is, not to build any shared libraries?
>
> > I copy libstdc++.a into the directory with the other GCC host
> > libraries (gmp/mpfr/ppl/cloog/mpc).
> >
> > Building these libraries is indeed quite painful.
>
> Any suggestion about how to improve the PPL is welcome. This, of course,
> applies also to the build machinery.
> All the best,

One small change would be to use --with-gmp as a configure option. It's not clear whether it's necessary to use both --with-libgmp-prefix and --with-libgmpxx-prefix. Other packages that GCC needs (MPFR, CLoog, MPC) use --with-gmp for the GMP package.

Janis
Re: [plugins] Name for pass_all_optimizations
Great! I actually got around to submitting a patch before the weekend, but Andrew Pinski noted that naming these passes results in some unwanted dump files. I plan to have a patch ready soon to fix that up. --Justin On Tue, May 12, 2009 at 8:02 AM, Diego Novillo wrote: > On Thu, Apr 23, 2009 at 22:58, Justin Seyster wrote: > >> Unless that's not a good place to put plug-in passes, I propose >> giving the pass_all_optimizations pass the name "all_optimizations." >> I believe that there are a handful of other unnamed passes that might >> also be useful for plug-in developers. > > Sounds like a good idea. You can send patches against mainline now, > since all the plugin functionality has already been merged and we are > still in stage 1. You will need a copyright assignment, if you don't > already have one. > > > Diego. >
Re: Trouble building Graphite
On Mon, May 11, 2009 at 14:18, Ian Lance Taylor wrote:
> I'm having some trouble building the Graphite support.
>
> Using ftp://gcc.gnu.org/pub/gcc/infrastructure/ppl-0.10.2.tar.gz :
>
> * Unlike gcc, does not support a --with-gmp option.
>   + Does support a --with-libgmpxx-prefix option.
> * If GMP was not built with C++ support, fails at build time.
> * If GMP was not built with exception support, complains at configure
>   time, and recommends using CPPFLAGS=-fexceptions when building GMP.
>   + CPPFLAGS is for preprocessor flags, and -fexceptions is not a
>     preprocessor flag. However, I admit that setting CFLAGS does not
>     work correctly, as GMP seems to have special requirements for it.
>   + I think they mean -funwind-tables anyhow.
>
> Using ftp://gcc.gnu.org/pub/gcc/infrastructure/cloog-ppl-0.15.3.tar.gz :
>
> * The --with-ppl configure option does not work.
>   + The configure script refers to ${ppl_prefix} without ever setting
>     it.
>
> Could the Graphite maintainers please look into these issues? Thanks.

I will prepare patches for the graphite build instructions and will look at the cloog-ppl configure bug.

Sebastian
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
> It would also be very interesting to include LLVM's LTO support, which gives > a pretty dramatic win on SPEC. However, I don't know how difficult it is to > use on linux (on the mac, you just pass -O4 at compile time, and everything > works). I've heard that Gold has a new plugin to make LTO transparent on > linux as well, but I have no experience with it, and it is probably more > trouble than you want to take. Does gcc 4.4 include the LTO branch yet? For spec all that you (should) need is to link with a gold with plugins enabled and pass -use-gold-plugin to llvm-gcc. For software that uses static libraries you will also need the bfd plugin support (currently in code review). I am going on vacation tomorrow, but might read my mail from time to time. Ping me if you need help. The current trunk includes some patches from LTO, but not the streamer. > -Chris Cheers, -- Rafael Avila de Espindola Google | Gordon House | Barrow Street | Dublin 4 | Ireland Registered in Dublin, Ireland | Registration Number: 368047
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
On Tue, May 12, 2009 at 7:45 PM, Chris Lattner wrote: > 2. You change two variables in your configurations: micro architecture and > pointer size. Would you be willing to run x86-32 Core i7 numbers as well? > LLVM in particular is completely untuned for the (really old and quirky) > "netburst" architecture, but I'm interested to see how it runs for you on > more modern Core i7 or Core2 processors in 32-bit mode. FWIW, GCC is also completely untuned for NetBurst. There isn't even a scheduler description for the P4, and there also isn't anything for the funny branch predictor. > 3. Your SPEC FP benchmarks tell me two things: GCC 4.4's fortran support is > dramatically better than 4.2's (which llvm 2.5 uses), and your art/mgrid > hacks apparently do great stuff :). Something like the "art hack" is in ipa-struct-reorg, but it is not enabled at any level. If gcc outperforms llvm on art by much, it's more likely that some important opportunities for art are being overlooked by llvm. There also isn't anything special done for mgrid, except predictive commoning (CSE around loops) which is not a hack, in the sense it is helpful for a lot of numerical code and triggers several times in things like generic Fortran blas/lapack routines. Hope this helps, Ciao! Steven
gcc-4.4-20090512 is now available
Snapshot gcc-4.4-20090512 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20090512/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch revision 147449

You'll find:

gcc-4.4-20090512.tar.bz2              Complete GCC (includes all of below)
gcc-core-4.4-20090512.tar.bz2         C front end and core compiler
gcc-ada-4.4-20090512.tar.bz2          Ada front end and runtime
gcc-fortran-4.4-20090512.tar.bz2      Fortran front end and runtime
gcc-g++-4.4-20090512.tar.bz2          C++ front end and runtime
gcc-java-4.4-20090512.tar.bz2         Java front end and runtime
gcc-objc-4.4-20090512.tar.bz2         Objective-C front end and runtime
gcc-testsuite-4.4-20090512.tar.bz2    The GCC testsuite

Diffs from 4.4-20090505 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: New GCC releases comparison and comparison of GCC4.4 and LLVM2.5 on SPEC2000
>> 3. Your SPEC FP benchmarks tell me two things: GCC 4.4's fortran support is
>> dramatically better than 4.2's (which llvm 2.5 uses), and your art/mgrid
>> hacks apparently do great stuff :).
>
> Something like the "art hack" is in ipa-struct-reorg, but it is not
> enabled at any level. If gcc outperforms llvm on art by much, it's
> more likely that some important opportunities for art are being
> overlooked by llvm.
>
> There also isn't anything special done for mgrid, except predictive
> commoning (CSE around loops) which is not a hack, in the sense it is
> helpful for a lot of numerical code and triggers several times in
> things like generic Fortran blas/lapack routines.

Indeed, we have a couple of benchmark-inspired optimizations for SPEC2006 (division/modulo power-of-two, see PR26026; and ifcombine), and we optimize MATMUL (TRANSPOSE (A), B), which helps galgel a lot. But both of these may trigger quite a lot on other code, and LLVM also benefits from the galgel one :-) because it's done in the front-end.

Paolo
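Editorial note: as a rough illustration of the kind of identity a division/modulo power-of-two optimization can exploit, the sketch below checks that, for non-negative x, x % 2^k equals x & (2^k - 1) and x / 2^k equals x >> k. This is an assumption about the general technique, not a description of the transformation discussed in PR26026, and the sample values are arbitrary.

```shell
#!/bin/sh
# Checks x % 2^k == x & (2^k - 1) and x / 2^k == x >> k
# for a few non-negative values, using POSIX shell arithmetic.
k=3
pow=$((1 << k))
ok=yes
for x in 0 1 7 8 9 1000 65535; do
    if [ $((x % pow)) -ne $((x & (pow - 1))) ] ||
       [ $((x / pow)) -ne $((x >> k)) ]; then
        ok=no
    fi
done
echo "identities hold: $ok"
```

The identities are only checked here for non-negative operands; signed division and modulo round toward zero, so negative values need extra care.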