------- Comment #3 from ubizjak at gmail dot com  2007-06-26 19:43 -------
(In reply to comment #0)
> gfortran seemingly generates a significantly inferior internal TREE
> representation than g95 as for Polyhedron's induct.f90 gfortran is 18% slower
> than g95, which is based on GCC 4.0.3. (Compared with other compilers the
> difference is even larger.)
> If one looks at -ftree-vectorizer-verbose, GCC 4.3 is able to vectorize 3 loops
> with gfortran whereas GCC 4.0 vectorizes 0 loops with g95.

The problem is in -ftree-vectorize:

gfortran -march=core2 -ffast-math -funroll-loops -ftree-loop-linear -ftree-vectorize -msse3 -O3 pr32084.f90
time ./a.out

real    0m2.941s
user    0m2.940s
sys     0m0.004s

gfortran -march=core2 -ffast-math -funroll-loops -ftree-loop-linear -msse3 -O3 pr32084.f90
time ./a.out

real    0m1.574s
user    0m1.572s
sys     0m0.004s

The testcase takes about 47% less time (roughly 1.87x faster) without
-ftree-vectorize.

gcc -v
Target: x86_64-unknown-linux-gnu
...
gcc version 4.3.0 20070622 (experimental)

vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU X6800 @ 2.93GHz
stepping        : 5
cpu MHz         : 2933.435
cache size      : 4096 KB

This is marked as a "tree-optimization" bug because we have no "vectorizer"
component to choose from.

-- 

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ubizjak at gmail dot com
             Status|UNCONFIRMED                 |NEW
          Component|fortran                     |tree-optimization
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-06-26 19:43:36
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32084