Re: Research Region Based Memory Management for Imperative Languages

2010-08-27 Thread Uday P. Khedker

We have had a long-term plan (which has not borne fruit until now) of
implementing a static analysis for improving garbage collection. Our
paper in TOPLAS (http://portal.acm.org/citation.cfm?id=1290521) describes
our early work. The main bottleneck for our purpose is the lack of a good
pointer analysis, and we have shifted our focus to that. Nothing shareable
yet on that front.

It may still be worthwhile implementing it using the existing pointer
analysis but I don't have the bandwidth for it. If someone wants to explore
improving dynamic allocation, this may be a good beginning. I will be
very happy to provide information.

Uday Khedker.

Matt Davis wrote, On Friday 27 August 2010 06:39 AM:

Hello,
I am just trying to settle on a topic for my PhD dissertation in Computer
Science.  I want something low-level, compiler-related, and above all
useful/practical.  I am considering region-based memory management, to show
memory efficiency and safety.  For imperative languages such as C, this is
rather difficult from static analysis alone (e.g. because of aliasing and
weak typing).  However, I do believe region-based management is possible.  If
I were to take something of this nature on for my topic, would it be valuable
research, and is it even worth the effort?  I am by no means a compiler guru,
and figured you all might know best.

The other option would be to implement such concepts in a research language,
which can still be interesting, but I'm not sure how practical.

-Matt


Gengtype : strange code in output_type_enum

2010-08-27 Thread jeremie . salvucci
Hello all,

While hacking on gengtype with Basile, we noticed a strange piece of code at 
line 2539 in gcc/gengtype.c r162692

static void
output_type_enum (outf_p of, type_p s)
{
  if (s->kind == TYPE_PARAM_STRUCT && s->u.s.line.file != NULL) /* Strange code @@*/
    {
      oprintf (of, ", gt_e_");
      output_mangled_typename (of, s);
    }
  else if (UNION_OR_STRUCT_P (s) && s->u.s.line.file != NULL)
    {
      oprintf (of, ", gt_ggc_e_");
      output_mangled_typename (of, s);
    }
  else
    oprintf (of, ", gt_types_enum_last");
}

We think that the enum type_kind discriminates the members of the union u in
struct type. So for TYPE_PARAM_STRUCT we believe that the param_struct member
of union u is the one in use. If this is true, the test
s->u.s.line.file != NULL is meaningless when s->kind == TYPE_PARAM_STRUCT; in
our opinion it should be s->u.param_struct.line.file != NULL instead.
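
To make the concern concrete, here is a minimal sketch of the tagged-union pattern being discussed. The names below (example_type, example_fileloc, KIND_STRUCT, KIND_PARAM_STRUCT, example_type_loc) are simplified stand-ins invented for illustration and are not gengtype's actual definitions. The only point is that the kind tag decides which union member may be read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT gengtype's real types.  */
struct example_fileloc { const char *file; int line; };

enum example_kind { KIND_STRUCT, KIND_PARAM_STRUCT };

struct example_type {
  enum example_kind kind;   /* discriminator for the union below */
  union {
    struct { const char *tag; struct example_fileloc line; } s;
    struct { struct example_type *stru; struct example_fileloc line; } param_struct;
  } u;
};

/* Return the source location through the union member selected by
   the kind tag, never through a member belonging to another kind.  */
static const struct example_fileloc *
example_type_loc (const struct example_type *t)
{
  switch (t->kind)
    {
    case KIND_STRUCT:
      return &t->u.s.line;
    case KIND_PARAM_STRUCT:
      return &t->u.param_struct.line;
    }
  assert (0 && "unknown kind");
  return NULL;
}
```

Reading u.s.line while kind says param_struct may appear to work if the two members happen to overlap in a convenient way, but nothing in the data structure promises that.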

However, the existing code appears to work but we don't understand why.

Or can a type have a kind TYPE_PARAM_STRUCT and only have s->u.s valid? It 
might be related to the code in new_structure near line  638 of gengtype.c 
which sets ls->kind = TYPE_LANG_STRUCT.

Perhaps TYPE_PARAM_STRUCT has two different roles. If that is indeed the case, 
we have to distinguish them when serializing gengtype's state.

Cheers.

-- 

Jeremie Salvucci & Basile Starynkevitch


Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread Laurynas Biveinis
2010/8/27  :

> We think that the enum type_kind discriminates fields union in struct type. 
> So for TYPE_PARAM_STRUCT we believe that
> the param_struct field of union u inside struct type is used. If this is 
> true, the test s->u.s.line.file != NULL is meaningless when s->kind == 
> TYPE_PARAM_STRUCT, it should be s->u.param_struct.line.file != NULL instead 
> in our opinion.
>
>
> Or can a type have a kind TYPE_PARAM_STRUCT and only have s->u.s valid? It 
> might be related to the code in new_structure near line  638 of gengtype.c 
> which sets ls->kind = TYPE_LANG_STRUCT.
>
> Perhaps TYPE_PARAM_STRUCT has two different roles. If that is indeed the 
> case, we have to distinguish them when serializing gengtype's state.

I don't have time to investigate this right now to come up with an
answer, but did you try producing gengtype debugging dump and looking
there for structs that have these combinations of properties?
Especially since -

> However, the existing code appears to work but we don't understand why.

Cheers,
-- 
Laurynas


Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread jeremie . salvucci
"Or can a type have a kind TYPE_PARAM_STRUCT and only have s->u.s valid? It 
might be related to the code in new_structure near line  638 of gengtype.c 
which sets ls->kind = TYPE_LANG_STRUCT."

Forget about this sentence, Basile messed up TYPE_PARAM_STRUCT & 
TYPE_LANG_STRUCT (and is typing this).

Cheers

-- 

Jeremie Salvucci & Basile Starynkevitch



Clustering switch cases

2010-08-27 Thread Paulo J. Matos
Hi,

I have been analysing the GCC 4.4 code because of the way it handles:
extern void f(const char *);
extern void g(int);

#define C(n) case n: f(#n); break

void g(int n)
{
    switch(n)
    {
    C(0); C(1); C(2); C(3); C(4); C(5); C(6); C(7); C(8); C(9);
    C(10); C(11); C(12); C(13); C(14); C(15); C(16); C(17); C(18); C(19);
    C(20); C(21); C(22); C(23); C(24); C(25); C(26); C(27); C(28); C(29);

    C(1000); C(1001); C(1002); C(1003); C(1004); C(1005); C(1006); C(1007); C(1008); C(1009);
    }
}

The interesting thing about this is that GCC generates much better code if I do:
extern void f(const char *);
extern void g(int);

#define C(n) case n: f(#n); break

void g(int n)
{
    switch(n)
    {
    C(0); C(1); C(2); C(3); C(4); C(5); C(6); C(7); C(8); C(9);
    C(10); C(11); C(12); C(13); C(14); C(15); C(16); C(17); C(18); C(19);
    C(20); C(21); C(22); C(23); C(24); C(25); C(26); C(27); C(28); C(29);
    }
    switch(n)
    {
    C(1000); C(1001); C(1002); C(1003); C(1004); C(1005); C(1006); C(1007); C(1008); C(1009);
    }
}

In the first case, it generates a binary tree, and in the second, two
jump tables. The jump-table solution is much more elegant (at least
in our situation), generating less code and being faster.
Now, what I am wondering is why GCC doesn't try to cluster the cases,
looking for clusters of contiguous values in the switch.

If there is no specific reason, then I would implement such a pass, which
would split switches according to value clustering before expansion,
since I think it would be a good code improvement.

Currently GCC seems to use a jump table only if the range of the switch
is not much bigger than its case count, which works well in most cases
except when you have big switches with clusters of contiguous values
(like the first example I sent).
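
As a rough sketch of that range-versus-count rule, the check below accepts a jump table only when the value range is at most some factor times the number of cases. The factor of 10 and the names EXAMPLE_DENSITY_FACTOR and example_use_jump_table are assumptions made for illustration, not taken from the GCC sources.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed density factor (hypothetical): a jump table is accepted only
   when the value range is at most this many times the case count.  */
#define EXAMPLE_DENSITY_FACTOR 10

/* Return nonzero if the N sorted, distinct case values in VALUES are
   dense enough for a single jump table under the assumed rule.  */
static int
example_use_jump_table (const long *values, size_t n)
{
  long range;

  assert (n > 0);
  range = values[n - 1] - values[0] + 1;
  return range <= EXAMPLE_DENSITY_FACTOR * (long) n;
}
```

Under this assumed rule the combined switch (40 cases spanning 0..1009) is rejected, while each of the two split switches (30 cases spanning 0..29, and 10 cases spanning 1000..1009) is accepted on its own.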

Any comments on this would be appreciated.

-- 
PMatos


Better performance on older version of GCC

2010-08-27 Thread Corey Kasten
Hello all,

I have two computers with two different versions of GCC. Otherwise the
two systems have identical hardware. I have a processor-intensive and
memory-intensive benchmark program which I compile on both systems, and I
cannot understand why the system with the older GCC version produces faster
code.

System A has GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)"
System B has GCC version "4.3.0 20080428 (Red Hat 4.3.0-8)"

I find that the executable compiled on system A runs faster (on both
systems) than the executable compiled on system B (on both systems), by a
factor of approximately 4. I have attempted to play with the
GCC optimizer flags and have not been able to get System B (with the
later GCC version) to compile code with any better performance. Could
someone please help figure this out?

Below is the GCC command I run on System A followed by the verbose
output:
gcc -v -Wall -DOFFLINE_WEIGHTS -DDOUBLEP -g bfbenchmark_threaded.c -lm
-lrt -lpthread -O3 -o bfbenchmark_threaded

---BEGIN OUTPUT-
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-cpu=generic
--host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)
 /usr/libexec/gcc/i386-redhat-linux/4.1.2/cc1 -quiet -v
-DOFFLINE_WEIGHTS -DDOUBLEP bfbenchmark_threaded.c -quiet -dumpbase
bfbenchmark_threaded.c -mtune=generic -auxbase bfbenchmark_threaded -g
-O3 -Wall -version -o /tmp/ccvxPCd0.s
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../../i386-redhat-linux/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc/i386-redhat-linux/4.1.2/include
 /usr/include
End of search list.
GNU C version 4.1.2 20070925 (Red Hat 4.1.2-33) (i386-redhat-linux)
compiled by GNU C version 4.1.2 20070925 (Red Hat 4.1.2-33).
GGC heuristics: --param ggc-min-expand=100 --param
ggc-min-heapsize=131072
Compiler executable checksum: ab322ce5b87a7c6c23d60970ec7b7b31
 as -V -Qy -o /tmp/ccU8kZL1.o /tmp/ccvxPCd0.s
GNU assembler version 2.17.50.0.18 (i386-redhat-linux) using BFD version
version 2.17.50.0.18-1 20070731
 /usr/libexec/gcc/i386-redhat-linux/4.1.2/collect2 --eh-frame-hdr
--build-id -m elf_i386 --hash-style=gnu
-dynamic-linker /lib/ld-linux.so.2 -o
bfbenchmark_threaded /usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crt1.o 
/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crti.o 
/usr/lib/gcc/i386-redhat-linux/4.1.2/crtbegin.o 
-L/usr/lib/gcc/i386-redhat-linux/4.1.2 -L/usr/lib/gcc/i386-redhat-linux/4.1.2 
-L/usr/lib/gcc/i386-redhat-linux/4.1.2/../../.. /tmp/ccU8kZL1.o -lm -lrt 
-lpthread -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed 
-lgcc_s --no-as-needed /usr/lib/gcc/i386-redhat-linux/4.1.2/crtend.o 
/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crtn.o
---END OUTPUT-



Below is the GCC command I run on System B followed by the verbose
output:
gcc -v -Wall -DOFFLINE_WEIGHTS -DDOUBLEP -g bfbenchmark_threaded.c -lm
-lrt -lpthread -O3 -o bfbenchmark_threaded

---BEGIN OUTPUT-
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap
--enable-shared --enable-threads=posix --enable-checking=release
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--disable-libjava-multilib --with-cpu=generic --build=i386-redhat-linux
Thread model: posix
gcc version 4.3.0 20080428 (Red Hat 4.3.0-8) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-Wall' '-DOFFLINE_WEIGHTS' '-DDOUBLEP' '-g'
'-O3' '-o' 'bfbenchmark_threaded' '-mtune=generic'
 /usr/libexec/gcc/i386-redhat-linux/4.3.0/cc1 -quiet -v
-DOFFLINE_WEIGHTS -DDOUBLEP bfbenchmark_threaded.c -quiet -dumpbase
bfbenchmark_threaded.c -mtune=generic -auxbase bfbenchmark_threaded -g
-O3 -Wall -version -o /tmp/ccB4B5PI.s
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.3.0/include-fixed"
ignoring nonexistent directory
"/usr/lib/gcc/i386-redhat-linux/4.3.0/../../../../i386-redhat-linux/include"
#include "..." sear

Re: Better performance on older version of GCC

2010-08-27 Thread H.J. Lu
On Fri, Aug 27, 2010 at 6:44 AM, Corey Kasten
 wrote:
> Hello all,
>
> I have two computers with two different versions of GCC. Otherwise the
> two systems have identical hardware. I have a processor and memory
> intensive benchmark program which I compile on both systems and I cannot
> understand why the system with older GCC version compiles faster code.
>
> System A has GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)"
> System B has GCC version "4.3.0 20080428 (Red Hat 4.3.0-8)"
>
> I find that the executable compiled on system A runs faster (on both
> systems) than the executable compiled on system B (on both system), by a
> factor about approximately 4 times. I have attempted to play with the
> GCC optimizer flags and have not been able to get System B (with the
> later GCC version) to compile code with any better performance. Could
> someone please help figure this out?
>

Can you try gcc 4.5.1?

-- 
H.J.


Re: Better performance on older version of GCC

2010-08-27 Thread Nathan Froyd
On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
> I find that the executable compiled on system A runs faster (on both
> systems) than the executable compiled on system B (on both system), by a
> factor about approximately 4 times. I have attempted to play with the
> GCC optimizer flags and have not been able to get System B (with the
> later GCC version) to compile code with any better performance. Could
> someone please help figure this out?

It's almost impossible to tell what's going on without an actual
testcase.  You might not be able to provide the actual code, but you
could try distilling it down to something you could release.

-Nathan


Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread jeremie . salvucci
We recompiled GCC-trunk r162692 with the following modification :

In function output_type_enum of gcc/gengtype.c, we replaced 

-  if (s->kind == TYPE_PARAM_STRUCT && s->u.s.line.file != NULL)
+  if (s->kind == TYPE_PARAM_STRUCT && s->u.param_struct.line.file != NULL)

And Gengtype works like before with c,c++, lto enabled.

Do you think we have to submit a one line patch (if yes, could it be reviewed 
quickly)? We don't know why the old version works, and we think writing 
u.s.line.file is incorrect for TYPE_PARAM_STRUCT (even if it happens to work by 
luck), since the union u.param_struct member is the only valid for 
TYPE_PARAM_STRUCT. 



-- 

Jeremie Salvucci & Basile Starynkevitch




Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread Arnaud Charlet
> In function output_type_enum of gcc/gengtype.c, we replaced 
> 
> -  if (s->kind == TYPE_PARAM_STRUCT && s->u.s.line.file != NULL)
> +  if (s->kind == TYPE_PARAM_STRUCT && s->u.param_struct.line.file !=
> NULL)
> 
> And Gengtype works like before with c,c++, lto enabled.
> 
> Do you think we have to submit a one line patch (if yes, could it be reviewed

Sure, one-line patches are actually welcome since they are well isolated and
easy to review, as opposed to large patches containing unrelated stuff,
which have basically zero chance of getting accepted or reviewed (other than
"please break your patch into multiple pieces").

Arno


Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread Laurynas Biveinis
2010/8/27  :
> We recompiled GCC-trunk r162692 with the following modification :
>
> In function output_type_enum of gcc/gengtype.c, we replaced
>
> -  if (s->kind == TYPE_PARAM_STRUCT && s->u.s.line.file != NULL)
> +  if (s->kind == TYPE_PARAM_STRUCT && s->u.param_struct.line.file != NULL)
>
> And Gengtype works like before with c,c++, lto enabled.
>
> Do you think we have to submit a one line patch (if yes, could it be reviewed 
> quickly)? We don't know why the old version works, and we think writing 
> u.s.line.file is incorrect for TYPE_PARAM_STRUCT (even if it happens to work 
> by luck), since the union u.param_struct member is the only valid for 
> TYPE_PARAM_STRUCT.

One-line patches are welcome, but in this instance could you please
find out how the old code worked before changing it (as you admit, you
don't understand it)?

-- 
Laurynas


Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread Ian Lance Taylor
jeremie.salvu...@free.fr writes:

> While hacking on gengtype with Basile, we noticed a strange piece of code at 
> line 2539 in gcc/gengtype.c r162692
>
> static void
> output_type_enum (outf_p of, type_p s)
> {
>   if (s->kind == TYPE_PARAM_STRUCT && s->u.s.line.file != NULL) /* Strange 
> code @@*/
> {
>   oprintf (of, ", gt_e_");
>   output_mangled_typename (of, s);
> }
>   else if (UNION_OR_STRUCT_P (s) && s->u.s.line.file != NULL)
> {
>   oprintf (of, ", gt_ggc_e_");
>   output_mangled_typename (of, s);
> }
>   else
> oprintf (of, ", gt_types_enum_last");
> }
>
> We think that the enum type_kind discriminates fields union in struct type. 
> So for TYPE_PARAM_STRUCT we believe that 
> the param_struct field of union u inside struct type is used. If this is 
> true, the test s->u.s.line.file != NULL is meaningless when s->kind == 
> TYPE_PARAM_STRUCT, it should be s->u.param_struct.line.file != NULL instead 
> in our opinion.

I agree that this is wrong.

> However, the existing code appears to work but we don't understand why.

That one is fairly easy.  If you look at the generated code, you will
see that those values are only used to pass to gt_pch_note_object.  From
there they will eventually be passed to either ggc_pch_count_object or
ggc_pch_alloc_object.  The default page allocator ignores this type.
The zone allocator does use the type, but nobody uses that allocator.
And even if you do use the zone allocator, it will work correctly if
perhaps suboptimally as long as it always gets the same type for a given
struct, which I believe will happen.

You should send in a tested patch to fix that problem (and nothing
else).

Ian


Re: Clustering switch cases

2010-08-27 Thread Ian Lance Taylor
"Paulo J. Matos"  writes:

> In the first case, it generates a binary tree, and in the second two
> jump tables. The jump tables solution is much more elegant (at least
> in our situation), generating less code and being faster.
> Now, what I am wondering is the reason why GCC doesn't try to cluster
> the cases trying to find for clusters of contiguous values in the
> switch.
>
> If there is no specific reason then I would implement such pass, which
> would before expansion split switches according to value clustering,
> since I find it would be a good code improvement.
>
> Currently GCC seems to only use jump table is the range of the switch
> is not much bigger than its count, which works well in most cases
> except when you have big switches with clusters of contiguous values
> (like the first example I sent).

I don't know of any specific reason not to look for clusters of switch
cases.  The main issue would be the effect on compilation time.  If you
can do it with an algorithm which is linear in the number of cases, then
I think it would be an acceptable optimization.
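
One way to stay linear, assuming the case values are already sorted and distinct, is a single greedy pass that starts a new cluster whenever the gap to the previous value is too large. This is only a sketch of the idea, not an algorithm from GCC; struct example_cluster, example_cluster_cases and the max_gap threshold are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cluster descriptor: indices [first, last] into the
   sorted array of case values.  */
struct example_cluster { size_t first; size_t last; };

/* Split N sorted, distinct case values into clusters in one pass.
   A new cluster starts whenever the gap to the previous value exceeds
   MAX_GAP.  OUT must have room for N entries.  Returns the number of
   clusters written.  */
static size_t
example_cluster_cases (const long *values, size_t n, long max_gap,
                       struct example_cluster *out)
{
  size_t count = 0;
  size_t i;

  assert (max_gap >= 1);
  if (n == 0)
    return 0;

  out[0].first = 0;
  for (i = 1; i < n; i++)
    if (values[i] - values[i - 1] > max_gap)
      {
        out[count].last = i - 1;   /* close the current cluster */
        count++;
        out[count].first = i;      /* open the next one */
      }
  out[count].last = n - 1;
  return count + 1;
}
```

Each resulting cluster could then be judged separately for a jump table, with the clusters themselves selected by a small decision tree. A real pass would also need a cost model for choosing the gap threshold, which this sketch leaves out.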

Ian


Re: Better performance on older version of GCC

2010-08-27 Thread Corey Kasten
On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
> > I find that the executable compiled on system A runs faster (on both
> > systems) than the executable compiled on system B (on both system), by a
> > factor about approximately 4 times. I have attempted to play with the
> > GCC optimizer flags and have not been able to get System B (with the
> > later GCC version) to compile code with any better performance. Could
> > someone please help figure this out?
> 
> It's almost impossible to tell what's going on without an actual
> testcase.  You might not be able to provide the actual code, but you
> could try distilling it down to something you could release.
> 
> -Nathan

Thanks for the reply Nathan.

I have attached an archive with the test case code. The code is built by
build.sh and outputs the number of microseconds to complete the
processing.

Compiling with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" produces
code that runs in about 66% of the time taken by code compiled with GCC
version "4.3.0 20080428 (Red Hat 4.3.0-8)".

Thanks

Corey


testbenchmark.100827.1050.tgz
Description: application/compressed-tar


Re: Clustering switch cases

2010-08-27 Thread Paulo J. Matos
On Fri, Aug 27, 2010 at 3:47 PM, Ian Lance Taylor  wrote:
>
> I don't know of any specific reason not to look for clusters of switch
> cases.  The main issue would be the affect on compilation time.  If you
> can do it with an algorithm which is linear in the number of cases, then
> I think it would be an acceptable optimization.
>

Thanks. I will be working on it. I will let you know how it goes.

Cheers,
-- 
PMatos


Re: Clustering switch cases

2010-08-27 Thread Richard Guenther
On Fri, Aug 27, 2010 at 4:47 PM, Ian Lance Taylor  wrote:
> "Paulo J. Matos"  writes:
>
>> In the first case, it generates a binary tree, and in the second two
>> jump tables. The jump tables solution is much more elegant (at least
>> in our situation), generating less code and being faster.
>> Now, what I am wondering is the reason why GCC doesn't try to cluster
>> the cases trying to find for clusters of contiguous values in the
>> switch.
>>
>> If there is no specific reason then I would implement such pass, which
>> would before expansion split switches according to value clustering,
>> since I find it would be a good code improvement.
>>
>> Currently GCC seems to only use jump table is the range of the switch
>> is not much bigger than its count, which works well in most cases
>> except when you have big switches with clusters of contiguous values
>> (like the first example I sent).
>
> I don't know of any specific reason not to look for clusters of switch
> cases.  The main issue would be the affect on compilation time.  If you
> can do it with an algorithm which is linear in the number of cases, then
> I think it would be an acceptable optimization.

In fact we might want to move switch optimization up to the tree level
(just because it's way easier to deal with there).  Thus, lower switch
to a mixture of binary tree & jump-tables (possibly using perfect
hashing).

Richard.


Re: Better performance on older version of GCC

2010-08-27 Thread Richard Guenther
On Fri, Aug 27, 2010 at 5:02 PM, Corey Kasten
 wrote:
> On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
>> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
>> > I find that the executable compiled on system A runs faster (on both
>> > systems) than the executable compiled on system B (on both system), by a
>> > factor about approximately 4 times. I have attempted to play with the
>> > GCC optimizer flags and have not been able to get System B (with the
>> > later GCC version) to compile code with any better performance. Could
>> > someone please help figure this out?
>>
>> It's almost impossible to tell what's going on without an actual
>> testcase.  You might not be able to provide the actual code, but you
>> could try distilling it down to something you could release.
>>
>> -Nathan
>
> Thanks for the reply Nathan.
>
> I have attached an archive with the test case code. The code is built by
> build.sh and outputs the number of microseconds to complete the
> processing.
>
> Compiling with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" produces
> code that runs in about 66% of the time than does GCC version "4.3.0
> 20080428 (Red Hat 4.3.0-8)"

Try -fcx-limited-range or -fcx-fortran-rules.  GCC 4.3 is now more conforming
than 4.1 in its handling of complex arithmetic.
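
For background, these options relax how complex multiplication and division are carried out. As a rough illustration using plain doubles (not GCC's implementation), the textbook product formula below is cheap but degrades to NaN for some infinite operands, whereas C99 Annex G expects an infinite result in such cases, which costs extra checks. The names example_complex, example_naive_mul and example_demo are made up for this sketch.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical stand-in for a complex number.  */
struct example_complex { double re; double im; };

/* Textbook product (a+ib)(c+id) = (ac-bd) + i(ad+bc), with no check
   for NaN results and no attempt to recover an infinity.  */
static struct example_complex
example_naive_mul (struct example_complex x, struct example_complex y)
{
  struct example_complex r;

  r.re = x.re * y.re - x.im * y.im;
  r.im = x.re * y.im + x.im * y.re;
  return r;
}

/* Usage: (inf + i*inf) times (1 + i*0) should be infinite under
   Annex G rules, but the naive formula yields NaN + i*NaN.  */
static void
example_demo (void)
{
  struct example_complex inf = { INFINITY, INFINITY };
  struct example_complex one = { 1.0, 0.0 };
  struct example_complex r = example_naive_mul (inf, one);

  /* INFINITY * 0.0 is NaN, and NaN propagates into both parts.  */
  assert (r.re != r.re && r.im != r.im);
}
```

Code that only ever multiplies finite, well-scaled values does not need the recovery step, which is the situation where relaxing the rules is a reasonable trade.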

Richard.

> Thanks
>
> Corey
>


specs and X-s and canadian X-s.

2010-08-27 Thread IainS
When doing native bootstraps things are nice and easy ...  the target  
spec  definitions are also appropriate for the host.


===

I've been doing some canadian X-s  (specifically darwin 9 => darwin 7).

So, ... when doing B == H != T  the specs might need to be different  
from  B != H == T (or B=H=T)


In effect the specs for the linker are used to generate code for the  
'target' on the 'host'.


the problem comes when the B-host needs different specs from the  
native case.


So when doing   B == H != T (first cross - to build a compiler capable  
of making T code on B) we need specs that the B understands
(this all works quite easily using --with-sysroot= .. etc. - and needs  
to use the right specs to allow generation on the B system)
[ a specific case is inserting a path spec to point to the sysroot -  
which the B linker understands]


When doing B != H == T (second cross - to make the compiler for T  
hosted on T) we need to generate native specs that T understands.

[in my example the path spec causes the T==H linker to fail]

===

Is it legitimate to wrap these circumstances with #ifndef
CROSS_DIRECTORY_STRUCTURE ... #endif in the target headers?

(or maybe there's a flag somewhere that indicates H == T ?)

I recognize that this is changing the target headers depending on the  
host ..
..  but at the moment I can't see how else to do it; the specs say "do  
'this' to generate correct code for the target"

..  but "this" might well be host-dependent.

It doesn't seem to belong in config/mh-* or gcc/config/x-*

any insight much appreciated.

Iain


Re: Better performance on older version of GCC

2010-08-27 Thread Corey Kasten
On Fri, 2010-08-27 at 17:09 +0200, Richard Guenther wrote:
> On Fri, Aug 27, 2010 at 5:02 PM, Corey Kasten
>  wrote:
> > On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
> >> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
> >> > I find that the executable compiled on system A runs faster (on both
> >> > systems) than the executable compiled on system B (on both system), by a
> >> > factor about approximately 4 times. I have attempted to play with the
> >> > GCC optimizer flags and have not been able to get System B (with the
> >> > later GCC version) to compile code with any better performance. Could
> >> > someone please help figure this out?
> >>
> >> It's almost impossible to tell what's going on without an actual
> >> testcase.  You might not be able to provide the actual code, but you
> >> could try distilling it down to something you could release.
> >>
> >> -Nathan
> >
> > Thanks for the reply Nathan.
> >
> > I have attached an archive with the test case code. The code is built by
> > build.sh and outputs the number of microseconds to complete the
> > processing.
> >
> > Compiling with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" produces
> > code that runs in about 66% of the time than does GCC version "4.3.0
> > 20080428 (Red Hat 4.3.0-8)"
> 
> -fcx-limited-range or -fcx-fortran-rules.  4.3 now is more conforming than 
> 4.1.
> 
> Richard.
> 
> > Thanks
> >
> > Corey
> >

Richard,

-fcx-limited-range worked great on both my real benchmark and my test
archive. GCC didn't recognize -fcx-fortran-rules, but obviously I don't
need it.

Thanks so much,
Corey

  



Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Hongtao
Hi all,

I have instrumented a function call like foo(&a,&b) into the GIMPLE SSA
representation (gcc-4.5), and the subsequent optimizations fail on
my instrumented code. The backtrace is shown below. The error
occurred when the dse pass tried to test whether the call I inserted may use
a memory reference. It is because the argument &a is not an SSA_VAR or
INDIRECT_REF, so the assert in the function
bool
refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)

  gcc_assert ((!ref1->ref
               || SSA_VAR_P (ref1->ref)
               || handled_component_p (ref1->ref)
               || INDIRECT_REF_P (ref1->ref)
               || TREE_CODE (ref1->ref) == TARGET_MEM_REF
               || TREE_CODE (ref1->ref) == CONST_DECL)
              && (!ref2->ref
                  || SSA_VAR_P (ref2->ref)
                  || handled_component_p (ref2->ref)
                  || INDIRECT_REF_P (ref2->ref)
                  || TREE_CODE (ref2->ref) == TARGET_MEM_REF
                  || TREE_CODE (ref2->ref) == CONST_DECL));
was violated.

Does anyone know why the function arguments must be an SSA_VAR or
INDIRECT_REF here? Have I missed any step needed to maintain the
consistency of GIMPLE SSA?


#0  0x76fc8ee0 in exit () from /lib/libc.so.6
#1  0x005ae4ce in diagnostic_action_after_output
(context=0x1323880, diagnostic=0x7fffd870) at
../../src/gcc/diagnostic.c:198
#2  0x005aed54 in diagnostic_report_diagnostic
(context=0x1323880, diagnostic=0x7fffd870) at
../../src/gcc/diagnostic.c:424
#3  0x005afdc3 in internal_error (gmsgid=0xddfb57 "in %s, at
%s:%d") at ../../src/gcc/diagnostic.c:709
#4  0x005aff4f in fancy_abort (file=0xe42670
"../../src/gcc/tree-ssa-alias.c", line=786, function=0xe427e0
"refs_may_alias_p_1")
at ../../src/gcc/diagnostic.c:763
#5  0x008a1adb in refs_may_alias_p_1 (ref1=0x7fffdab0,
ref2=0x7fffdb50, tbaa_p=1 '\001')
at ../../src/gcc/tree-ssa-alias.c:775
#6  0x008a2b12 in ref_maybe_used_by_call_p_1
(call=0x76790630, ref=0x7fffdb50) at
../../src/gcc/tree-ssa-alias.c:1133
#7  0x008a2d2e in ref_maybe_used_by_call_p (call=0x76790630,
ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1147
#8  0x008a2dfa in ref_maybe_used_by_stmt_p (stmt=0x76790630,
ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1179
#9  0x008bf275 in dse_possible_dead_store_p
(stmt=0x7683e820, use_stmt=0x7fffdca8) at
../../src/gcc/tree-ssa-dse.c:212
#10 0x008bfeb9 in dse_optimize_stmt (dse_gd=0x7fffddd0,
bd=0x156bd30, gsi=...) at ../../src/gcc/tree-ssa-dse.c:297
#11 0x008c029d in dse_enter_block (walk_data=0x7fffdde0,
bb=0x76a75068) at ../../src/gcc/tree-ssa-dse.c:370
#12 0x00cc26a5 in walk_dominator_tree (walk_data=0x7fffdde0,
bb=0x76a75068) at ../../src/gcc/domwalk.c:185
#13 0x008c0812 in tree_ssa_dse () at
../../src/gcc/tree-ssa-dse.c:430
#14 0x0073af0a in execute_one_pass (pass=0x13cced0) at
../../src/gcc/passes.c:1572
#15 0x0073b21a in execute_pass_list (pass=0x13cced0) at
../../src/gcc/passes.c:1627
#16 0x0073b238 in execute_pass_list (pass=0x1312720) at
../../src/gcc/passes.c:1628
#17 0x0086e372 in tree_rest_of_compilation
(fndecl=0x76b93500) at ../../src/gcc/tree-optimize.c:413
#18 0x009fa7c5 in cgraph_expand_function (node=0x76be7000)
at ../../src/gcc/cgraphunit.c:1548
#19 0x009faa49 in cgraph_expand_all_functions () at
../../src/gcc/cgraphunit.c:1627
#20 0x009fb07e in cgraph_optimize () at
../../src/gcc/cgraphunit.c:1875
#21 0x009f9461 in cgraph_finalize_compilation_unit () at
../../src/gcc/cgraphunit.c:1096
#22 0x004a9e93 in c_write_global_declarations () at
../../src/gcc/c-decl.c:9519
#23 0x008180d4 in compile_file () at ../../src/gcc/toplev.c:1065
#24 0x0081a1c5 in do_compile () at ../../src/gcc/toplev.c:2417
#25 0x0081a286 in toplev_main (argc=21, argv=0x7fffe0f8) at
../../src/gcc/toplev.c:2459
#26 0x00519c6b in main (argc=21, argv=0x7fffe0f8) at
../../src/gcc/main.c:35

Thanks,

Hongtao
Purdue University




RE: Tutorial Proposal for GCC Summit

2010-08-27 Thread Paralkar Anmol-B07584
Hello Prof. Khedker,

 Your tutorial would be very useful. I am trying my best
 to attend the Summit, this being an important motivator.

 Thank you.

Sincerely,
Anmol P. Paralkar

> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
> Uday Khedker
> Sent: Wednesday, August 18, 2010 11:50 AM
> To: gcc@gcc.gnu.org
> Subject: Tutorial Proposal for GCC Summit
> 
> Dear Friends,
> 
> I have submitted a tutorial proposal for GCC Summit. This email is to get
> an idea of whether this tutorial will be of interest to a sufficient
> number of people. We have been conducting expanded versions of this
> tutorial in India for the past few years and it seems to generate some
> interest. Some friends in the US suggested that we should hold some version
> outside of India, hence I have sent in a proposal to GCC Summit.
>
> Looking forward to getting some feedback. I would also be happy to
> modify the tutorial based on the feedback.
> 
> Thanks and regards,
> 
> Uday.
> --
> Dr. Uday Khedker
> Professor
> Department of Computer Science & Engg.
> IIT Bombay, Powai, Mumbai 400 076, India.
> email   : u...@cse.iitb.ac.in
> homepage: http://www.cse.iitb.ac.in/~uday
> phone   : Office - 91 (22) 2576 7717
>Res.   - 91 (22) 2576 8717, 91 (22) 2572 0288
> --
> 
> 
> Tutorial on Essential Abstractions in GCC
> -----------------------------------------
>
> Motivation:
> -----------
>
> Most explanations of GCC which are publicly available describe many
> details and tend to be heavy on information rather than insights. As a
> consequence, a clear and crisp explanation of the journey from a given
> machine description to an actual run of a compiler generated from that
> machine description is not available.
>
> In this tutorial we describe some carefully chosen abstractions
> that help one to understand the retargetability mechanism and the
> architecture of the compiler generation framework of GCC and relate it
> to a generated compiler.
> 
> 
> Coverage
> 
> 
> The default duration of this tutorial is one day and it covers the
> following topics: Meeting the challenge of understanding GCC. The
> architecture of GCC. Basic concepts in GCC configuration and building.
> The structure of a GCC generated compiler. Plugins structure of
> GCC. First level graybox probing of the compilation sequence of
> a GCC generated compiler. Graybox probing for machine independent
> optimizations. Graybox probing for parallelization and vectorization.
> Adding Gimple and RTL passes to GCC. The retargetability and instruction
> selection mechanism of GCC. Designing and understanding GCC machine
> descriptions. The abstractions in GCC machine descriptions and
> their influence on a compiler generated from them. The design and
> implementation of gdfa (generic data flow analyzer) for GCC.
>
> We have held several tutorials and workshops along these lines in the past
> few years in India and now would like to reach out to a larger audience.
>
> Our main source of material is the Workshop on Essential Abstractions
> in GCC held at IIT Bombay from 5th July to 8th July 2010
> (http://www.cse.iitb.ac.in/grc/gcc-workshop-10).
> 
> Target Audience
> ---
> 
> People interested in using GCC for their research as well as people
> interested in contributing to GCC will benefit a lot from this tutorial.
> It is expected that this tutorial will bring down the ramp up period of
> novices to GCC from several frustrating months to a few stimulating
> weeks. This tutorial will also be useful for people who are interested
> in relating class room concepts of compilation to a large scale
> practical compiler which is widely used.
> 
> --
> Dr. Uday Khedker
> Professor
> Department of Computer Science & Engg.
> IIT Bombay, Powai, Mumbai 400 076, India.
> email   : u...@cse.iitb.ac.in
> homepage: http://www.cse.iitb.ac.in/~uday
> phone   : Office - 91 (22) 2572 2545 x 7717, 91 (22) 2576 7717 (Direct)
>           Res.   - 91 (22) 2572 2545 x 8717, 91 (22) 2576 8717 (Direct)
> --




Re: Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Richard Guenther
On Fri, Aug 27, 2010 at 5:27 PM, Hongtao  wrote:
> Hi all,
>
> I have instrumented a function call like foo(&a,&b) into the gimple SSA
> representation (gcc-4.5) and the consequent optimizations can not pass
> my instrumented code. The back traces are as followings. The error
> occurred when the pass dse tried to test if the call I inserted may use
> a memory reference. It is because the arguments &a is not a SSA_VAR or
> INDIRECT_REF, so the assert in function
>
> bool
> refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)
>
>  gcc_assert ((!ref1->ref
>           || SSA_VAR_P (ref1->ref)
>           || handled_component_p (ref1->ref)
>           || INDIRECT_REF_P (ref1->ref)
>           || TREE_CODE (ref1->ref) == TARGET_MEM_REF
>           || TREE_CODE (ref1->ref) == CONST_DECL)
>          && (!ref2->ref
>          || SSA_VAR_P (ref2->ref)
>          || handled_component_p (ref2->ref)
>          || INDIRECT_REF_P (ref2->ref)
>          || TREE_CODE (ref2->ref) == TARGET_MEM_REF
>          || TREE_CODE (ref2->ref) == CONST_DECL));
> was violated.
>
> Does anyone know why the function arguments must be a SSA_VAR or
> INDIRECT_REF here? Have I missed to perform any actions to maintain the
> consistency of Gimple SSA?

Yes.  is_gimple_val () will return false for your arguments as it seems that
the variables do not have function invariant addresses.

Richard.

>
> #0  0x76fc8ee0 in exit () from /lib/libc.so.6
> #1  0x005ae4ce in diagnostic_action_after_output
> (context=0x1323880, diagnostic=0x7fffd870) at
> ../../src/gcc/diagnostic.c:198
> #2  0x005aed54 in diagnostic_report_diagnostic
> (context=0x1323880, diagnostic=0x7fffd870) at
> ../../src/gcc/diagnostic.c:424
> #3  0x005afdc3 in internal_error (gmsgid=0xddfb57 "in %s, at
> %s:%d") at ../../src/gcc/diagnostic.c:709
> #4  0x005aff4f in fancy_abort (file=0xe42670
> "../../src/gcc/tree-ssa-alias.c", line=786, function=0xe427e0
> "refs_may_alias_p_1")
>    at ../../src/gcc/diagnostic.c:763
> #5  0x008a1adb in refs_may_alias_p_1 (ref1=0x7fffdab0,
> ref2=0x7fffdb50, tbaa_p=1 '\001')
>    at ../../src/gcc/tree-ssa-alias.c:775
> #6  0x008a2b12 in ref_maybe_used_by_call_p_1
> (call=0x76790630, ref=0x7fffdb50) at
> ../../src/gcc/tree-ssa-alias.c:1133
> #7  0x008a2d2e in ref_maybe_used_by_call_p (call=0x76790630,
> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1147
> #8  0x008a2dfa in ref_maybe_used_by_stmt_p (stmt=0x76790630,
> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1179
> #9  0x008bf275 in dse_possible_dead_store_p
> (stmt=0x7683e820, use_stmt=0x7fffdca8) at
> ../../src/gcc/tree-ssa-dse.c:212
> #10 0x008bfeb9 in dse_optimize_stmt (dse_gd=0x7fffddd0,
> bd=0x156bd30, gsi=...) at ../../src/gcc/tree-ssa-dse.c:297
> #11 0x008c029d in dse_enter_block (walk_data=0x7fffdde0,
> bb=0x76a75068) at ../../src/gcc/tree-ssa-dse.c:370
> #12 0x00cc26a5 in walk_dominator_tree (walk_data=0x7fffdde0,
> bb=0x76a75068) at ../../src/gcc/domwalk.c:185
> #13 0x008c0812 in tree_ssa_dse () at
> ../../src/gcc/tree-ssa-dse.c:430
> #14 0x0073af0a in execute_one_pass (pass=0x13cced0) at
> ../../src/gcc/passes.c:1572
> #15 0x0073b21a in execute_pass_list (pass=0x13cced0) at
> ../../src/gcc/passes.c:1627
> #16 0x0073b238 in execute_pass_list (pass=0x1312720) at
> ../../src/gcc/passes.c:1628
> #17 0x0086e372 in tree_rest_of_compilation
> (fndecl=0x76b93500) at ../../src/gcc/tree-optimize.c:413
> #18 0x009fa7c5 in cgraph_expand_function (node=0x76be7000)
> at ../../src/gcc/cgraphunit.c:1548
> #19 0x009faa49 in cgraph_expand_all_functions () at
> ../../src/gcc/cgraphunit.c:1627
> #20 0x009fb07e in cgraph_optimize () at
> ../../src/gcc/cgraphunit.c:1875
> #21 0x009f9461 in cgraph_finalize_compilation_unit () at
> ../../src/gcc/cgraphunit.c:1096
> #22 0x004a9e93 in c_write_global_declarations () at
> ../../src/gcc/c-decl.c:9519
> #23 0x008180d4 in compile_file () at ../../src/gcc/toplev.c:1065
> #24 0x0081a1c5 in do_compile () at ../../src/gcc/toplev.c:2417
> #25 0x0081a286 in toplev_main (argc=21, argv=0x7fffe0f8) at
> ../../src/gcc/toplev.c:2459
> #26 0x00519c6b in main (argc=21, argv=0x7fffe0f8) at
> ../../src/gcc/main.c:35
>
> Thanks,
>
> Hongtao
> Purdue University
>
>
>


Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread Basile Starynkevitch
On Fri, 2010-08-27 at 17:25 +0300, Laurynas Biveinis wrote:
> 2010/8/27  :
> > We recompiled GCC-trunk r162692 with the following modification :
> >
> > In function output_type_enum of gcc/gengtype.c, we replaced
> >
> > -  if (s->kind == TYPE_PARAM_STRUCT && s->u.s.line.file != NULL)
> > +  if (s->kind == TYPE_PARAM_STRUCT && s->u.param_struct.line.file != NULL)
> >
> > And Gengtype works like before with c,c++, lto enabled.
> >
> > Do you think we have to submit a one line patch (if yes, could it be 
> > reviewed quickly)? We don't know why the old version works, and we think 
> > writing u.s.line.file is incorrect for TYPE_PARAM_STRUCT (even if it 
> > happens to work by luck), since the union u.param_struct member is the only 
> > valid for TYPE_PARAM_STRUCT.
> 
> One-line patches are welcome, but in this instance could you please
> find out how the old code worked before changing it (as you admit, you
> don't understand it).

My impression is that s->u.s.line.file usually happens to have the same
offset (at least on GNU/Linux/AMD64=x86_64) as
s->u.param_struct.param[0], and that for every type handled by
output_type_enum its param[0] subfield happens to be non-null. This
would explain why it worked by accident.

Is such a heuristic explanation enough to propose a patch? I am not
sure I can provide a better one quickly (so if the explanation
is not enough, I am not sure I want to propose a half-line patch).

By the way, what is a good way to find out exactly which svn commit
introduced the bogus line?

What surprises me much more is that the s->u.s.line.file != NULL test
was accepted a long time ago. From what we understand of gengtype, it
could never have made any sense (because conceptually s->u.s does not
exist for TYPE_PARAM_STRUCT!), even if it happens to work by pure luck.



I am quite surprised (but I admit I only looked at a few pages) that there
do not seem to be any rules regarding the use of unions in C code inside
GNU.  My personal requirement is that a union is only usable if it is
inside a structure and is discriminated by a field of this structure
(the usual case of a union of sub-structures each starting with a
discriminant logically fits that requirement) or by a simple pure
function depending on such a field
(in ML or OCaml parlance, a union is a discriminated sum type; also,
rpcxdr from Sun twenty years ago had a similar requirement...).

But I see no such rules within GCC, and I even saw several unions not
used that way. My perhaps excessive opinion is that such union abuse
always gives unmaintainable code (and I am in the minority which wants
GCC code to be more easily maintainable & readable & hackable by new
contributors, even at the expense of raw performance; I feel that
competing free compilers like LLVM are much better in that respect).

###



For the curious people, our current work on gengtype is available as
http://starynkevitch.net/Basile/gengtype-r163582-27-august-2010.diff
As usual, this is a temporary URL.  Our patch is not yet ready for
submission.

I have to clean up the code, correct a bug or two, and understand exactly
how the s->u.param_struct.line.file field is set in the present
gengtype. I also have to split our work into several patches, and I am
very afraid of not being able to make a sequence of small patches such
that gengtype still works for the entire GCC after each change!  I have no
idea if this is even doable (since gengtype is a code generator with
*global* side effects on GCC code; it could happen that some partial
change works for C but not for the C++ or Ada parts).



I will perhaps propose a few *related* patches on gengtype this
week-end, if I am motivated enough to work on it.



Cheers.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***



Re: Gengtype : strange code in output_type_enum

2010-08-27 Thread Ian Lance Taylor
Basile Starynkevitch  writes:

> My impression is that s->u.s.line.file usually happens to have the same
> offset (at least on GNU/Linux/AMD64=x86_64) as
> s->u.param_struct.param[0] and that for every type concerned by
> output_type_enum  its param[0] subfield happens to be non-null. This
> explains that it worked by accident.

No, that is not the case.  But I already explained why this error
doesn't matter:
http://gcc.gnu.org/ml/gcc/2010-08/msg00396.html

> By the way, what is the good way to find out exactly what svn commit
> introduced the bogus line?

svn blame

Ian


Re: Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Hongtao
On 08/27/10 12:35, Richard Guenther wrote:
> On Fri, Aug 27, 2010 at 5:27 PM, Hongtao  wrote:
>   
>> Hi all,
>>
>> I have instrumented a function call like foo(&a,&b) into the gimple SSA
>> representation (gcc-4.5) and the consequent optimizations can not pass
>> my instrumented code. The back traces are as followings. The error
>> occurred when the pass dse tried to test if the call I inserted may use
>> a memory reference. It is because the arguments &a is not a SSA_VAR or
>> INDIRECT_REF, so the assert in function
>>
>> bool
>> refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)
>>
>>  gcc_assert ((!ref1->ref
>>   || SSA_VAR_P (ref1->ref)
>>   || handled_component_p (ref1->ref)
>>   || INDIRECT_REF_P (ref1->ref)
>>   || TREE_CODE (ref1->ref) == TARGET_MEM_REF
>>   || TREE_CODE (ref1->ref) == CONST_DECL)
>>  && (!ref2->ref
>>  || SSA_VAR_P (ref2->ref)
>>  || handled_component_p (ref2->ref)
>>  || INDIRECT_REF_P (ref2->ref)
>>  || TREE_CODE (ref2->ref) == TARGET_MEM_REF
>>  || TREE_CODE (ref2->ref) == CONST_DECL));
>> was violated.
>>
>> Does anyone know why the function arguments must be a SSA_VAR or
>> INDIRECT_REF here? Have I missed to perform any actions to maintain the
>> consistency of Gimple SSA?
>> 
> Yes.  is_gimple_val () will return false for your arguments as it seems that
> the variables do not have function invariant addresses.
>
> Richard.
>
>   
Thanks. But how can I turn my arguments into gimple vals? By first
assigning each one to a temp and then replacing the argument with the temp?

Hongtao
>> #0  0x76fc8ee0 in exit () from /lib/libc.so.6
>> #1  0x005ae4ce in diagnostic_action_after_output
>> (context=0x1323880, diagnostic=0x7fffd870) at
>> ../../src/gcc/diagnostic.c:198
>> #2  0x005aed54 in diagnostic_report_diagnostic
>> (context=0x1323880, diagnostic=0x7fffd870) at
>> ../../src/gcc/diagnostic.c:424
>> #3  0x005afdc3 in internal_error (gmsgid=0xddfb57 "in %s, at
>> %s:%d") at ../../src/gcc/diagnostic.c:709
>> #4  0x005aff4f in fancy_abort (file=0xe42670
>> "../../src/gcc/tree-ssa-alias.c", line=786, function=0xe427e0
>> "refs_may_alias_p_1")
>>at ../../src/gcc/diagnostic.c:763
>> #5  0x008a1adb in refs_may_alias_p_1 (ref1=0x7fffdab0,
>> ref2=0x7fffdb50, tbaa_p=1 '\001')
>>at ../../src/gcc/tree-ssa-alias.c:775
>> #6  0x008a2b12 in ref_maybe_used_by_call_p_1
>> (call=0x76790630, ref=0x7fffdb50) at
>> ../../src/gcc/tree-ssa-alias.c:1133
>> #7  0x008a2d2e in ref_maybe_used_by_call_p (call=0x76790630,
>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1147
>> #8  0x008a2dfa in ref_maybe_used_by_stmt_p (stmt=0x76790630,
>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1179
>> #9  0x008bf275 in dse_possible_dead_store_p
>> (stmt=0x7683e820, use_stmt=0x7fffdca8) at
>> ../../src/gcc/tree-ssa-dse.c:212
>> #10 0x008bfeb9 in dse_optimize_stmt (dse_gd=0x7fffddd0,
>> bd=0x156bd30, gsi=...) at ../../src/gcc/tree-ssa-dse.c:297
>> #11 0x008c029d in dse_enter_block (walk_data=0x7fffdde0,
>> bb=0x76a75068) at ../../src/gcc/tree-ssa-dse.c:370
>> #12 0x00cc26a5 in walk_dominator_tree (walk_data=0x7fffdde0,
>> bb=0x76a75068) at ../../src/gcc/domwalk.c:185
>> #13 0x008c0812 in tree_ssa_dse () at
>> ../../src/gcc/tree-ssa-dse.c:430
>> #14 0x0073af0a in execute_one_pass (pass=0x13cced0) at
>> ../../src/gcc/passes.c:1572
>> #15 0x0073b21a in execute_pass_list (pass=0x13cced0) at
>> ../../src/gcc/passes.c:1627
>> #16 0x0073b238 in execute_pass_list (pass=0x1312720) at
>> ../../src/gcc/passes.c:1628
>> #17 0x0086e372 in tree_rest_of_compilation
>> (fndecl=0x76b93500) at ../../src/gcc/tree-optimize.c:413
>> #18 0x009fa7c5 in cgraph_expand_function (node=0x76be7000)
>> at ../../src/gcc/cgraphunit.c:1548
>> #19 0x009faa49 in cgraph_expand_all_functions () at
>> ../../src/gcc/cgraphunit.c:1627
>> #20 0x009fb07e in cgraph_optimize () at
>> ../../src/gcc/cgraphunit.c:1875
>> #21 0x009f9461 in cgraph_finalize_compilation_unit () at
>> ../../src/gcc/cgraphunit.c:1096
>> #22 0x004a9e93 in c_write_global_declarations () at
>> ../../src/gcc/c-decl.c:9519
>> #23 0x008180d4 in compile_file () at ../../src/gcc/toplev.c:1065
>> #24 0x0081a1c5 in do_compile () at ../../src/gcc/toplev.c:2417
>> #25 0x0081a286 in toplev_main (argc=21, argv=0x7fffe0f8) at
>> ../../src/gcc/toplev.c:2459
>> #26 0x00519c6b in main (argc=21, argv=0x7fffe0f8) at
>> ../../src/gcc/main.c:35
>>
>> Thanks,
>>
>> Hongtao
>> Purdue University
>>
>>
>>
>> 
>   



Re: Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Richard Guenther
On Fri, Aug 27, 2010 at 8:24 PM, Hongtao  wrote:
> On 08/27/10 12:35, Richard Guenther wrote:
>> On Fri, Aug 27, 2010 at 5:27 PM, Hongtao  wrote:
>>
>>> Hi all,
>>>
>>> I have instrumented a function call like foo(&a,&b) into the gimple SSA
>>> representation (gcc-4.5) and the consequent optimizations can not pass
>>> my instrumented code. The back traces are as followings. The error
>>> occurred when the pass dse tried to test if the call I inserted may use
>>> a memory reference. It is because the arguments &a is not a SSA_VAR or
>>> INDIRECT_REF, so the assert in function
>>>
>>> bool
>>> refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)
>>>
>>>  gcc_assert ((!ref1->ref
>>>           || SSA_VAR_P (ref1->ref)
>>>           || handled_component_p (ref1->ref)
>>>           || INDIRECT_REF_P (ref1->ref)
>>>           || TREE_CODE (ref1->ref) == TARGET_MEM_REF
>>>           || TREE_CODE (ref1->ref) == CONST_DECL)
>>>          && (!ref2->ref
>>>          || SSA_VAR_P (ref2->ref)
>>>          || handled_component_p (ref2->ref)
>>>          || INDIRECT_REF_P (ref2->ref)
>>>          || TREE_CODE (ref2->ref) == TARGET_MEM_REF
>>>          || TREE_CODE (ref2->ref) == CONST_DECL));
>>> was violated.
>>>
>>> Does anyone know why the function arguments must be a SSA_VAR or
>>> INDIRECT_REF here? Have I missed to perform any actions to maintain the
>>> consistency of Gimple SSA?
>>>
>> Yes.  is_gimple_val () will return false for your arguments as it seems that
>> the variables do not have function invariant addresses.
>>
>> Richard.
>>
>>
> Thanks. But how can I change my argument to gimple_vals, using it with
> an assignment to a temp before and replacing my argument with the temp?

Yes, that will work.

Richard.

> Hongtao
>>> #0  0x76fc8ee0 in exit () from /lib/libc.so.6
>>> #1  0x005ae4ce in diagnostic_action_after_output
>>> (context=0x1323880, diagnostic=0x7fffd870) at
>>> ../../src/gcc/diagnostic.c:198
>>> #2  0x005aed54 in diagnostic_report_diagnostic
>>> (context=0x1323880, diagnostic=0x7fffd870) at
>>> ../../src/gcc/diagnostic.c:424
>>> #3  0x005afdc3 in internal_error (gmsgid=0xddfb57 "in %s, at
>>> %s:%d") at ../../src/gcc/diagnostic.c:709
>>> #4  0x005aff4f in fancy_abort (file=0xe42670
>>> "../../src/gcc/tree-ssa-alias.c", line=786, function=0xe427e0
>>> "refs_may_alias_p_1")
>>>    at ../../src/gcc/diagnostic.c:763
>>> #5  0x008a1adb in refs_may_alias_p_1 (ref1=0x7fffdab0,
>>> ref2=0x7fffdb50, tbaa_p=1 '\001')
>>>    at ../../src/gcc/tree-ssa-alias.c:775
>>> #6  0x008a2b12 in ref_maybe_used_by_call_p_1
>>> (call=0x76790630, ref=0x7fffdb50) at
>>> ../../src/gcc/tree-ssa-alias.c:1133
>>> #7  0x008a2d2e in ref_maybe_used_by_call_p (call=0x76790630,
>>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1147
>>> #8  0x008a2dfa in ref_maybe_used_by_stmt_p (stmt=0x76790630,
>>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1179
>>> #9  0x008bf275 in dse_possible_dead_store_p
>>> (stmt=0x7683e820, use_stmt=0x7fffdca8) at
>>> ../../src/gcc/tree-ssa-dse.c:212
>>> #10 0x008bfeb9 in dse_optimize_stmt (dse_gd=0x7fffddd0,
>>> bd=0x156bd30, gsi=...) at ../../src/gcc/tree-ssa-dse.c:297
>>> #11 0x008c029d in dse_enter_block (walk_data=0x7fffdde0,
>>> bb=0x76a75068) at ../../src/gcc/tree-ssa-dse.c:370
>>> #12 0x00cc26a5 in walk_dominator_tree (walk_data=0x7fffdde0,
>>> bb=0x76a75068) at ../../src/gcc/domwalk.c:185
>>> #13 0x008c0812 in tree_ssa_dse () at
>>> ../../src/gcc/tree-ssa-dse.c:430
>>> #14 0x0073af0a in execute_one_pass (pass=0x13cced0) at
>>> ../../src/gcc/passes.c:1572
>>> #15 0x0073b21a in execute_pass_list (pass=0x13cced0) at
>>> ../../src/gcc/passes.c:1627
>>> #16 0x0073b238 in execute_pass_list (pass=0x1312720) at
>>> ../../src/gcc/passes.c:1628
>>> #17 0x0086e372 in tree_rest_of_compilation
>>> (fndecl=0x76b93500) at ../../src/gcc/tree-optimize.c:413
>>> #18 0x009fa7c5 in cgraph_expand_function (node=0x76be7000)
>>> at ../../src/gcc/cgraphunit.c:1548
>>> #19 0x009faa49 in cgraph_expand_all_functions () at
>>> ../../src/gcc/cgraphunit.c:1627
>>> #20 0x009fb07e in cgraph_optimize () at
>>> ../../src/gcc/cgraphunit.c:1875
>>> #21 0x009f9461 in cgraph_finalize_compilation_unit () at
>>> ../../src/gcc/cgraphunit.c:1096
>>> #22 0x004a9e93 in c_write_global_declarations () at
>>> ../../src/gcc/c-decl.c:9519
>>> #23 0x008180d4 in compile_file () at ../../src/gcc/toplev.c:1065
>>> #24 0x0081a1c5 in do_compile () at ../../src/gcc/toplev.c:2417
>>> #25 0x0081a286 in toplev_main (argc=21, argv=0x7fffe0f8) at
>>> ../../src/gcc/toplev.c:2459
>>> #26 0x00519c6b in main (argc=21, argv=0x7fffe0f8) at
>>> ../../src/gcc/main.c:35
>>>
>>> Thanks,
>>>
>>> Hongtao
>>> Purdue University
>>>
>>>
>>>
>>>
>>
>
>


Re: Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Hongtao
On 08/27/10 14:29, Richard Guenther wrote:
> On Fri, Aug 27, 2010 at 8:24 PM, Hongtao  wrote:
>   
>> On 08/27/10 12:35, Richard Guenther wrote:
>> 
>>> On Fri, Aug 27, 2010 at 5:27 PM, Hongtao  wrote:
>>>
>>>   
>>>> Hi all,
>>>>
>>>> I have instrumented a function call like foo(&a,&b) into the gimple SSA
>>>> representation (gcc-4.5) and the consequent optimizations can not pass
>>>> my instrumented code. The back traces are as followings. The error
>>>> occurred when the pass dse tried to test if the call I inserted may use
>>>> a memory reference. It is because the arguments &a is not a SSA_VAR or
>>>> INDIRECT_REF, so the assert in function
>>>>
>>>> bool
>>>> refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)
>>>>
>>>>  gcc_assert ((!ref1->ref
>>>>   || SSA_VAR_P (ref1->ref)
>>>>   || handled_component_p (ref1->ref)
>>>>   || INDIRECT_REF_P (ref1->ref)
>>>>   || TREE_CODE (ref1->ref) == TARGET_MEM_REF
>>>>   || TREE_CODE (ref1->ref) == CONST_DECL)
>>>>  && (!ref2->ref
>>>>  || SSA_VAR_P (ref2->ref)
>>>>  || handled_component_p (ref2->ref)
>>>>  || INDIRECT_REF_P (ref2->ref)
>>>>  || TREE_CODE (ref2->ref) == TARGET_MEM_REF
>>>>  || TREE_CODE (ref2->ref) == CONST_DECL));
>>>> was violated.
>>>>
>>>> Does anyone know why the function arguments must be a SSA_VAR or
>>>> INDIRECT_REF here? Have I missed to perform any actions to maintain the
>>>> consistency of Gimple SSA?
>>>>
>>>>
>>> Yes.  is_gimple_val () will return false for your arguments as it seems that
>>> the variables do not have function invariant addresses.
>>>
>>> Richard.
>>>
>>>
>>>   
>> Thanks. But how can I change my argument to gimple_vals, using it with
>> an assignment to a temp before and replacing my argument with the temp?
>> 
> Yes, that will work.
>
> Richard.
>
>   
OK. Do we have to rewrite it like this every time we insert a function
call into a Gimple body, if an argument of that call is an expression?

Thanks,
Hongtao

>> Hongtao
>> 
>>>> #0  0x76fc8ee0 in exit () from /lib/libc.so.6
>>>> #1  0x005ae4ce in diagnostic_action_after_output
>>>> (context=0x1323880, diagnostic=0x7fffd870) at
>>>> ../../src/gcc/diagnostic.c:198
>>>> #2  0x005aed54 in diagnostic_report_diagnostic
>>>> (context=0x1323880, diagnostic=0x7fffd870) at
>>>> ../../src/gcc/diagnostic.c:424
>>>> #3  0x005afdc3 in internal_error (gmsgid=0xddfb57 "in %s, at
>>>> %s:%d") at ../../src/gcc/diagnostic.c:709
>>>> #4  0x005aff4f in fancy_abort (file=0xe42670
>>>> "../../src/gcc/tree-ssa-alias.c", line=786, function=0xe427e0
>>>> "refs_may_alias_p_1")
>>>>    at ../../src/gcc/diagnostic.c:763
>>>> #5  0x008a1adb in refs_may_alias_p_1 (ref1=0x7fffdab0,
>>>> ref2=0x7fffdb50, tbaa_p=1 '\001')
>>>>    at ../../src/gcc/tree-ssa-alias.c:775
>>>> #6  0x008a2b12 in ref_maybe_used_by_call_p_1
>>>> (call=0x76790630, ref=0x7fffdb50) at
>>>> ../../src/gcc/tree-ssa-alias.c:1133
>>>> #7  0x008a2d2e in ref_maybe_used_by_call_p (call=0x76790630,
>>>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1147
>>>> #8  0x008a2dfa in ref_maybe_used_by_stmt_p (stmt=0x76790630,
>>>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1179
>>>> #9  0x008bf275 in dse_possible_dead_store_p
>>>> (stmt=0x7683e820, use_stmt=0x7fffdca8) at
>>>> ../../src/gcc/tree-ssa-dse.c:212
>>>> #10 0x008bfeb9 in dse_optimize_stmt (dse_gd=0x7fffddd0,
>>>> bd=0x156bd30, gsi=...) at ../../src/gcc/tree-ssa-dse.c:297
>>>> #11 0x008c029d in dse_enter_block (walk_data=0x7fffdde0,
>>>> bb=0x76a75068) at ../../src/gcc/tree-ssa-dse.c:370
>>>> #12 0x00cc26a5 in walk_dominator_tree (walk_data=0x7fffdde0,
>>>> bb=0x76a75068) at ../../src/gcc/domwalk.c:185
>>>> #13 0x008c0812 in tree_ssa_dse () at
>>>> ../../src/gcc/tree-ssa-dse.c:430
>>>> #14 0x0073af0a in execute_one_pass (pass=0x13cced0) at
>>>> ../../src/gcc/passes.c:1572
>>>> #15 0x0073b21a in execute_pass_list (pass=0x13cced0) at
>>>> ../../src/gcc/passes.c:1627
>>>> #16 0x0073b238 in execute_pass_list (pass=0x1312720) at
>>>> ../../src/gcc/passes.c:1628
>>>> #17 0x0086e372 in tree_rest_of_compilation
>>>> (fndecl=0x76b93500) at ../../src/gcc/tree-optimize.c:413
>>>> #18 0x009fa7c5 in cgraph_expand_function (node=0x76be7000)
>>>> at ../../src/gcc/cgraphunit.c:1548
>>>> #19 0x009faa49 in cgraph_expand_all_functions () at
>>>> ../../src/gcc/cgraphunit.c:1627
>>>> #20 0x009fb07e in cgraph_optimize () at
>>>> ../../src/gcc/cgraphunit.c:1875
>>>> #21 0x009f9461 in cgraph_finalize_compilation_unit () at
>>>> ../../src/gcc/cgraphunit.c:1096
>>>> #22 0x004a9e93 in c_write_global_declarations () at
>>>> ../../src/gcc/c-decl.c:9519
>>>> #23 0x008180d4 in compile_file () at ../../src/gcc/toplev.c

Re: Errors when invoking refs_may_alias_p_1

2010-08-27 Thread Richard Guenther
On Fri, Aug 27, 2010 at 8:37 PM, Hongtao  wrote:
> On 08/27/10 14:29, Richard Guenther wrote:
>> On Fri, Aug 27, 2010 at 8:24 PM, Hongtao  wrote:
>>
>>> On 08/27/10 12:35, Richard Guenther wrote:
>>>
>>>> On Fri, Aug 27, 2010 at 5:27 PM, Hongtao  wrote:
>>>>
>>>>
>>>>> Hi all,
>>>>>
>>>>> I have instrumented a function call like foo(&a,&b) into the gimple SSA
>>>>> representation (gcc-4.5) and the consequent optimizations can not pass
>>>>> my instrumented code. The back traces are as followings. The error
>>>>> occurred when the pass dse tried to test if the call I inserted may use
>>>>> a memory reference. It is because the arguments &a is not a SSA_VAR or
>>>>> INDIRECT_REF, so the assert in function
>>>>>
>>>>> bool
>>>>> refs_may_alias_p_1 (ao_ref *ref1, ao_ref *ref2, bool tbaa_p)
>>>>>
>>>>>  gcc_assert ((!ref1->ref
>>>>>           || SSA_VAR_P (ref1->ref)
>>>>>           || handled_component_p (ref1->ref)
>>>>>           || INDIRECT_REF_P (ref1->ref)
>>>>>           || TREE_CODE (ref1->ref) == TARGET_MEM_REF
>>>>>           || TREE_CODE (ref1->ref) == CONST_DECL)
>>>>>          && (!ref2->ref
>>>>>          || SSA_VAR_P (ref2->ref)
>>>>>          || handled_component_p (ref2->ref)
>>>>>          || INDIRECT_REF_P (ref2->ref)
>>>>>          || TREE_CODE (ref2->ref) == TARGET_MEM_REF
>>>>>          || TREE_CODE (ref2->ref) == CONST_DECL));
>>>>> was violated.
>>>>>
>>>>> Does anyone know why the function arguments must be a SSA_VAR or
>>>>> INDIRECT_REF here? Have I missed to perform any actions to maintain the
>>>>> consistency of Gimple SSA?
>>>>>
>>>>>
>>>> Yes.  is_gimple_val () will return false for your arguments as it seems that
>>>> the variables do not have function invariant addresses.
>>>>
>>>> Richard.
>>>>
>>>>
>>>>
>>> Thanks. But how can I change my argument to gimple_vals, using it with
>>> an assignment to a temp before and replacing my argument with the temp?
>>>
>> Yes, that will work.
>>
>> Richard.
>>
>>
> OK. Do we have to rewrite it like this everytime we insert a function
> call on Gimple body if the argument of that call is an expression?

If it isn't is_gimple_reg_type (TREE_TYPE (arg)) ? is_gimple_val (arg)
: is_gimple_lvalue (arg), then yes.  See gimplify_arg in gimplify.c.

Richard.

> Thanks,
> Hongtao
>
>>> Hongtao
>>>
>>>>> #0  0x76fc8ee0 in exit () from /lib/libc.so.6
>>>>> #1  0x005ae4ce in diagnostic_action_after_output
>>>>> (context=0x1323880, diagnostic=0x7fffd870) at
>>>>> ../../src/gcc/diagnostic.c:198
>>>>> #2  0x005aed54 in diagnostic_report_diagnostic
>>>>> (context=0x1323880, diagnostic=0x7fffd870) at
>>>>> ../../src/gcc/diagnostic.c:424
>>>>> #3  0x005afdc3 in internal_error (gmsgid=0xddfb57 "in %s, at
>>>>> %s:%d") at ../../src/gcc/diagnostic.c:709
>>>>> #4  0x005aff4f in fancy_abort (file=0xe42670
>>>>> "../../src/gcc/tree-ssa-alias.c", line=786, function=0xe427e0
>>>>> "refs_may_alias_p_1")
>>>>>    at ../../src/gcc/diagnostic.c:763
>>>>> #5  0x008a1adb in refs_may_alias_p_1 (ref1=0x7fffdab0,
>>>>> ref2=0x7fffdb50, tbaa_p=1 '\001')
>>>>>    at ../../src/gcc/tree-ssa-alias.c:775
>>>>> #6  0x008a2b12 in ref_maybe_used_by_call_p_1
>>>>> (call=0x76790630, ref=0x7fffdb50) at
>>>>> ../../src/gcc/tree-ssa-alias.c:1133
>>>>> #7  0x008a2d2e in ref_maybe_used_by_call_p (call=0x76790630,
>>>>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1147
>>>>> #8  0x008a2dfa in ref_maybe_used_by_stmt_p (stmt=0x76790630,
>>>>> ref=0x76848048) at ../../src/gcc/tree-ssa-alias.c:1179
>>>>> #9  0x008bf275 in dse_possible_dead_store_p
>>>>> (stmt=0x7683e820, use_stmt=0x7fffdca8) at
>>>>> ../../src/gcc/tree-ssa-dse.c:212
>>>>> #10 0x008bfeb9 in dse_optimize_stmt (dse_gd=0x7fffddd0,
>>>>> bd=0x156bd30, gsi=...) at ../../src/gcc/tree-ssa-dse.c:297
>>>>> #11 0x008c029d in dse_enter_block (walk_data=0x7fffdde0,
>>>>> bb=0x76a75068) at ../../src/gcc/tree-ssa-dse.c:370
>>>>> #12 0x00cc26a5 in walk_dominator_tree (walk_data=0x7fffdde0,
>>>>> bb=0x76a75068) at ../../src/gcc/domwalk.c:185
>>>>> #13 0x008c0812 in tree_ssa_dse () at
>>>>> ../../src/gcc/tree-ssa-dse.c:430
>>>>> #14 0x0073af0a in execute_one_pass (pass=0x13cced0) at
>>>>> ../../src/gcc/passes.c:1572
>>>>> #15 0x0073b21a in execute_pass_list (pass=0x13cced0) at
>>>>> ../../src/gcc/passes.c:1627
>>>>> #16 0x0073b238 in execute_pass_list (pass=0x1312720) at
>>>>> ../../src/gcc/passes.c:1628
>>>>> #17 0x0086e372 in tree_rest_of_compilation
>>>>> (fndecl=0x76b93500) at ../../src/gcc/tree-optimize.c:413
>>>>> #18 0x009fa7c5 in cgraph_expand_function (node=0x76be7000)
>>>>> at ../../src/gcc/cgraphunit.c:1548
>>>>> #19 0x009faa49 in cgraph_expand_all_functions () at
>>>>> ../../src/gcc/cgraphunit.c:1627
>>>>> #20 0x009fb07e in cgraph_optimize () at
>>>>> ../../src/gcc/cgraphunit.c:1875
>

Re: Clustering switch cases

2010-08-27 Thread Xinliang David Li
Another major thing missing is the use of profile information (if
available), so that the most frequent cases can be peeled out.

David

On Fri, Aug 27, 2010 at 8:03 AM, Richard Guenther
 wrote:
> On Fri, Aug 27, 2010 at 4:47 PM, Ian Lance Taylor  wrote:
>> "Paulo J. Matos"  writes:
>>
>>> In the first case, it generates a binary tree, and in the second two
>>> jump tables. The jump tables solution is much more elegant (at least
>>> in our situation), generating less code and being faster.
>>> Now, what I am wondering is why GCC doesn't try to cluster
>>> the cases, looking for clusters of contiguous values in the
>>> switch.
>>>
>>> If there is no specific reason then I would implement such a pass, which
>>> would split switches according to value clustering before expansion,
>>> since I think it would be a good code improvement.
>>>
>>> Currently GCC seems to only use a jump table if the range of the switch
>>> is not much bigger than its count, which works well in most cases
>>> except when you have big switches with clusters of contiguous values
>>> (like the first example I sent).
>>
>> I don't know of any specific reason not to look for clusters of switch
>> cases.  The main issue would be the effect on compilation time.  If you
>> can do it with an algorithm which is linear in the number of cases, then
>> I think it would be an acceptable optimization.
>
> In fact we might want to move switch optimization up to the tree level
> (just because it's way easier to deal with there).  Thus, lower switch
> to a mixture of binary tree & jump-tables (possibly using perfect
> hashing).
>
> Richard.
>


Re: Better performance on older version of GCC

2010-08-27 Thread Xinliang David Li
I briefly looked at it -- trunk gcc also regresses a lot compared to
the binary you attached. (To match your binary, I also added the
-mfpmath=387 -m32 options.)

Two problems:

1) more register spills in the trunk version -- the old compiler seems
more effective in using fp stack registers;
2) the complex multiplication -- the old version emits an inline sequence
while the trunk version emits a call to the __muldc3 libgcc routine.

You can probably file a bug report on this.

Thanks,

David

On Fri, Aug 27, 2010 at 8:39 AM, Corey Kasten
 wrote:
> On Fri, 2010-08-27 at 17:09 +0200, Richard Guenther wrote:
>> On Fri, Aug 27, 2010 at 5:02 PM, Corey Kasten
>>  wrote:
>> > On Fri, 2010-08-27 at 06:50 -0700, Nathan Froyd wrote:
>> >> On Fri, Aug 27, 2010 at 09:44:25AM -0400, Corey Kasten wrote:
>> >> > I find that the executable compiled on system A runs faster (on both
>> >> > systems) than the executable compiled on system B, by a factor of
>> >> > approximately 4. I have attempted to play with the
>> >> > GCC optimizer flags and have not been able to get System B (with the
>> >> > later GCC version) to compile code with any better performance. Could
>> >> > someone please help figure this out?
>> >>
>> >> It's almost impossible to tell what's going on without an actual
>> >> testcase.  You might not be able to provide the actual code, but you
>> >> could try distilling it down to something you could release.
>> >>
>> >> -Nathan
>> >
>> > Thanks for the reply Nathan.
>> >
>> > I have attached an archive with the test case code. The code is built by
>> > build.sh and outputs the number of microseconds to complete the
>> > processing.
>> >
>> > Compiling with GCC version "4.1.2 20070925 (Red Hat 4.1.2-33)" produces
>> > code that runs in about 66% of the time taken by code from GCC version
>> > "4.3.0 20080428 (Red Hat 4.3.0-8)".
>>
>> -fcx-limited-range or -fcx-fortran-rules.  4.3 now is more conforming than 4.1.
>>
>> Richard.
>>
>> > Thanks
>> >
>> > Corey
>> >
>
> Richard,
>
> -fcx-limited-range worked great on both my real benchmark and my test
> archive. GCC didn't recognize -fcx-fortran-rules, but obviously I don't
> need it.
>
> Thanks so much,
> Corey
>
>
>
>
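As a rough illustration of what -fcx-limited-range trades away: with the
flag, the compiler may use the textbook formula for complex multiplication
and skip the check for a NaN + i*NaN result that C99 Annex G otherwise asks
for.  The sketch below is not GCC's or libgcc's code; struct cparts,
mul_limited_range and limited_range_demo are hypothetical names, and the
parts are passed as plain doubles to keep the arithmetic visible.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper type: real and imaginary parts kept separate.  */
struct cparts
{
  double re;
  double im;
};

/* (a + ib) * (c + id) by the textbook formula, with no check for a
   NaN + i*NaN result.  */
struct cparts
mul_limited_range (double a, double b, double c, double d)
{
  struct cparts r;
  r.re = a * c - b * d;
  r.im = a * d + b * c;
  return r;
}

/* Small self-check showing where the textbook formula gives up.  */
void
limited_range_demo (void)
{
  /* Finite operands: (1 + 2i) * (3 + 4i) = -5 + 10i.  */
  struct cparts r = mul_limited_range (1.0, 2.0, 3.0, 4.0);
  assert (r.re == -5.0 && r.im == 10.0);

  /* (inf + i*inf) * (1 + 0i): inf * 0 is NaN, so both parts become NaN.  */
  r = mul_limited_range (INFINITY, INFINITY, 1.0, 0.0);
  assert (isnan (r.re) && isnan (r.im));
}
```

Annex G recommends that the second product be an infinity rather than
NaN + i*NaN, and recovering that case costs extra checks.  For code whose
operands stay finite and well within range, as appears to be the case in
Corey's benchmark, the shortcut gives the same results.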


Re: Better performance on older version of GCC

2010-08-27 Thread Andrew Pinski
On Fri, Aug 27, 2010 at 5:12 PM, Xinliang David Li  wrote:
> Briefly looked at it -- the trunk gcc also regresses a lot compared to
> the binary you attached. (To match your binary, also added
> -mfpmath=387 -m32 options)
>
> Two problems:
>
> 1) more register spills in the trunk version -- the old compiler seems
> more effective in using fp stack registers;
> 2) the complex multiplication -- the old version emits an inline sequence
> while the trunk version emits a call to the __muldc3 intrinsic.

Neither of these seems like a real reportable bug.  The first one
is due to -fexcess-precision=standard being the default in 4.5 and
above (see PR 323).  The second one is due to -fcx-limited-range no
longer being the default (I cannot remember the bug number which changed
that, though).

Thanks,
Andrew Pinski


Re: Better performance on older version of GCC

2010-08-27 Thread Xinliang David Li
Right -- I missed Richard's previous email regarding the options.

Thanks,

David

On Fri, Aug 27, 2010 at 5:21 PM, Andrew Pinski  wrote:
> On Fri, Aug 27, 2010 at 5:12 PM, Xinliang David Li  wrote:
>> Briefly looked at it -- the trunk gcc also regresses a lot compared to
>> the binary you attached. (To match your binary, also added
>> -mfpmath=387 -m32 options)
>>
>> Two problems:
>>
>> 1) more register spills in the trunk version -- the old compiler seems
>> more effective in using fp stack registers;
>> 2) the complex multiplication -- the old version emits an inline sequence
>> while the trunk version emits a call to the __muldc3 intrinsic.
>
> Neither of these seems like a real reportable bug.  The first one
> is due to -fexcess-precision=standard being the default in 4.5 and
> above (see PR 323).  The second one is due to -fcx-limited-range no
> longer being the default (I cannot remember the bug number which changed
> that, though).
>
> Thanks,
> Andrew Pinski
>