Re: Inner loop unable to compute sufficient information during vectorization

2009-05-26 Thread Ira Rosen


gcc-ow...@gcc.gnu.org wrote on 25/05/2009 21:53:41:

> for a loop like
>
> for(i=0;i...
>   for(j=0;j...
>     a[i][j] = a[i][j]+b[i][j];
>
> GCC 4.3.* is unable to determine, for the inner loop, that the
> references to array 'a' alias each other, and it generates code for a
> runtime aliasing check during vectorization.

Both current trunk and GCC4.4 vectorize the inner loop without any runtime
alias checks.
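
A minimal sketch of a loop nest of the kind under discussion follows, for anyone who wants to compare compiler versions. The loop bounds in the quoted message did not survive, so the dimensions ROWS and COLS, the float element type, the use of global arrays, and the name add_in_place are all assumptions made for illustration, not the original test case.

```c
#include <stddef.h>

/* Assumed dimensions; the bounds used in the original report are unknown. */
#define ROWS 64
#define COLS 64

/* Assumption: global arrays, so a and b are visibly distinct objects. */
float a[ROWS][COLS];
float b[ROWS][COLS];

/* The inner loop reads and writes a[i][j] at the same index, so the only
   dependence on a is within a single iteration. */
void add_in_place(void)
{
    size_t i, j;

    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            a[i][j] = a[i][j] + b[i][j];
}
```

Compiling a file like this with -O2 -ftree-vectorize and inspecting the vectorizer dump should show whether a given GCC version versions the inner loop for aliasing.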

> Is it necessary to
> recompute all the information in loop_vec_info in function
> vect_analyze_ref for the analysis of the inner loop as well, given that
> most of it is similar to the information computed for the outer loop?

Maybe you are right, and it is possible to extract at least part of the
information for the inner loop from the outer loop information.

>
> Similarly, the outer loop is able to compute the correct chrec, i.e. NULL,
> for the reference to array 'a', while the inner loop has the chrec
> chrec_dont_know and therefore complains about a runtime alias check.

The chrecs are not the same for inner and outer loops, so it is reasonable
that the results of the data dependence tests will be different.
In this case, however, it seems to be a bug.

Ira





March=native with a main 64bit system and 32bit chroot

2009-05-26 Thread Luca Zorzo
Hi all,
I've a main Gentoo 64bit system with CHOST="x86_64-pc-linux-gnu" and
CFLAGS="-march=native -mtune=native -O2 -pipe -fomit-frame-pointer".
My CPU is a Pentium 4 Prescott and I'm using gcc-4.3.2.

With this little script:
"echo 'float x(float x){return x < 0 ? -x : x;}' > x.c && gcc
-fverbose-asm -mtune=native -march=native -S x.c && grep
'\(-march\|-mtune\)' x.s && rm -fr x.c && rm -fr x.s "
I found that gcc uses march=nocona and mtune=nocona, which is fine.

On this main system I also have some 32-bit chroots with
CHOST="i686-pc-linux-gnu" and CFLAGS="-march=native -mtune=native -O2
-pipe -fomit-frame-pointer".
But there gcc again uses march=nocona and mtune=nocona, which
I think is wrong.

Should I use march=native or march=prescott (which is what I was using
with old gcc versions) for my chroots?

If march=prescott is the right choice, I think that gcc's "native"
detection should also take $CHOST into account.


Intermediate representation

2009-05-26 Thread Nicolas COLLIN

Hello again,

I'm still working on egcs 1.1, and the function cp_namespace_decls is
not implemented in it.
I just want to get the classes and functions defined in my source
code. I tried to get them with the function gettags, but I think I
misunderstood something. I also tried to read about the
bindings in the file decl.c, but I could not make sense of it. What does it do?
Do I have to use it to get what I want?


Thanks.

Nicolas COLLIN


Intermediate representation

2009-05-26 Thread Nicolas COLLIN

Hello again,

To answer your question, my code's purpose is to write a kind of tree to
a file. The main hierarchy is the classes, their methods (with
parameters, return type, ...), their attributes, etc. It will also
recognize some new keywords that I will introduce via "__attribute__".

Given this, where is the best place to put my code?

Thanks a lot; I would not have made any progress without your help.

Nicolas COLLIN

Dave Korn wrote:

> Bear in mind that global_namespace only exists in the C++ compiler
> 'cc1plus', so if you access it directly in toplev.c, the plain C compiler
> 'cc1' will fail to build with a link error.  This might or might not matter to
> you for your purposes, but it's better for that reason to keep any code that
> needs to understand about anything to do with C++ in the /cp/ subdirectory.
>
> Without knowing what your code does, I can't say where is the best place to
> put it.  The flow of control is that compile_file in gcc/toplev.c calls to the
> language-specific parser yyparse, which then calls into hooks in the
> language-specific files in gcc/ or gcc/cp/, ending eventually in finish_decl
> in either gcc/c-decl.c or gcc/cp/decl.c, where it calls rest_of_compilation to
> hand off the tree representation to the mid/backend for translation to
> assembler code.
>
> So, the C++ specific finish_decl would be one good place to add code that
> needs to analyse the trees.  Bear in mind that by the time you see them there,
> some elementary optimisations like constant folding may already have been
> performed, so you might not see what exactly reflects the form of the original
> sources.
>
>
>    cheers,
>  DaveK




Re: March=native with a main 64bit system and 32bit chroot

2009-05-26 Thread Ian Lance Taylor
Luca Zorzo  writes:

> I've a main Gentoo 64bit system with CHOST="x86_64-pc-linux-gnu" and
> CFLAGS="-march=native -mtune=native -O2 -pipe -fomit-frame-pointer".
> My CPU is a Pentium 4 Prescott and I'm using gcc-4.3.2.
>
> With this little script:
> "echo 'float x(float x){return x < 0 ? -x : x;}' > x.c && gcc
> -fverbose-asm -mtune=native -march=native -S x.c && grep
> '\(-march\|-mtune\)' x.s && rm -fr x.c && rm -fr x.s "
> I found that gcc uses march=nocona and mtune=nocona, which is fine.
>
> On this main system I also have some 32-bit chroots with
> CHOST="i686-pc-linux-gnu" and CFLAGS="-march=native -mtune=native -O2
> -pipe -fomit-frame-pointer".
> But there gcc again uses march=nocona and mtune=nocona, which
> I think is wrong.
>
> Should I use march=native or march=prescott (which is what I was using
> with old gcc versions) for my chroots?
>
> If march=prescott is the right choice, I think that gcc's "native"
> detection should also take $CHOST into account.

This question is appropriate for the mailing list gcc-h...@gcc.gnu.org,
not the mailing list g...@gcc.gnu.org.  Please take any followups to
gcc-help.  Thanks.

Your CPU is the same whether you have a chroot or not, and whether you
are running in 32-bit mode or in 64-bit mode, so it is correct for gcc
to handle -mtune=native -march=native the same way in both cases.
Naturally the -march option will be affected by whether you are in
32-bit mode or not.  As far as I know the -mtune option doesn't make any
difference.
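
One way to see the distinction between "same CPU" and "different mode" is to ask the compiler which target it is generating code for. The sketch below relies on the predefined macros __x86_64__ and __i386__; the function names target_mode, target_pointer_bits and print_target are hypothetical and chosen only for this illustration.

```c
#include <limits.h>
#include <stdio.h>

/* Width, in bits, of a pointer on the target being compiled for. */
int target_pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Names the x86 mode selected at compile time, based on predefined macros. */
const char *target_mode(void)
{
#if defined(__x86_64__)
    return "x86-64 (64-bit mode)";
#elif defined(__i386__)
    return "i386 (32-bit mode)";
#else
    return "not an x86 target";
#endif
}

/* Prints one line describing the compile-time target. */
void print_target(void)
{
    printf("%s, %d-bit pointers\n", target_mode(), target_pointer_bits());
}
```

Built once on the 64-bit host and once inside the 32-bit chroot with the same -march=native, the two builds would be expected to report different modes even though the detected CPU is identical.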

It's reasonable to ask whether -mtune=nocona gives the best results when
using a Prescott.  Right now -mtune=prescott and -mtune=nocona are
handled identically in any case.  I don't know if there is room for
improvement there or not.

Ian


Re: Intermediate representation

2009-05-26 Thread Dave Korn
Nicolas COLLIN wrote:
> Hello again,
> 
> I'm still working on egcs 1.1, and the function cp_namespace_decls is
> not implemented in it.

  Well, the definition is very simple:

tree
cp_namespace_decls (tree ns)
{
  return NAMESPACE_LEVEL (ns)->names;
}

and NAMESPACE_LEVEL exists in egcs-1.1, so why not try back-porting it?
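
To illustrate what such an accessor gives you, here is a self-contained sketch that uses stand-in types rather than GCC's own. The structs decl_node, binding_level and namespace_node and the functions namespace_decls and count_decls are hypothetical; they only mimic the shape of a binding level holding a chained list of declarations and are not GCC definitions.

```c
#include <stddef.h>

/* Stand-in for a declaration node; GCC's real tree type is far richer. */
struct decl_node {
    const char *name;
    struct decl_node *chain;   /* next declaration in the same scope */
};

/* Stand-in for a binding level, holding the head of the chain. */
struct binding_level {
    struct decl_node *names;
};

/* Stand-in for a namespace that owns a binding level. */
struct namespace_node {
    struct binding_level *level;
};

/* Same shape as the accessor quoted above: return the head of the list. */
struct decl_node *namespace_decls(struct namespace_node *ns)
{
    return ns->level->names;
}

/* Walk the chain and count the declarations in the namespace. */
size_t count_decls(struct namespace_node *ns)
{
    size_t n = 0;
    struct decl_node *d;

    for (d = namespace_decls(ns); d != NULL; d = d->chain)
        n++;
    return n;
}
```

In a real back-port the walk would go over GCC's own chained tree nodes instead of these stand-ins, so treat this only as a picture of the traversal.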

cheers,
  DaveK


gcc-4.4-20090526 is now available

2009-05-26 Thread gccadmin
Snapshot gcc-4.4-20090526 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20090526/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch 
revision 147883

You'll find:

gcc-4.4-20090526.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.4-20090526.tar.bz2       C front end and core compiler
gcc-ada-4.4-20090526.tar.bz2        Ada front end and runtime
gcc-fortran-4.4-20090526.tar.bz2    Fortran front end and runtime
gcc-g++-4.4-20090526.tar.bz2        C++ front end and runtime
gcc-java-4.4-20090526.tar.bz2       Java front end and runtime
gcc-objc-4.4-20090526.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.4-20090526.tar.bz2  The GCC testsuite

Diffs from 4.4-20090519 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Seeking suggestion

2009-05-26 Thread Jim Wilson

Jamie Prescott wrote:

> Is there a reason why something like this would not work?
> if (!TARGET_XXX2)
>   emit_clobber(gen_rtx_REG(CCmode, CC_REGNUM));
> emit_insn(gen_addsi3_nc(operands[0], operands[1], operands[2]));


Yes.  The optimizer will not know that addsi3_nc uses CC_REGNUM, as it 
is not mentioned, so the optimizer will not know that these two RTL 
instructions always need to remain next to each other.  Any optimization 
pass that moves insns around may separate the add from the clobber 
resulting in broken code.  This is what parallels are for, to make sure 
that the clobber and add stay together.


Jim


4.4: march-native gives -mno-sse4, but cpuinfo sse4_1

2009-05-26 Thread sean darcy
If I run gcc -fverbose-asm -mtune=native -march=native -S x.c

I get
cat x.s:
.file   "x.c"
# GNU C (GCC) version 4.4.0 20090506 (Red Hat 4.4.0-4) (x86_64-redhat-linux)
#   compiled by GNU C version 4.4.0 20090506 (Red Hat 4.4.0-4), GMP
version 4.2.4, MPFR version 2.4.1.
# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed:  x.c -march=core2 -mcx16 -msahf --param l1-cache-size=32
# --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=core2
# -fverbose-asm
# options enabled:  -falign-loops -fargument-alias
...
# -mfp-ret-in-387 -mfused-madd -mglibc -mieee-fp -mmmx -mno-sse4
# -mpush-args -mred-zone -msahf -msse -msse2 -msse3 -mssse3


cat /proc/cpuinfo:

flags   : .sse sse2  ssse3  sse4_1 ...

Is this a bug in the march-native code, or is there a reason sse4.1 is
not enabled (this cpu does not support sse4.2)?

sean


Re: 4.4: march-native gives -mno-sse4, but cpuinfo sse4_1

2009-05-26 Thread Ian Lance Taylor
sean darcy  writes:

> If I run gcc -fverbose-asm -mtune=native -march=native -S x.c
>
> I get
> cat x.s:
>   .file   "x.c"
> # GNU C (GCC) version 4.4.0 20090506 (Red Hat 4.4.0-4) (x86_64-redhat-linux)
> # compiled by GNU C version 4.4.0 20090506 (Red Hat 4.4.0-4), GMP
> version 4.2.4, MPFR version 2.4.1.
> # GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
> # options passed:  x.c -march=core2 -mcx16 -msahf --param l1-cache-size=32
> # --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=core2
> # -fverbose-asm
> # options enabled:  -falign-loops -fargument-alias
> ...
> # -mfp-ret-in-387 -mfused-madd -mglibc -mieee-fp -mmmx -mno-sse4
> # -mpush-args -mred-zone -msahf -msse -msse2 -msse3 -mssse3
> 
>
> cat /proc/cpuinfo:
>
> flags : .sse sse2  ssse3  sse4_1 ...
>
> Is this a bug in the march-native code, or is there a reason sse4.1 is
> not enabled (this cpu does not support sse4.2)?

This question is more appropriate for the gcc-h...@gcc.gnu.org mailing
list.  Please take any followups to gcc-help.  Thanks.

In gcc, -msse4 implies both SSE4.1 and SSE4.2.  Since your processor
does not support both, -mno-sse4 is in effect.
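
The "both or neither" rule can be checked from C by testing the predefined macros __SSE4_1__ and __SSE4_2__, which the compiler defines when the corresponding instruction sets are enabled. The helper names have_sse4_1, have_sse4_2 and have_sse4 below are hypothetical and exist only for this sketch.

```c
/* 1 if the compiler was told it may emit SSE4.1 instructions, else 0. */
int have_sse4_1(void)
{
#ifdef __SSE4_1__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the compiler was told it may emit SSE4.2 instructions, else 0. */
int have_sse4_2(void)
{
#ifdef __SSE4_2__
    return 1;
#else
    return 0;
#endif
}

/* Mirrors the rule described above: "sse4" counts only if both are on. */
int have_sse4(void)
{
    return have_sse4_1() && have_sse4_2();
}
```

Compiled with -msse4.1 alone, have_sse4_1() would be expected to return 1 while have_sse4() stays 0, which matches the -mno-sse4 line in the verbose assembly output.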

Actually, at present gcc does not consider any processor to support
SSE4.1 or SSE4.2 by default.  You always have to select them explicitly.
This seems to be a bug.  I would encourage you to open a bug report, if
there isn't already one about this; see http://gcc.gnu.org/bugs.html .
Thanks.

Ian