Re: Adding Leon processor to the SPARC list of processors

2010-11-23 Thread Eric Botcazou
> Following the recent comments by Eric, the patch now sketches the
> following setup:
>
> If multi-lib is wanted:
>  configure --with-cpu=leon ... : creates multilib-dir soft|v8
> combinations using [-msoft-float|-mcpu=sparcleonv8] (MULTILIB_OPTIONS =
> msoft-float mcpu=sparcleonv8)
>
> If Single-lib is wanted:
>  configure --with-cpu=sparcleonv7 --with-float=soft --disable-multilib ... 
> : (v7 | soft | no-multilib) configure --with-cpu=sparcleonv8
> --with-float=soft --disable-multilib ...  : (v8 | soft | no-multilib)
> configure --with-cpu=sparcleonv7 --with-float=hard --disable-multilib ... 
> : (v7 | hard | no-multilib) configure --with-cpu=sparcleonv8
> --with-float=hard --disable-multilib ...  : (v8 | hard | no-multilib)
>
> Using --with-cpu=leon|sparcleonv7|sparcleonv8 the sparc_cpu is switched
> to PROCESSOR_LEON.

I'm mostly OK, but I don't think we need sparcleonv7 or sparcleonv8.  Attached 
is another proposal, which:

 1. Adds -mtune/--with-tune=leon for all SPARC targets.  In particular, this
means that if you configure --target=sparc-{elf,rtems} --with-tune=leon, you
get a multilib-ed compiler defaulting to V7/FPU and -mtune=leon, with V8 and 
NO-FPU libraries.

 2. Adds new targets sparc-leon-{elf,linux}: multilib-ed compiler defaulting
to V8/FPU and -mtune=leon, with V7 and NO-FPU libraries.

 3. Adds new targets sparc-leon3-{elf,linux}: multilib-ed compiler defaulting 
to V8/FPU and -mtune=leon, with NO-FPU libraries.

Singlelib-ed compilers are available through --disable-multilib and
  --with-cpu={v7,v8} --with-float={soft,hard} --with-tune=leon
for sparc-{elf,rtems} or just
  --with-cpu={v7,v8} --with-float={soft,hard}
for sparc-leon*-*.

The rationale is that --with-cpu shouldn't change the set of multilibs, it is 
only the configure-time equivalent of -mcpu.  This set of multilibs should 
only depend on the target and the presence of --disable-multilib.


* config.gcc (sparc-*-elf*): Deal with sparc-leon specifically.
(sparc-*-linux*): Likewise.
(sparc*-*-*): Remove obsolete sparc86x setting.
(sparc-leon*): Default to --with-cpu=v8 and --with-tune=leon.
* doc/invoke.texi (SPARC Options): Document -mcpu/-mtune=leon.
* config/sparc/sparc.h (TARGET_CPU_leon): Define.
(TARGET_CPU_sparc86x): Delete.
(TARGET_CPU_cypress): Define as alias to TARGET_CPU_v7.
(TARGET_CPU_f930): Define as alias to TARGET_CPU_sparclite.
(TARGET_CPU_f934): Likewise.
(TARGET_CPU_tsc701): Define as alias to TARGET_CPU_sparclet.
(CPP_CPU_SPEC): Add entry for -mcpu=leon.
(enum processor_type): Add PROCESSOR_LEON.
* config/sparc/sparc.c (leon_costs): New cost array.
(sparc_option_override): Add entry for TARGET_CPU_leon and -mcpu=leon.
Initialize cost array to leon_costs if -mtune=leon.
* config/sparc/sparc.md (cpu attribute): Add leon.
Include leon.md scheduling description.
* config/sparc/leon.md: New file.
* config/sparc/t-elf: Do not assemble Solaris startup files.
* config/sparc/t-leon: New file.
* config/sparc/t-leon3: Likewise.


-- 
Eric Botcazou
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 167022)
+++ doc/invoke.texi	(working copy)
@@ -16917,8 +16917,8 @@ the rules of the a...@.
 @opindex mcpu
 Set the instruction set, register set, and instruction scheduling parameters
 for machine type @var{cpu_type}.  Supported values for @var{cpu_type} are
-@samp{v7}, @samp{cypress}, @samp{v8}, @samp{supersparc}, @samp{sparclite},
-@samp{f930}, @samp{f934}, @samp{hypersparc}, @samp{sparclite86x},
+@samp{v7}, @samp{cypress}, @samp{v8}, @samp{supersparc}, @samp{hypersparc},
+@samp{leon}, @samp{sparclite}, @samp{f930}, @samp{f934}, @samp{sparclite86x},
 @samp{sparclet}, @samp{tsc701}, @samp{v9}, @samp{ultrasparc},
 @samp{ultrasparc3}, @samp{niagara} and @samp{niagara2}.
 
@@ -16931,7 +16931,7 @@ implementations.
 
 @smallexample
 v7: cypress
-v8: supersparc, hypersparc
+v8: supersparc, hypersparc, leon
 sparclite:  f930, f934, sparclite86x
 sparclet:   tsc701
 v9: ultrasparc, ultrasparc3, niagara, niagara2
@@ -16984,9 +16984,9 @@ option @option{-mcpu=@var{cpu_type}} wou
 The same values for @option{-mcpu=@var{cpu_type}} can be used for
 @option{-mtune=@var{cpu_type}}, but the only useful values are those
 that select a particular cpu implementation.  Those are @samp{cypress},
-@samp{supersparc}, @samp{hypersparc}, @samp{f930}, @samp{f934},
-@samp{sparclite86x}, @samp{tsc701}, @samp{ultrasparc},
-@samp{ultrasparc3}, @samp{niagara}, and @samp{niagara2}.
+@samp{supersparc}, @samp{hypersparc}, @samp{leon}, @samp{f930}, @samp{f934},
+@samp{sparclite86x}, @samp{tsc701}, @samp{ultrasparc}, @samp{ultrasparc3},
+@samp{niagara}, and @samp{niagara2}.
 
 @item -mv8plu

Loop-iv.c ICEs on subregs

2010-11-23 Thread Maxim Kuvyrkov
Zdenek,

I'm investigating an ICE in loop-iv.c:get_biv_step().  I hope you can shed some 
light on what the correct fix would be.

The ICE happens when processing:
==
(insn 111 (set (reg:SI 304)
   (plus (subreg:SI (reg:DI 251) 4)
         (const_int 1))))

(insn 177 (set (subreg:SI (reg:DI 251))
   (reg:SI 304)))
==

The code like the above does not occur on current mainline early enough for 
loop-iv.c to catch it.  The subregs above are produced by Tom's (CC'ed) 
extension elimination pass (scheduled before fwprop1) which is not in mainline 
yet [*].

The failure is due to assert in loop-iv.c:get_biv_step():
==
gcc_assert ((*inner_mode == *outer_mode) != (*extend != UNKNOWN));
==
i.e., inner and outer modes can differ iff there's an extend in the chain. 

Get_biv_step_1() starts with insn 177, then gets to insn 111, then loops back 
to insn 177 at which point it stops and returns GRD_MAYBE_BIV and sets:

* outer_mode == DImode == natural mode of (reg A);

* inner_mode == SImode == mode of (subreg (reg A)), set in get_biv_step_1:
==
  if (GET_CODE (next) == SUBREG)
    {
      enum machine_mode amode = GET_MODE (next);

      if (GET_MODE_SIZE (amode) > GET_MODE_SIZE (*inner_mode))
        return false;

      *inner_mode = amode;
      *inner_step = simplify_gen_binary (PLUS, outer_mode,
                                         *inner_step, *outer_step);
      *outer_step = const0_rtx;
      *extend = UNKNOWN;
    }
==

* extend == UNKNOWN as there are no extensions in the chain.

It seems to me that the computations of outer_mode and extend are correct,
but I'm not sure about inner_mode.
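
To make the failing condition concrete, here is a small stand-alone model
of the assert.  It is not GCC code: the enums are hypothetical stand-ins for
the real machine modes and rtx codes, and biv_modes_consistent and
check_biv_modes are names invented for this sketch.  Only the boolean
expression is taken from get_biv_step().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's machine modes and extension codes.  */
enum mode { SImode, DImode };
enum ext_code { UNKNOWN, SIGN_EXTEND, ZERO_EXTEND };

/* The expression asserted in get_biv_step(): the inner and outer modes
   may differ if and only if an extension was recorded.  */
static bool
biv_modes_consistent (enum mode inner_mode, enum mode outer_mode,
                      enum ext_code extend)
{
  return (inner_mode == outer_mode) != (extend != UNKNOWN);
}

/* Models the gcc_assert itself: aborts when the invariant is violated.  */
static void
check_biv_modes (enum mode inner_mode, enum mode outer_mode,
                 enum ext_code extend)
{
  assert (biv_modes_consistent (inner_mode, outer_mode, extend));
}
```

With the values described above (inner_mode == SImode, outer_mode == DImode,
extend == UNKNOWN) biv_modes_consistent returns false, which is the case that
triggers the ICE.  The same modes with SIGN_EXTEND or ZERO_EXTEND would pass.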

Zdenek, what do you think is the right way to handle the above case in loop 
analysis?

[*] http://gcc.gnu.org/ml/gcc-patches/2010-10/msg01529.html

Thanks,

--
Maxim Kuvyrkov
CodeSourcery
+1-650-331-3385 x724



RFD: hookizing BITS_PER_UNIT in tree optimizers / frontends

2010-11-23 Thread Joern Rennecke

If we changed BITS_PER_UNIT into an ordinary piece-of-data 'hook', this
would not only cost a data load from the target vector, but would also
inhibit optimizations that replace division / modulo / multiply with shift
or mask operations.
So maybe we should look into having a few functional hooks that do  
common operations, i.e.

bits_in_units        x / BITS_PER_UNIT
bits_in_units_ceil   (x + BITS_PER_UNIT - 1) / BITS_PER_UNIT
bit_unit_remainder   x % BITS_PER_UNIT
units_in_bits        x * BITS_PER_UNIT

Although we currently have some HOST_WIDE_INT uses, I hope using
unsigned HOST_WIDE_INT as the argument / return type will generally work.
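
As a rough sketch of what the four operations would compute, written as
plain functions rather than target-vector hooks.  Assumptions: an 8-bit
unit, and "uhwi" (unsigned long long) as a stand-in for unsigned
HOST_WIDE_INT; neither is meant as the actual hook interface.

```c
#include <assert.h>

/* Assumption for this sketch: 8-bit addressable units.  In GCC the value
   is supplied by the target.  */
#define BITS_PER_UNIT 8

/* Stand-in for unsigned HOST_WIDE_INT.  */
typedef unsigned long long uhwi;

/* Number of whole units contained in X bits (rounds down).  */
static uhwi
bits_in_units (uhwi x)
{
  return x / BITS_PER_UNIT;
}

/* Number of units needed to hold X bits (rounds up).  */
static uhwi
bits_in_units_ceil (uhwi x)
{
  /* Guard against wrap-around in the addition below.  */
  assert (x <= (uhwi) -1 - (BITS_PER_UNIT - 1));
  return (x + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
}

/* Bits left over after removing all whole units from X bits.  */
static uhwi
bit_unit_remainder (uhwi x)
{
  return x % BITS_PER_UNIT;
}

/* Number of bits in X units.  */
static uhwi
units_in_bits (uhwi x)
{
  return x * BITS_PER_UNIT;
}
```

For example, with 8-bit units bits_in_units (17) is 2, bits_in_units_ceil (17)
is 3 and bit_unit_remainder (17) is 1.  Because BITS_PER_UNIT is a
compile-time constant inside each function here, the compiler can still turn
the division and modulo into shift and mask operations.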

tree.h also defines BITS_PER_UNIT_LOG, which (or its hook equivalent)
should probably be used in all the places that use
exact_log2 (BITS_PER_UNIT), and, if it could be relied upon to exist, we
could also use it as a substitute for the above hooks.  However, this seems
a bit iffy - we'd permanently forgo the possibility to have 6 / 7 / 36
bit etc. units.

Similar arrangements could be made for BITS_PER_WORD and UNITS_PER_WORD,
although these macros seem not quite so prevalent in the tree optimizers.


Re: Method to disable code SSE2 generation but still use -msse2

2010-11-23 Thread David Mathog
The last mysterious error message went away when the same code was
compiled on a machine with a more recent gcc (4.4.1).  Shortly after
I hit the next roadblock.

Here is foo.c (a modified version of sse2-cmpsd-1.c from the version
4.5.1 testsuite):

>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8
#ifndef CHECK_H
#define CHECK_H "sse2-check.h"
#endif

#ifndef TEST
#define TEST sse2_test
#endif

#include CHECK_H

#include 

static __m128d
__attribute__((noinline, unused))
test (__m128d s1, __m128d s2)
{
printf("test s1.x"); _mm_dump_fd(s1);
printf("test s2.x"); _mm_dump_fd(s2);
  return _mm_add_pd (s1, s2); 
}

static void
TEST (void)
{
  union128d u, s1, s2;
  double e[2];
   
  s1.x = _mm_set_pd (2134.3343,1234.635654);
  s2.x = _mm_set_pd (41124.234,2344.2354);
printf("s10 1 %lf %lf\n",s1.a[0],s1.a[1]);
printf("s20 1 %lf %lf\n",s2.a[0],s2.a[1]);
printf("s1.x"); _mm_dump_fd(s1.x);
printf("s2.x"); _mm_dump_fd(s2.x);
  u.x = test (s1.x, s2.x); 
   
  e[0] = s1.a[0] + s2.a[0];
  e[1] = s1.a[1] + s2.a[1];

printf("s1.x"); _mm_dump_fd(s1.x);
printf("s2.x"); _mm_dump_fd(s2.x);
printf("expected e0 e1 %lf %lf\n",e[0],e[1]);
printf("result   r0 r1 %lf %lf\n",u.a[0],u.a[1]);

  if (check_union128d (u, e))
abort ();
}
>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8>8>8<8

When compiled with -mno-sse2 the run fails.  Bizarrely, it seems to be
passing data into the test function incorrectly: notice that in test
the low double in s2 is the high double in s1, instead of the original
low double in s2 from the calling function.  This erroneous value
propagates into my inline code, where it is added correctly, but of
course gives the wrong final sum since the inputs were wrong.

gcc -Wall -msse -mno-sse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG -O1  -o
foo_wno foo.c
./foo_wno
mm_set_pd, in 2134.334300 1234.635654
mm_set_pd, in 41124.234000 2344.235400
s10 1 1234.635654 2134.334300
s20 1 2344.235400 41124.234000
s1.xDEBUG m_d_fd:   1234.635654  2134.334300
s2.xDEBUG m_d_fd:   2344.235400 41124.234000
test s1.xDEBUG m_d_fd:   1234.635654  2134.334300
test s2.xDEBUG m_d_fd:   2134.334300 41124.234000
IN _mm_add_pd
__ADEBUG m_d_fd:   1234.635654  2134.334300
__BDEBUG m_d_fd:   2134.334300 41124.234000
s1.xDEBUG m_d_fd:   1234.635654  2134.334300
s2.xDEBUG m_d_fd:   2344.235400 41124.234000
expected e0 e1 3578.871054 43258.568300
result   r0 r1 3368.969954 43258.568300
Aborted

When -msse2 is enabled, however, the parameters are passed correctly
into test and from there into my inlined function, and the program
works:

gcc -Wall -msse -msse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG -O1  -o
foo_nono foo.c
[r...@newsaf i386]# ./foo_nono
mm_set_pd, in 2134.334300 1234.635654
mm_set_pd, in 41124.234000 2344.235400
s10 1 1234.635654 2134.334300
s20 1 2344.235400 41124.234000
s1.xDEBUG m_d_fd:   1234.635654  2134.334300
s2.xDEBUG m_d_fd:   2344.235400 41124.234000
test s1.xDEBUG m_d_fd:   1234.635654  2134.334300
test s2.xDEBUG m_d_fd:   2344.235400 41124.234000
IN _mm_add_pd
__ADEBUG m_d_fd:   1234.635654  2134.334300
__BDEBUG m_d_fd:   2344.235400 41124.234000
s1.xDEBUG m_d_fd:   1234.635654  2134.334300
s2.xDEBUG m_d_fd:   2344.235400 41124.234000
expected e0 e1 3578.871054 43258.568300
result   r0 r1 3578.871054 43258.568300

Regards,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


Re: RFD: hookizing BITS_PER_UNIT in tree optimizers / frontends

2010-11-23 Thread Joseph S. Myers
I think quite a lot of front end uses of BITS_PER_UNIT should really be 
TYPE_PRECISION (char_type_node) (which in general I'd consider preferred 
to CHAR_TYPE_SIZE in the front ends).  Though it's pretty poorly defined 
what datastructures should look like if target "char" in the front ends is 
wider than the instruction-set unit of BITS_PER_UNIT bits.

If something relates to an interface to a lower-level part of the compiler 
then BITS_PER_UNIT is probably right - but if something relates to whether 
a type is a variant of char, or to alignment of a non-bit-field object 
(you can't have smaller than char alignment), or things like that, then 
TYPE_PRECISION (char_type_node) may be better.

Note that BITS_PER_UNIT is used in code built for the target (libgcc2.c, 
dfp-bit.h, fixed-bit.h, fp-bit.h, libobjc/encoding.c, ...), and converting 
it to a hook requires eliminating those uses.  __CHAR_BIT__ is a suitable 
replacement, at least if the code really cares about char - which is the 
case whenever the value is multiplied by the result of "sizeof".  Some 
questions about machine modes might most usefully be answered by 
predefined macros giving properties of particular machine modes.
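
A minimal illustration of the "multiplied by sizeof" case in target code.
This assumes a GCC-compatible compiler, since __CHAR_BIT__ is a
compiler-predefined macro; TYPE_BITS and long_long_bits are names made up
for this sketch.

```c
#include <assert.h>
#include <limits.h>

/* Width in bits of a type's storage, computed from the compiler's
   predefined __CHAR_BIT__ rather than the target macro BITS_PER_UNIT.  */
#define TYPE_BITS(type) (sizeof (type) * __CHAR_BIT__)

/* Storage width of long long in bits.  */
static unsigned long
long_long_bits (void)
{
  /* __CHAR_BIT__ is expected to agree with <limits.h>'s CHAR_BIT.  */
  assert (__CHAR_BIT__ == CHAR_BIT);
  return (unsigned long) TYPE_BITS (long long);
}
```

Since sizeof counts in units of char, this form stays meaningful even if
char is wider than the instruction-set unit.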

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: RFD: hookizing BITS_PER_UNIT in tree optimizers / frontends

2010-11-23 Thread Joern Rennecke

Quoting "Joseph S. Myers":

> If something relates to an interface to a lower-level part of the compiler
> then BITS_PER_UNIT is probably right - but if something relates to whether
> a type is a variant of char, or to alignment of a non-bit-field object
> (you can't have smaller than char alignment), or things like that, then
> TYPE_PRECISION (char_type_node) may be better.


Yes, I see examples for both in the C++ front end.
The tree optimizers seem mostly (or entirely?) concerned with the
addressable unit size.


> Note that BITS_PER_UNIT is used in code built for the target (libgcc2.c,
> dfp-bit.h, fixed-bit.h, fp-bit.h, libobjc/encoding.c, ...), and converting
> it to a hook requires eliminating those uses.


Full conversion does.  For the moment I would be content with a
partial conversion so that not every tree optimizer that currently uses
BITS_PER_UNIT has to include tm.h itself once the bogus tm.h includes
from target.h / function.h / gimple.h are gone.


Re: Method to disable code SSE2 generation but still use -msse2

2010-11-23 Thread David Mathog
I have found several ways to "fix" the latest issue, but they all boil
down to never passing an __m128d value on the call stack.  For instance
change

static __m128d
__attribute__((noinline, unused))
test (__m128d s1, __m128d s2)

to

static __m128d test (__m128d s1, __m128d s2)

and the program works.  Similarly, change the function to

 static __m128d __attribute__((noinline)) test (__m128d *s1, __m128d *s2)
{
  return _mm_add_pd (*s1, *s2); 
}

and it also works.
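
For anyone who wants to try the pointer-only pattern without my soft
emmintrin.h, here is a self-contained sketch.  Assumptions: v2df is a GCC
vector-extension type used as a stand-in for __m128d, v2df_union mimics the
union128d from the testsuite header, and add_by_pointer / add_pairs are names
made up for this example.  Unlike the version above, the result also goes
back through a pointer, so no 16 byte value crosses the call by value at all.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for __m128d: two doubles in a 16 byte GCC vector type.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Same idea as union128d: view the vector as an array of doubles.  */
union v2df_union
{
  v2df x;
  double a[2];
};

/* Pointer-only variant of test(): operands and result are all passed
   by address.  */
static void __attribute__ ((noinline))
add_by_pointer (const v2df *s1, const v2df *s2, v2df *result)
{
  assert (s1 != NULL && s2 != NULL && result != NULL);
  *result = *s1 + *s2;
}

/* Wrapper that loads plain doubles, adds, and stores plain doubles.  */
static void
add_pairs (const double lhs[2], const double rhs[2], double sum[2])
{
  union v2df_union u1, u2, out;

  u1.a[0] = lhs[0];
  u1.a[1] = lhs[1];
  u2.a[0] = rhs[0];
  u2.a[1] = rhs[1];
  add_by_pointer (&u1.x, &u2.x, &out.x);
  sum[0] = out.a[0];
  sum[1] = out.a[1];
}
```

This only demonstrates the workaround.  It says nothing about why the
by-value version misbehaves.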

Things I tried to force a 16 byte stack alignment that didn't work:

1  -mstackrealign
2  -mpreferred-stack-boundary=4
3  -mincoming-stack-boundary=4
4  2 and 3
5  1 and 2 and 3

I guess the bigger question is why can an __m128d be passed on the call
stack reliably when -msse2 is invoked, but not otherwise?  If the
compiler cannot do this reliably shouldn't it throw an error or warning?

Thanks,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


Re: Method to disable code SSE2 generation but still use -msse2

2010-11-23 Thread David Mathog
> Things I tried to force a 16 byte stack alignment that didn't work:
> 
> 1  -mstackrealign
> 2  -mpreferred-stack-boundary=4
> 3  -mincoming-stack-boundary=4
> 4  2 and 3
> 5  1 and 2 and 3

And this is why they didn't work.  Change the test function to

 static __m128d __attribute__((noinline,aligned (16))) test ( __m128d
s1, __m128d s2)
{
printf("test s1"); _mm_dump_fd(s1);
printf("test s2"); _mm_dump_fd(s2);
printf("loc s1 %p\n",&s1);
printf("loc s2 %p\n",&s2);
  return _mm_add_pd (s1, s2); 
}

compile and run

 gcc -Wall -msse -mno-sse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG  -O1  -o
foo_wno foo.c
[r...@newsaf i386]# ./foo_wno
mm_set_pd, in 2134.334300 1234.635654
mm_set_pd, in 41124.234000 2344.235400
s10 1 1234.635654 2134.334300
s20 1 2344.235400 41124.234000
s1.xDEBUG m_d_fd:   1234.635654  2134.334300
s2.xDEBUG m_d_fd:   2344.235400 41124.234000
test s1DEBUG m_d_fd:   1234.635654  2134.334300
test s2DEBUG m_d_fd:   2134.334300 41124.234000
loc s1 0x7fff6b6ccb10   <--
loc s2 0x7fff6b6ccb00   <--
s1.xDEBUG m_d_fd:   1234.635654  2134.334300
s2.xDEBUG m_d_fd:   2344.235400 41124.234000
expected e0 e1 3578.871054 43258.568300
result   r0 r1 3368.969954 43258.568300
Aborted

s1 and s2 within test are already 16 byte aligned, so the extra
alignment switches did not help.  Somehow this code

  u.x = test (s1.x, s2.x);

is putting the wrong values for s2 onto the call stack.

Bizarre.  Either I'm missing something or turning off SSE2 is uncovering
a compiler bug.
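
For completeness, the alignment claim can be checked in code rather than by
eye.  A small sketch; is_aligned is a helper name invented here, not
something from the test headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if P is a multiple of ALIGNMENT, which must be a nonzero
   power of two.  */
static bool
is_aligned (const void *p, size_t alignment)
{
  assert (alignment != 0 && (alignment & (alignment - 1)) == 0);
  return ((uintptr_t) p & (alignment - 1)) == 0;
}
```

Inside test this could be called as is_aligned (&s1, 16).  The two addresses
printed above end in 0x10 and 0x00, so both are multiples of 16.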

Thanks,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


Re: Method to disable code SSE2 generation but still use -msse2

2010-11-23 Thread David Mathog
I renamed the test case gccprob.c and made two binaries and two
assembler files:

gcc -Wall -msse -mno-sse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG \
   -O0  -o gccprob_wno gccprob.c
gcc -Wall -msse -mno-sse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG  \
   -O0 -S  -o gccprob_wno.s gccprob.c
gcc -Wall -msse -msse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG \
   -O0 -S  -o gccprob_nono.s gccprob.c
gcc -Wall -msse -msse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG \
   -O0  -o gccprob_nono gccprob.c

The _wno variants have the problem passing __m128d on the stack,
the _nono variants do not.

I packed up all 5 files and put them here (the pickup directory is
retrieve-only, with no directory listings):

http://saf.bio.caltech.edu/pub/pickup/gccprob.tar.gz

I am not an assembler programmer.  If one of you who is could have a
look at the two .s files maybe we can get to the bottom of this.

Thanks,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


GCC Intermodule Analysis for Go

2010-11-23 Thread Matt Davis
Hello,
I have been working on my PhD thesis and I want to focus on the Go
language.  I know Ian Taylor has done tons of work regarding the Go
frontend for gcc.  Likewise, I know gcc implements SSA and even
link-time optimization.  For my specific research I will need to do
some intermodule analysis.  I know gcc has link-time optimization,
however I might, for my purposes, need to add additional information
to the object files that would allow my specific optimization of a Go
program to aid other compiled modules/translation-units.  Ideally, my
implementation, I would hope, would translate nicely to gogo then to
GIMPLE.  In the short term I would like to use this intermodule
analysis to give enough information to the compiler so that when a
module/object-file is  recompiled the changed routines and dependent
routines would be the only aspects recompiled, instead of having to
recompile an entire object file each time a small change is made.

Thoughts?  Is this even feasible?

-Matt


Re: Method to disable code SSE2 generation but still use -msse2

2010-11-23 Thread David Mathog
The problem is specific to 64 bit environments.  I made these:

gcc -Wall -msse -mno-sse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG \
   -O0 -m32 -S  -o gccprob_wno32.s gccprob.c
gcc -Wall -msse -mno-sse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG  \
   -O0 -m32  -o gccprob_wno32 gccprob.c
gcc -Wall -msse -msse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG  \
   -O0 -m32  -o gccprob_nono32 gccprob.c
gcc -Wall -msse -msse2  -I. -lm -DSOFT_SSE2 -DEMMSOFTDBG  \
   -O0 -m32 -S  -o gccprob_nono32.s gccprob.c

and both binaries work correctly.  Added them to the set here:

http://saf.bio.caltech.edu/pub/pickup/gccprob.tar.gz

Specifics on the environment where the problem is seen:

OS:  Mandriva Linux release 2010.0 (Official) for x86_64
gcc (GCC) 4.4.1
Dual Dual Core Opteron 280. 
Arima HDAMAI motherboard.
64 bit targets only, 32 bit is OK.

Regards,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


gcc-4.4-20101123 is now available

2010-11-23 Thread gccadmin
Snapshot gcc-4.4-20101123 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.4-20101123/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.4 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_4-branch 
revision 167096

You'll find:

 gcc-4.4-20101123.tar.bz2 Complete GCC (includes all of below)

  MD5=03ae257bfd6a0adde7b2c6fff9a13c28
  SHA1=3afa1b3cdab91775e588f34a55a65e1908318fff

 gcc-core-4.4-20101123.tar.bz2 C front end and core compiler

  MD5=b52fe749825c8a33f4390722f1bee788
  SHA1=d2943c6c6f72ebc73dc94e150990f59ea379a120

 gcc-ada-4.4-20101123.tar.bz2 Ada front end and runtime

  MD5=e3a277eb349c166750083ac7d698b868
  SHA1=52133c3d40f7f997d676a846cd1999c8421eb4d4

 gcc-fortran-4.4-20101123.tar.bz2 Fortran front end and runtime

  MD5=543a3b27e0701d674239511d8d0021b4
  SHA1=e1675a7f47f9a832181a57911906f7043565b46a

 gcc-g++-4.4-20101123.tar.bz2 C++ front end and runtime

  MD5=752805fd4dff37ab24ed2afba1d4d626
  SHA1=f44c71e9785e8e2a79b188aadee74e033ac4b71d

 gcc-java-4.4-20101123.tar.bz2 Java front end and runtime

  MD5=7bbeb90b4fd6fbb0ebcb2e484913f4aa
  SHA1=6e3ec6b34d093bb52448d510b8b3f328f99ceecd

 gcc-objc-4.4-20101123.tar.bz2 Objective-C front end and runtime

  MD5=ecde0a1ac24b43b8d24ef8f8551c27c6
  SHA1=40b6e546b787333f9a28fdbd9d9efbe80cef8add

 gcc-testsuite-4.4-20101123.tar.bz2   The GCC testsuite

  MD5=de6a29b6f6fd2e220e6646dd14b6fba7
  SHA1=5854bb5ab6d240d057c2fb2b022c4aa4f7198d22

Diffs from 4.4-20101116 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Method to test all sse2 calls?

2010-11-23 Thread David Mathog
What is:

  __builtin_ia32_vec_ext_v2df

???  It wasn't in the original emmintrin.h, so presumably isn't actually
part of SSE2, but it is present in the testsuite, and it is not visible
to the compiler when -mno-sse2 is set. See for instance the files
sse2-vec-#.c.  (Randomly selected) Example:

sse2-vec-4.c:  res[2] = __builtin_ia32_vec_ext_v8hi ((__v8hi)val1.x, 2);

 gcc -Wall -msse -mno-sse2 -I. -m32 -lm -DSOFT_SSE2 -o foo sse2-vec-4.c
sse2-vec-4.c: In function 'sse2_test':
sse2-vec-4.c:27: warning: implicit declaration of function
'__builtin_ia32_vec_ext_v8hi'
/root/tmp/ccYAq3IB.o: In function `sse2_test':
sse2-vec-4.c:(.text+0x58c): undefined reference to
`__builtin_ia32_vec_ext_v8hi'
.
.
.
/root/tmp/ccYAq3IB.o:sse2-vec-4.c:(.text+0x613): more undefined
references to `__builtin_ia32_vec_ext_v8hi' follow
collect2: ld returned 1 exit status

Thanks,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech


Re: GCC Intermodule Analysis for Go

2010-11-23 Thread Ian Lance Taylor
Matt Davis writes:

> I have been working on my PhD thesis and I want to focus on the Go
> language.  I know Ian Taylor has done tons of work regarding the Go
> frontend for gcc.  Likewise, I know gcc implements SSA and even
> link-time optimization.  For my specific research I will need to do
> some intermodule analysis.  I know gcc has link-time optimization,
> however I might, for my purposes, need to add additional information
> to the object files that would allow my specific optimization of a Go
> program to aid other compiled modules/translation-units.  Ideally, my
> implementation, I would hope, would translate nicely to gogo then to
> GIMPLE.  In the short term I would like to use this intermodule
> analysis to give enough information to the compiler so that when a
> module/object-file is  recompiled the changed routines and dependent
> routines would be the only aspects recompiled, instead of having to
> recompile an entire object file each time a small change is made.
>
> Thoughts?  Is this even feasible?

I think the frontend work is entirely feasible in Go.

It would be difficult to do entirely correctly in C++ because of the
complex name lookup rules.  But Go has simple name lookup, so
identifying which parts of a program depend on which other parts should
be more or less straightforward.

As far as translating the information to GIMPLE, and taking advantage of
it in the optimizers, it kind of depends on what kind of information you
are thinking about.

Ian


Re: Method to test all sse2 calls?

2010-11-23 Thread Ian Lance Taylor
"David Mathog" writes:

> What is:
>
>   __builtin_ia32_vec_ext_v2df
>
> ???

It's a gcc builtin function, not to be confused with an SSE intrinsic
function.

> It wasn't in the original emmintrin.h, so presumably isn't actually
> part of SSE2, but it is present in the testsuite, and it is not visible
> to the compiler when -mno-sse2 is set. See for instance the files
> sse2-vec-#.c.  (Randomly selected) Example:
>
> sse2-vec-4.c:  res[2] = __builtin_ia32_vec_ext_v8hi ((__v8hi)val1.x, 2);

Tests that directly invoke __builtin functions are not appropriate for
your replacement for emmintrin.h.
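
If a test only needs one element of a vector, that can be written without
the target builtin.  A sketch, under the assumption that a GCC
vector-extension type stands in for __m128d; v2df and extract_v2df are names
chosen for this example and are not part of emmintrin.h.

```c
#include <assert.h>

/* Stand-in for __m128d: two doubles in a 16 byte GCC vector type.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Return element INDEX (0 or 1) of *V by viewing the vector through a
   union, instead of calling a __builtin_ia32_* function.  */
static double
extract_v2df (const v2df *v, int index)
{
  union
  {
    v2df x;
    double a[2];
  } u;

  assert (index == 0 || index == 1);
  u.x = *v;
  return u.a[index];
}
```

The vector is taken by address here only to keep the sketch independent of
how vector arguments are passed.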

Ian