Re: -ftree-vectorize can't vectorize plus?

2006-09-11 Thread Dorit Nuzman
>A silly little testcase which the vectorizer doesn't vectorize:
>

> autovecttest.c:11: note: not vectorized: relevant stmt not
> supported: D.1861_9 = (signed char) D.1860_8

Can these type casts (from uchar to schar and back) be cleaned away by some
pass before vectorization, or do we need to teach the vectorizer to ignore
such type casts?

unsigned char D.1932
unsigned char D.1936
unsigned char D.1939

  D.1933_9 = (signed char) D.1932_8;
  D.1937_17 = (signed char) D.1936_16;
  D.1938_18 = D.1937_17 ^ D.1933_9;
  D.1939_19 = (unsigned char) D.1938_18;

dorit


> unsigned char qa[128];
> unsigned char qb[128];
> unsigned char qc[128];
> unsigned char qd[128];
>
> void autovectqi (void)
> {
>   int i;
>
>   for (i = 0; i < 128; i ++)
>     qd[i] = qa[i] ^ qb[i] + qc[i];
> }
>
> Revision 116799 with '-O3 -fomit-frame-pointer -S -dp -ftree-vectorize
> -march=prescott' produces:
>
> autovectqi:
>    xorl   %edx, %edx   # 54   *movsi_xor   [length = 2]
> .L2:
>    movzbl   qb(%edx), %eax   # 20   *movqi_1/3   [length = 4]
>    addb   qc(%edx), %al   # 21   *addqi_1_lea/2   [length = 3]
>    xorb   qa(%edx), %al   # 23   *xorqi_1/1   [length = 3]
>    movb   %al, qd(%edx)   # 24   *movqi_1/7   [length = 3]
>    addl   $1, %edx   # 26   *addsi_1/1   [length = 3]
>    cmpl   $128, %edx   # 27   *cmpsi_1_insn/1   [length = 6]
>    jne   .L2  # 28   *jcc_1  [length = 2]
>    ret # 51   return_internal   [length = 1]
>
>
> If I change 'qb[i] + qc[i]' to e.g. 'qb[i] & qc[i]' the vectorizer works
> fine.
>
> ;; Function autovectqi (autovectqi)
> [snip lots of stuff]
> autovecttest.c:11: note: Access function of PHI: {0, +, 1}_1
> autovecttest.c:11: note: Analyze phi: qd_23 = PHI ;
> autovecttest.c:11: note: virtual phi. skip.
> autovecttest.c:11: note: === vect_analyze_operations ===
> autovecttest.c:11: note: examining phi: ivtmp.28_1 = PHI  28_2(4), 128(2)>;
> autovecttest.c:11: note: examining phi: i_24 = PHI ;
> autovecttest.c:11: note: examining phi: qd_23 = PHI ;
> autovecttest.c:11: note: ==> examining statement: :
> autovecttest.c:11: note: irrelevant.
> autovecttest.c:11: note: ==> examining statement: D.1860_8 = qa[i_24]
> autovecttest.c:11: note: num. args = 4 (not unary/binary op).
> autovecttest.c:11: note: vect_is_simple_use: operand qa[i_24]
> autovecttest.c:11: note: not ssa-name.
> autovecttest.c:11: note: use not simple.
> autovecttest.c:11: note: ==> examining statement: D.1861_9 = (signed
> char) D.1860_8
> autovecttest.c:11: note: vect_is_simple_use: operand D.1860_8
> autovecttest.c:11: note: def_stmt: D.1860_8 = qa[i_24]
> autovecttest.c:11: note: type of def: 2.
> autovecttest.c:11: note: no optab.
> autovecttest.c:11: note: vect_is_simple_use: operand (signed char)
> D.1860_8
> autovecttest.c:11: note: not ssa-name.
> autovecttest.c:11: note: use not simple.
> autovecttest.c:11: note: not vectorized: relevant stmt not
> supported: D.1861_9 = (signed char) D.1860_8
> autovecttest.c:11: note: bad operation or unsupported loop bound.
> autovecttest.c:11: note: vectorized 0 loops in function.
> autovectqi ()
> {
>   unsigned int ivtmp.28;
>   int pretmp.22;
>   int i;
>   unsigned char D.1867;
>   signed char D.1866;
>   signed char D.1865;
>   unsigned char D.1864;
>   unsigned char D.1863;
>   unsigned char D.1862;
>   signed char D.1861;
>   unsigned char D.1860;
>
> :
>
>   # ivtmp.28_1 = PHI ;
>   # i_24 = PHI ;
> :;
>   D.1860_8 = qa[i_24];
>   D.1861_9 = (signed char) D.1860_8;
>   D.1862_12 = qb[i_24];
>   D.1863_15 = qc[i_24];
>   D.1864_16 = D.1863_15 + D.1862_12;
>   D.1865_17 = (signed char) D.1864_16;
>   D.1866_18 = D.1865_17 ^ D.1861_9;
>   D.1867_19 = (unsigned char) D.1866_18;
>   qd[i_24] = D.1867_19;
>   i_21 = i_24 + 1;
>   ivtmp.28_2 = ivtmp.28_1 - 1;
>   if (ivtmp.28_2 != 0) goto ; else goto ;
>
> :;
>   goto  ();
>
> :;
>   return;
>
> }
> [cut]
>
> --
> Rask Ingemann Lambertsen



Re: libgfortran build broken on Darwin ppc

2006-09-11 Thread Jack Howarth
Geoff,
   If the autoconf patch isn't going in to gcc trunk, would someone
at Apple please nudge the folks who maintain www.opensource.apple.com
to post the Xcode Tools 2.4 source code release? Either that or post
a new cctools based off the same to the gcc ftp site. We really need
to be able to create a new odcctools release in sync with Xcode 2.4.
Jack


Re: powerpc targets, long double implementation, and c++ programs

2006-09-11 Thread Edmar Wienskoski

Joseph S. Myers wrote:


On Fri, 8 Sep 2006, Edmar Wienskoski wrote:

 


Ok. I am starting to see the whole picture now.
So the whole thing appears to work with --disable-shared just because of the
way the linker loads symbols in the presence of libgcc_s.so versus libgcc.a.

Follow-up question:
The e500 ABI actually defines long double to be 128-bit floats.
In rs6000.c, rs6000_init_libfuncs links to __gcc_qadd because of
TARGET_HARD_FLOAT.
Shouldn't that be TARGET_HARD_FLOAT && TARGET_FPRS,
and also have:
diff -u t-fprules-softfp~ t-fprules-softfp
--- t-fprules-softfp~   2006-08-09 14:20:24.0 -0500
+++ t-fprules-softfp   2006-09-06 12:39:17.0 -0500
@@ -1,4 +1,4 @@
-softfp_float_modes := sf df
+softfp_float_modes := sf df tf
softfp_int_modes := si di
softfp_extensions := sfdf
softfp_truncations := dfsf

Would that be right ?
   



No.

(a) The existing GNU/Linux ABIs use or are intended to use IBM long 
double, not IEEE long double, and the E500 GNU/Linux ABI should be 
compatible with the other ABIs in this regard.  The present formal ABI 
documents are not very relevant to the de facto GNU/Linux ABIs.
 

Well, actually this is part of the problem. We have only one document,
the "e500 Sys V ABI", which was intended to create only one ABI.

(b) To use IEEE long double with soft-fp you'll need to add sftf dftf to 
softfp_extensions and tfdf tfsf to softfp_truncations.
 


Humm.

(c) If using IEEE long double on PowerPC, you should be using the standard 
_q_* functions defined in the psABI, and not the __*tf* functions at all.  
glibc does provide the _q_* functions (albeit with a typo meaning _q_utoq 
is missing), though since they don't get built with -mabi=ieeelongdouble 
they aren't actually usable.
 


There are two issues here:
First, it is libgcc that is generating undefined references to __*tf* functions. If gcc
can provide them with "softfp_float_modes := sf df tf", I think it is reasonable
to do that. For completeness' sake you can do as you suggested and change softfp_extensions
and softfp_truncations, but they are not absolutely necessary.

Second is the long double ABI problem. In the past gcc always generated
function calls to _q_* functions (per ABI Chapter 5).
For this code:
long double foo (long double x, long double y){ return x + y; }
gcc-4.0, target powerpc-eabise and
gcc-4.0, target powerpc-*-linux-gnuspe with -mlong-double-128 option
both generate a call to _q_add.
The same code with gcc-4.2, for both targets, generates a call to __gcc_qadd.

If there is an intention to change the E500 ABI, then somebody has to step
forward and actually change the document (with all the administrative burden
that comes with it).

Edmar




Re: -ftree-vectorize can't vectorize plus?

2006-09-11 Thread Daniel Berlin

On 9/11/06, Dorit Nuzman <[EMAIL PROTECTED]> wrote:

>A silly little testcase which the vectorizer doesn't vectorize:
>

> autovecttest.c:11: note: not vectorized: relevant stmt not
> supported: D.1861_9 = (signed char) D.1860_8

Can these type casts (from uchar to schar and back) be cleaned away by some
pass before vectorization,


Uh, what do you mean "cleaned away"?
You can't just legally ignore them, they are changing the overflow behavior.


Re: powerpc targets, long double implementation, and c++ programs

2006-09-11 Thread David Edelsohn
> Edmar Wienskoski writes:

Edmar> Second, is the long double ABI problem. In the past gcc always generated
Edmar> function calls to _q_* functions. (Per ABI Chapter 5)
Edmar> For this code:
Edmar> long double foo (long double x, long double y){ return x + y; }
Edmar> gcc-4.0, target powerpc-eabise and
Edmar> gcc-4.0, target powerpc-*-linux-gnuspe with -mlong-double-128 option
Edmar> both generates a call to _q_add.
Edmar> The same code with gcc-4.2, both targets generates a call to __gcc_qadd.

Edmar> If there is an intention to change the E500 ABI, then somebody has to
Edmar> step forward and actually change the document (With all the administrative
Edmar> burden that cames with it..).

The PowerPC Linux ABI uses IBM long double format.  The PowerPC
SVR4 Supplement defines IEEE long double, but no library actually
implemented the _q_* functions or at least it was not generally available.

If Freescale wants users who configure with powerpc-*-linux-gnuspe
to be compatible with the rest of PowerPC GNU+Linux, it needs to accept
IBM long double.  This is defined in the PowerPC Linux ABI, regardless of
SVR4.  IEEE long double also is not very efficient on PowerPC and few
people (if any) could use it because of the lack of library support.  An
unimplemented ABI is not very useful, except in theory.

David




Re: powerpc targets, long double implementation, and c++ programs

2006-09-11 Thread Joseph S. Myers
On Mon, 11 Sep 2006, Edmar Wienskoski wrote:

> There are 2 issues here:
> First, It is libgcc that is generating undefined references to __*tf*
> functions. If gcc can provide them with "softfp_float_modes := sf df tf", I
> think is reasonable to do that. For completeness sake you can do as you
> suggested: change softfp_extensions and softfp_truncations, but they are not
> absolutely necessary.

softfp_extensions and softfp_truncations are needed for conversions 
between float/double and long double.

> Second, is the long double ABI problem. In the past gcc always generated
> function calls to _q_* functions. (Per ABI Chapter 5)
> For this code:
> long double foo (long double x, long double y){ return x + y; }
> gcc-4.0, target powerpc-eabise and
> gcc-4.0, target powerpc-*-linux-gnuspe with -mlong-double-128 option
> both generates a call to _q_add.
> The same code with gcc-4.2, both targets generates a call to __gcc_qadd.

__gcc_* are for IBM long double.  _q_* are for IEEE long double.  __*tf* 
are for an unspecified version of long double *where there aren't 
ABI-specified functions*; generating calls to them on PowerPC Linux is 
generally a GCC bug.  For IEEE long double, it's definitely better to use 
_q_* from glibc since they support rounding modes and exceptions.  (This 
in general is an advantage of using soft-fp code in glibc where the glibc 
version is known to have a good copy, but doing it in general - for float 
and double - requires extra configure support to be implemented in GCC, 
and some changes in glibc.)

I personally prefer IEEE long double to IBM long double, but in the 
context of the existing family of PowerPC ABIs currently supported by 
glibc and GCC I think trying to use it on Linux is a mistake; if used it 
should have a separate family of target triplets whose names explicitly 
indicate a new ABI.

-- 
Joseph S. Myers
[EMAIL PROTECTED]


Re: -ftree-vectorize can't vectorize plus?

2006-09-11 Thread Devang Patel

Can these type casts (from uchar to schar and back) be cleaned away
by some pass before vectorization, or do we need to teach the vectorizer
to ignore such type casts?


This was considered the tree-combiner's responsibility. However, I do not
know the current state of, or plans for, the tree-combiner pass. The
tree-combiner pass, along with other combining activities, will remove
unnecessary casts (where possible).

-
Devang


Re: -ftree-vectorize can't vectorize plus?

2006-09-11 Thread Dorit Nuzman

"Daniel Berlin" <[EMAIL PROTECTED]> wrote on 11/09/2006 06:27:16 PM:

> On 9/11/06, Dorit Nuzman <[EMAIL PROTECTED]> wrote:
> > >A silly little testcase which the vectorizer doesn't vectorize:
> > >
> > 
> > > autovecttest.c:11: note: not vectorized: relevant stmt not
> > > supported: D.1861_9 = (signed char) D.1860_8
> >
> > Can these type casts (from uchar to schar and back) be cleaned away by
> > some pass before vectorization,
>
> Uh, what do you mean "cleaned away"?
> You can't just legally ignore them, they are changing the overflow
> behavior.

not in the case of xor... I was referring to cases like in the pattern I
showed, in which the arguments are cast from unsigned to signed just to
perform the xor operation, and the result is cast back to unsigned. Isn't
this:

  unsigned char D.1932
  unsigned char D.1936
  unsigned char D.1939
  
  D.1933_9 = (signed char) D.1932_8;
  D.1937_17 = (signed char) D.1936_16;
  D.1938_18 = D.1937_17 ^ D.1933_9;
  D.1939_19 = (unsigned char) D.1938_18;

the same as this?:

  D.1939_19 = D.1936_16 ^ D.1932_8

dorit



question about -print-search-dirs

2006-09-11 Thread Kate Minola

Perhaps a kind person would explain what -print-search-dirs is printing.
The manual entry is not very enlightening.

When I do

% gcc -print-search-dirs

I get output in which the "libraries:" line lists the following directories

libraries: 
=/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/
:/usr/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/
:/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../../x86_64-unknown-linux-gnu/lib/x86_64-unknown-linux-gnu/4.1.1/
:/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../../x86_64-unknown-linux-gnu/lib/
:/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../x86_64-unknown-linux-gnu/4.1.1/
:/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../
:/lib/x86_64-unknown-linux-gnu/4.1.1/
:/lib/
:/usr/lib/x86_64-unknown-linux-gnu/4.1.1/
:/usr/lib/

Yet if I do

% gcc -v -o hello hello.c

for a simple "hello world" program, then the libraries listed
(following "-L") are

-L/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1
-L/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../../lib64
-L/lib/../lib64
-L/usr/lib/../lib64

I guess I would have expected that the two lists of libraries would be the same,
or perhaps that the second list would be contained in the first.  But this does
not seem to be the case.

What am I missing?

Kate Minola
University of Maryland, College Park


Re: question about -print-search-dirs

2006-09-11 Thread Ian Lance Taylor
"Kate Minola" <[EMAIL PROTECTED]> writes:

> I guess I would have expected that the two lists of libraries would be the
> same, or perhaps that the second list would be contained in the first.  But
> this does not seem to be the case.
> 
> What am I missing?

gcc only generates a -L option for directories which actually exist,
and which actually are directories.

Ian


new libjava regression on darwin

2006-09-11 Thread Jack Howarth
Geoff,
   Did you notice that a new libjava regression occurred today on Darwin
apparently after revision 116838 but by revision 116843? The testcase...

FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test

now fails. Could this be related to your change...


r116639 | geoffk | 2006-09-01 15:52:10 -0400 (Fri, 01 Sep 2006) | 3 lines

* testsuite/libjava.jni/jni.exp (gcj_jni_invocation_test_one):
Pass -lgcj to linker for C++ files on Darwin.

FYI.
  Jack


Re: new libjava regression on darwin

2006-09-11 Thread Geoffrey Keating


On 11/09/2006, at 3:51 PM, Jack Howarth wrote:


Geoff,
   Did you notice that a new libjava regression occured today on Darwin
apparently after revision 116838 but by revision 116843? The testcase...

FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test


now fails. Could this be related to your change...

r116639 | geoffk | 2006-09-01 15:52:10 -0400 (Fri, 01 Sep 2006) | 3 lines


* testsuite/libjava.jni/jni.exp (gcj_jni_invocation_test_one):
Pass -lgcj to linker for C++ files on Darwin.


No.





Re: new libjava regression on darwin

2006-09-11 Thread Andrew Pinski
> 
> Geoff,
>Did you notice that a new libjava regression occured today on Darwin
> apparently after revision 116838 but by revision 116843? The testcase...
> 
> FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test
> 
> now fails. Could this be related to your change...

This is just a timeout.

-- Pinski


Re: new libjava regression on darwin

2006-09-11 Thread Geoffrey Keating


On 11/09/2006, at 3:59 PM, Andrew Pinski wrote:



Geoff,
   Did you notice that a new libjava regression occured today on Darwin
apparently after revision 116838 but by revision 116843? The testcase...

FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test


now fails. Could this be related to your change...


This is just a timeout.


It's actually a timing-dependent infinite loop, or at least that's
what I remember from the last time I looked at it several years ago.






debugging tmpdir-gcc.dg-struct-layout-1 failures

2006-09-11 Thread Jack Howarth
On Darwin PPC at -m64, we are seeing a slew of failures 
in the tmpdir-gcc.dg-struct-layout-1 tests...

FAIL: tmpdir-gcc.dg-struct-layout-1/t001 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t003 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t005 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t006 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t008 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t016 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t024 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t026 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o-c_compat_y_tst.o execute

The binaries left in gcc/testsuite/gcc simply abort when run. Can someone walk
me through the process of creating a bug report for these? I don't quite
understand how I could create .i or .s files for these since the creation of
all of the test binaries seems to be automated. Thanks in advance for any help.
Jack


Re: debugging tmpdir-gcc.dg-struct-layout-1 failures

2006-09-11 Thread Jakub Jelinek
On Tue, Sep 12, 2006 at 12:32:35AM -0400, Jack Howarth wrote:
> On Darwin PPC at -m64, we are seeing a slew of failures 
> in the tmpdir-gcc.dg-struct-layout-1 tests...
> 
> FAIL: tmpdir-gcc.dg-struct-layout-1/t001 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t003 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t005 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t006 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t008 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t016 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t024 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t026 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o-c_compat_y_tst.o execute
> 
> The binaries left in gcc/testsuite/gcc simply abort when run. Can someone
> walk me through the process of creating a bug report for these? I don't
> quite understand how I could create .i or .s files for these since the
> creation of all of the test binaries seems to be automated. Thanks in
> advance for any help.

Are you testing two compilers against each other (i.e. are you using
ALT_CC_UNDER_TEST or ALT_CXX_UNDER_TEST)?
In any case, you can build the generated tests with -DDBG (I guess
e.g. RUNTESTFLAGS="struct-layout-1.exp --target_board unix/-DDBG"
could work) and you'll see which test line and which check failed in the
testsuite log dump.
Then you can just cut and paste that line from the generated testcase
(the test line number is the first argument of the macro on each line)
into, say, struct-layout-1_test.h and build it as any other compat.exp
testcase.

Jakub