Re: -ftree-vectorize can't vectorize plus?
>A silly little testcase which the vectorizer doesn't vectorize:
>
> autovecttest.c:11: note: not vectorized: relevant stmt not
> supported: D.1861_9 = (signed char) D.1860_8

Can these type casts (from uchar to schar and back) be cleaned away by some
pass before vectorization, or do we need to teach the vectorizer to ignore
such type casts?

    unsigned char D.1932
    unsigned char D.1936
    unsigned char D.1939
    D.1933_9 = (signed char) D.1932_8;
    D.1937_17 = (signed char) D.1936_16;
    D.1938_18 = D.1937_17 ^ D.1933_9;
    D.1939_19 = (unsigned char) D.1938_18;

dorit

> unsigned char qa[128];
> unsigned char qb[128];
> unsigned char qc[128];
> unsigned char qd[128];
>
> void autovectqi (void)
> {
>   int i;
>
>   for (i = 0; i < 128; i ++)
>     qd[i] = qa[i] ^ qb[i] + qc[i];
> }
>
> Revision 116799 with '-O3 -fomit-frame-pointer -S -dp -ftree-vectorize
> -march=prescott' produces:
>
> autovectqi:
>   xorl %edx, %edx # 54 *movsi_xor [length = 2]
> .L2:
>   movzbl qb(%edx), %eax # 20 *movqi_1/3 [length = 4]
>   addb qc(%edx), %al # 21 *addqi_1_lea/2 [length = 3]
>   xorb qa(%edx), %al # 23 *xorqi_1/1 [length = 3]
>   movb %al, qd(%edx) # 24 *movqi_1/7 [length = 3]
>   addl $1, %edx # 26 *addsi_1/1 [length = 3]
>   cmpl $128, %edx # 27 *cmpsi_1_insn/1 [length = 6]
>   jne .L2 # 28 *jcc_1 [length = 2]
>   ret # 51 return_internal [length = 1]
>
>
> If I change 'qb[i] + qc[i]' to e.g. 'qb[i] & qc[i]' the vectorizer works
> fine.
>
> ;; Function autovectqi (autovectqi)
> [snip lots of stuff]
> autovecttest.c:11: note: Access function of PHI: {0, +, 1}_1
> autovecttest.c:11: note: Analyze phi: qd_23 = PHI ;
> autovecttest.c:11: note: virtual phi. skip.
> autovecttest.c:11: note: === vect_analyze_operations ===
> autovecttest.c:11: note: examining phi: ivtmp.28_1 = PHI 28_2(4), 128(2)>;
> autovecttest.c:11: note: examining phi: i_24 = PHI ;
> autovecttest.c:11: note: examining phi: qd_23 = PHI ;
> autovecttest.c:11: note: ==> examining statement: :
> autovecttest.c:11: note: irrelevant.
> autovecttest.c:11: note: ==> examining statement: D.1860_8 = qa[i_24]
> autovecttest.c:11: note: num. args = 4 (not unary/binary op).
> autovecttest.c:11: note: vect_is_simple_use: operand qa[i_24]
> autovecttest.c:11: note: not ssa-name.
> autovecttest.c:11: note: use not simple.
> autovecttest.c:11: note: ==> examining statement: D.1861_9 = (signed
> char) D.1860_8
> autovecttest.c:11: note: vect_is_simple_use: operand D.1860_8
> autovecttest.c:11: note: def_stmt: D.1860_8 = qa[i_24]
> autovecttest.c:11: note: type of def: 2.
> autovecttest.c:11: note: no optab.
> autovecttest.c:11: note: vect_is_simple_use: operand (signed char) D.1860_8
> autovecttest.c:11: note: not ssa-name.
> autovecttest.c:11: note: use not simple.
> autovecttest.c:11: note: not vectorized: relevant stmt not
> supported: D.1861_9 = (signed char) D.1860_8
> autovecttest.c:11: note: bad operation or unsupported loop bound.
> autovecttest.c:11: note: vectorized 0 loops in function.
> autovectqi ()
> {
>   unsigned int ivtmp.28;
>   int pretmp.22;
>   int i;
>   unsigned char D.1867;
>   signed char D.1866;
>   signed char D.1865;
>   unsigned char D.1864;
>   unsigned char D.1863;
>   unsigned char D.1862;
>   signed char D.1861;
>   unsigned char D.1860;
>
> :
>
>   # ivtmp.28_1 = PHI ;
>   # i_24 = PHI ;
> :;
>   D.1860_8 = qa[i_24];
>   D.1861_9 = (signed char) D.1860_8;
>   D.1862_12 = qb[i_24];
>   D.1863_15 = qc[i_24];
>   D.1864_16 = D.1863_15 + D.1862_12;
>   D.1865_17 = (signed char) D.1864_16;
>   D.1866_18 = D.1865_17 ^ D.1861_9;
>   D.1867_19 = (unsigned char) D.1866_18;
>   qd[i_24] = D.1867_19;
>   i_21 = i_24 + 1;
>   ivtmp.28_2 = ivtmp.28_1 - 1;
>   if (ivtmp.28_2 != 0) goto ; else goto ;
>
> :;
>   goto ();
>
> :;
>   return;
>
> }
> [cut]
>
> --
> Rask Ingemann Lambertsen
Re: libgfortran build broken on Darwin ppc
Geoff,
   If the autoconf patch isn't going into gcc trunk, would someone at Apple
please nudge the folks who maintain www.opensource.apple.com to post the
Xcode Tools 2.4 source code release? Either that or post a new cctools based
off the same to the gcc ftp site. We really need to be able to create a new
odcctools release in sync with Xcode 2.4.
            Jack
Re: powerpc targets, long double implementation, and c++ programs
Joseph S. Myers wrote:
> On Fri, 8 Sep 2006, Edmar Wienskoski wrote:
>
>> Ok. I am starting to see the whole picture now.
>> So the whole thing appears to work with --disable-shared, just because of
>> the way the linker loads symbols in the presence of libgcc_s.so versus
>> libgcc.a.
>>
>> Follow up question:
>> The e500 ABI actually defines long double to be 128-bit floats.
>> In rs6000.c, rs6000_init_libfuncs links to __gcc_qadd because of
>> TARGET_HARD_FLOAT; shouldn't that be TARGET_HARD_FLOAT && TARGET_FPRS,
>> and also have:
>>
>> diff -u t-fprules-softfp~ t-fprules-softfp
>> --- t-fprules-softfp~ 2006-08-09 14:20:24.0 -0500
>> +++ t-fprules-softfp 2006-09-06 12:39:17.0 -0500
>> @@ -1,4 +1,4 @@
>> -softfp_float_modes := sf df
>> +softfp_float_modes := sf df tf
>>  softfp_int_modes := si di
>>  softfp_extensions := sfdf
>>  softfp_truncations := dfsf
>>
>> Would that be right ?
>
> No.
>
> (a) The existing GNU/Linux ABIs use or are intended to use IBM long
> double, not IEEE long double, and the E500 GNU/Linux ABI should be
> compatible with the other ABIs in this regard. The present formal ABI
> documents are not very relevant to the de facto GNU/Linux ABIs.

Well, actually this is part of the problem. We have only one document:
the "e500 Sys V ABI", which was intended to create only one ABI.

> (b) To use IEEE long double with soft-fp you'll need to add sftf dftf to
> softfp_extensions and tfdf tfsf to softfp_truncations.

Humm.

> (c) If using IEEE long double on PowerPC, you should be using the standard
> _q_* functions defined in the psABI, and not the __*tf* functions at all.
> glibc does provide the _q_* functions (albeit with a typo meaning _q_utoq
> is missing), though since they don't get built with -mabi=ieeelongdouble
> they aren't actually usable.

There are 2 issues here:

First, it is libgcc that is generating undefined references to __*tf*
functions. If gcc can provide them with "softfp_float_modes := sf df tf",
I think it is reasonable to do that. For completeness' sake you can do as
you suggested: change softfp_extensions and softfp_truncations, but they
are not absolutely necessary.

Second, there is the long double ABI problem. In the past gcc always
generated function calls to _q_* functions. (Per ABI Chapter 5)
For this code:

    long double foo (long double x, long double y){ return x + y; }

gcc-4.0, target powerpc-eabise and
gcc-4.0, target powerpc-*-linux-gnuspe with the -mlong-double-128 option
both generate a call to _q_add.
The same code with gcc-4.2, on both targets, generates a call to __gcc_qadd.

If there is an intention to change the E500 ABI, then somebody has to step
forward and actually change the document (with all the administrative
burden that comes with it..).

Edmar
Re: -ftree-vectorize can't vectorize plus?
On 9/11/06, Dorit Nuzman <[EMAIL PROTECTED]> wrote:
> >A silly little testcase which the vectorizer doesn't vectorize:
> >
> > autovecttest.c:11: note: not vectorized: relevant stmt not
> > supported: D.1861_9 = (signed char) D.1860_8
>
> Can these type casts (from uchar to schar and back) be cleaned away by some
> pass before vectorization,

Uh, what do you mean "cleaned away"?
You can't just legally ignore them, they are changing the overflow behavior.
Re: powerpc targets, long double implementation, and c++ programs
> Edmar Wienskoski writes:

Edmar> Second, is the long double ABI problem. In the past gcc always
Edmar> generated function calls to _q_* functions. (Per ABI Chapter 5)
Edmar> For this code:
Edmar> long double foo (long double x, long double y){ return x + y; }
Edmar> gcc-4.0, target powerpc-eabise and
Edmar> gcc-4.0, target powerpc-*-linux-gnuspe with -mlong-double-128 option
Edmar> both generate a call to _q_add.
Edmar> The same code with gcc-4.2, both targets generate a call to
Edmar> __gcc_qadd.

Edmar> If there is an intention to change the E500 ABI, then somebody has to
Edmar> step forward and actually change the document (With all the
Edmar> administrative burden that comes with it..).

The PowerPC Linux ABI uses IBM long double format. The PowerPC SVR4
Supplement defines IEEE long double, but no library actually implemented
the _q_* functions, or at least none was generally available.

If Freescale wants users who configure with powerpc-*-linux-gnuspe to be
compatible with the rest of PowerPC GNU+Linux, it needs to accept IBM long
double. This is defined in the PowerPC Linux ABI, regardless of SVR4.

IEEE long double also is not very efficient on PowerPC, and few people (if
any) could use it because of the lack of library support. An unimplemented
ABI is not very useful, except in theory.

David
Re: powerpc targets, long double implementation, and c++ programs
On Mon, 11 Sep 2006, Edmar Wienskoski wrote:

> There are 2 issues here:
> First, it is libgcc that is generating undefined references to __*tf*
> functions. If gcc can provide them with "softfp_float_modes := sf df tf", I
> think it is reasonable to do that. For completeness' sake you can do as you
> suggested: change softfp_extensions and softfp_truncations, but they are not
> absolutely necessary.

softfp_extensions and softfp_truncations are needed for conversions between
float/double and long double.

> Second, is the long double ABI problem. In the past gcc always generated
> function calls to _q_* functions. (Per ABI Chapter 5)
> For this code:
> long double foo (long double x, long double y){ return x + y; }
> gcc-4.0, target powerpc-eabise and
> gcc-4.0, target powerpc-*-linux-gnuspe with -mlong-double-128 option
> both generate a call to _q_add.
> The same code with gcc-4.2, both targets generate a call to __gcc_qadd.

__gcc_* are for IBM long double. _q_* are for IEEE long double. __*tf* are
for an unspecified version of long double *where there aren't ABI-specified
functions*; generating calls to them on PowerPC Linux is generally a GCC
bug.

For IEEE long double, it's definitely better to use _q_* from glibc since
they support rounding modes and exceptions. (This in general is an advantage
of using soft-fp code in glibc where the glibc version is known to have a
good copy, but doing it in general - for float and double - requires extra
configure support to be implemented in GCC, and some changes in glibc.)

I personally prefer IEEE long double to IBM long double, but in the context
of the existing family of PowerPC ABIs currently supported by glibc and GCC
I think trying to use it on Linux is a mistake; if used it should have a
separate family of target triplets whose names explicitly indicate a new
ABI.

--
Joseph S. Myers
[EMAIL PROTECTED]
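
To make the point about softfp_extensions and softfp_truncations concrete,
here is a minimal C sketch of the operations involved. The function names
are hypothetical and chosen only for illustration. The comments pairing each
conversion with the sftf/dftf and tfdf/tfsf entries mentioned earlier in the
thread assume the usual source-mode-then-destination-mode naming; the
snippet itself does not check which library routines a given target would
call.

```c
/* Widening conversions: the "extension" direction, float or double to
   long double (assumed to correspond to the sftf and dftf entries). */
long double widen_float(float f)
{
    return f;
}

long double widen_double(double d)
{
    return d;
}

/* Narrowing conversions: the "truncation" direction, long double to
   double or float (assumed to correspond to the tfdf and tfsf entries). */
double narrow_to_double(long double x)
{
    return (double) x;
}

float narrow_to_float(long double x)
{
    return (float) x;
}

/* Pure long double arithmetic, as in the foo example from the thread,
   involves no conversion between floating-point types at all. */
long double add_long_double(long double x, long double y)
{
    return x + y;
}
```

On a soft-float configuration that provides only the long double arithmetic
routines, a function like add_long_double could presumably link while the
four conversion functions would leave undefined references, which seems to
be the situation the reply above is warning about.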
Re: -ftree-vectorize can't vectorize plus?
> Can these type casts (from uchar to schar and back) be cleaned away by some
> pass before vectorization, or do we need to teach the vectorizer to ignore
> such type casts?

This was considered to be tree-combiner's responsibility. However, I do not
know the current state and plan of the tree-combiner pass. The tree-combiner
pass, along with other combining activities, will remove unnecessary casts
(if possible).

-
Devang
Re: -ftree-vectorize can't vectorize plus?
"Daniel Berlin" <[EMAIL PROTECTED]> wrote on 11/09/2006 06:27:16 PM: > On 9/11/06, Dorit Nuzman <[EMAIL PROTECTED]> wrote: > > >A silly little testcase which the vectorizer doesn't vectorize: > > > > > > > > autovecttest.c:11: note: not vectorized: relevant stmt not > > > supported: D.1861_9 = (signed char) D.1860_8 > > > > Can these type casts (from uchar to schar and back) be cleaned away by some > > pass before vectorization, > > Uh, what do you mean "cleaned away"? > You can't just legally ignore them, they are changing the overflow behavior. not in the case of xor... I was referring to cases like in the pattern I showed, in which the arguments are cast from unsigned to signed just to perform the xor operation, and the result is cast back to unsigned. Isn't this: unsigned char D.1932 unsigned char D.1936 unsigned char D.1939 D.1933_9 = (signed char) D.1932_8; D.1937_17 = (signed char) D.1936_16; D.1938_18 = D.1937_17 ^ D.1933_9; D.1939_19 = (unsigned char) D.1938_18; the same as this?: D.1939_19 = D.1936_16 ^ D.1932_8 dorit
question about -print-search-dirs
Perhaps a kind person would explain what -print-search-dirs is printing.
The manual entry is not very enlightening.

When I do

    % gcc -print-search-dirs

I get output of which the "libraries=" line lists the following libraries

    libraries: =/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/
    :/usr/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/
    :/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../../x86_64-unknown-linux-gnu/lib/x86_64-unknown-linux-gnu/4.1.1/
    :/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../../x86_64-unknown-linux-gnu/lib/
    :/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../x86_64-unknown-linux-gnu/4.1.1/
    :/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../
    :/lib/x86_64-unknown-linux-gnu/4.1.1/
    :/lib/
    :/usr/lib/x86_64-unknown-linux-gnu/4.1.1/
    :/usr/lib/

Yet if I do

    % gcc -v -o hello hello.c

for a simple "hello world" program, then the libraries listed (following
"-L") are

    -L/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1
    -L/usr/local/gcc-4.1.1/x86_64-Linux/lib/gcc/x86_64-unknown-linux-gnu/4.1.1/../../../../lib64
    -L/lib/../lib64
    -L/usr/lib/../lib64

I guess I would have expected that the two lists of libraries would be the
same, or perhaps that the second list would be contained in the first. But
this does not seem to be the case.

What am I missing?

Kate Minola
University of Maryland, College Park
Re: question about -print-search-dirs
"Kate Minola" <[EMAIL PROTECTED]> writes: > I guess I would have expected that the two lists of libraries would be the > same, > or perhaps that the second list would be contained in the first. But > this does not > seem to be the case. > > What am I missing? gcc only generates a -L option for directories which actually exist, and which actually are directories. Ian
new libjava regression on darwin
Geoff,
   Did you notice that a new libjava regression occurred today on Darwin,
apparently after revision 116838 but by revision 116843? The testcase...

FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test

now fails. Could this be related to your change...

r116639 | geoffk | 2006-09-01 15:52:10 -0400 (Fri, 01 Sep 2006) | 3 lines

        * testsuite/libjava.jni/jni.exp (gcj_jni_invocation_test_one): Pass
        -lgcj to linker for C++ files on Darwin.

FYI.
            Jack
Re: new libjava regression on darwin
On 11/09/2006, at 3:51 PM, Jack Howarth wrote:

> Geoff,
>    Did you notice that a new libjava regression occurred today on Darwin,
> apparently after revision 116838 but by revision 116843? The testcase...
>
> FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test
>
> now fails. Could this be related to your change...
>
> r116639 | geoffk | 2006-09-01 15:52:10 -0400 (Fri, 01 Sep 2006) | 3 lines
>
>         * testsuite/libjava.jni/jni.exp (gcj_jni_invocation_test_one): Pass
>         -lgcj to linker for C++ files on Darwin.

No.
Re: new libjava regression on darwin
> Geoff,
>    Did you notice that a new libjava regression occurred today on Darwin,
> apparently after revision 116838 but by revision 116843? The testcase...
>
> FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test
>
> now fails. Could this be related to your change...

This is just a timeout.

-- Pinski
Re: new libjava regression on darwin
On 11/09/2006, at 3:59 PM, Andrew Pinski wrote:

>> Geoff,
>>    Did you notice that a new libjava regression occurred today on Darwin,
>> apparently after revision 116838 but by revision 116843? The testcase...
>>
>> FAIL: Thread_Sleep -O3 -findirect-dispatch output - bytecode->native test
>>
>> now fails. Could this be related to your change...
>
> This is just a timeout.

It's actually a timing-dependent infinite loop, or at least that's what I
remember from the last time I looked at it several years ago.
debugging tmpdir-gcc.dg-struct-layout-1 failures
   On Darwin PPC at -m64, we are seeing a slew of failures
in the tmpdir-gcc.dg-struct-layout-1 tests...

FAIL: tmpdir-gcc.dg-struct-layout-1/t001 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t003 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t005 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t006 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t008 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t016 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t024 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t026 c_compat_x_tst.o-c_compat_y_tst.o execute
FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o-c_compat_y_tst.o execute

The binaries left in gcc/testsuite/gcc simply abort when run. Can someone
walk me through the process of creating a bug report for these? I don't
quite understand how I could create .i or .s files for these since the
creation of all of the test binaries seems to be automated. Thanks in
advance for any help.
            Jack
Re: debugging tmpdir-gcc.dg-struct-layout-1 failures
On Tue, Sep 12, 2006 at 12:32:35AM -0400, Jack Howarth wrote:
> On Darwin PPC at -m64, we are seeing a slew of failures
> in the tmpdir-gcc.dg-struct-layout-1 tests...
>
> FAIL: tmpdir-gcc.dg-struct-layout-1/t001 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t003 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t005 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t006 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t008 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t016 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t024 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t026 c_compat_x_tst.o-c_compat_y_tst.o execute
> FAIL: tmpdir-gcc.dg-struct-layout-1/t028 c_compat_x_tst.o-c_compat_y_tst.o execute
>
> The binaries left in gcc/testsuite/gcc simply abort when run. Can someone
> walk me through the process of creating a bug report for these? I don't
> quite understand how I could create .i or .s files for these since the
> creation of all of the test binaries seems to be automated. Thanks in
> advance for any help.

Are you testing two compilers against each other (i.e. are you using
ALT_CC_UNDER_TEST or ALT_CXX_UNDER_TEST, respectively)?

In any case, you can build the generated tests with -DDBG (I guess e.g.
RUNTESTFLAGS="struct-layout-1.exp --target-board unix/-DDBG" could work)
and you'll see which test line and which check failed in the testsuite log
dump. Then you can just cut and paste that line from the generated testcase
(the test line number is the first argument of the macro on each line)
into, say, struct-layout-1_test.h and build it like any other compat.exp
testcase.

        Jakub