We get the following failures on powerpc64-suse-linux:

FAIL: gcc.dg/vect/vect-46.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-50.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-52.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-77a.c scan-tree-dump-times vectorized 1 loops 1

The access function that the evolution analyzer returns for the pointers in
these loops is more complicated when the compiler is configured for 64bit
(powerpc64-suse-linux) than when it is configured for 32bit
(powerpc-suse-linux).

When the compiler is configured for 32bit, the pointer arithmetic in the loop
looks like:

      # i_1 = PHI <i_24(5), 0(3)>;
    <L0>:;
      i.4_6 = (unsigned int) i_1;
      D.1588_7 = i.4_6 * 4;
      D.1589_8 = (afloat * restrict) D.1588_7;
      D.1591_15 = D.1589_8 + pb_14;
      ... = *D.1591_15;
      ...
      i_24 = i_1 + 1;
      if (n_3 > i_24) goto <L9>; else goto <L10>;

The access function that is computed for the pointer is:

    Access function of ptr: {pb_14, +, 4B}_1

which is simple enough, and the loop is vectorized.

On the other hand, when the compiler is configured for 64bit, the pointer
arithmetic in the loop looks like:

      # i_1 = PHI <i_24(5), 0(3)>;
    <L0>:;
      D.1816_6 = (long unsigned int) i_1;
      D.1817_7 = D.1816_6 * 4;
      D.1818_8 = (afloat * restrict) D.1817_7;
      D.1820_15 = D.1818_8 + pb_14;
      ... = *D.1820_15;
      ...
      i_24 = i_1 + 1;
      if (n_3 > i_24) goto <L9>; else goto <L10>;

and in this case the access function that is computed for the pointer is:

    Access function of ptr:
      (afloat * restrict) ((long unsigned int) {0, +, 1}_1 * 4) + pb_14

The vectorizer does not handle such access functions at the moment, and
therefore fails to vectorize the loop:

    loop at vect-46.c:37: not vectorized: pointer access is not simple.
    loop at vect-46.c:37: not vectorized: unhandled data ref: D.1821_16 = *D.1820_15
    loop at vect-46.c:37: bad data references.

These loops should be marked xfail for now for ppc64-linux.

One of the following would allow vectorizing these loops:

- The evolution analyzer knows to ignore the cast to (unsigned int) when it
  builds the access function, but it doesn't ignore the cast to
  (long unsigned int).
  If this cast can be avoided when building the access function, it would be
  simple enough to handle later on.
- Enhance the vectorizer to digest such access functions.

-- 
           Summary: FAILs to vectorize testcases on ppc64-linux
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: regression
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: dorit at il dot ibm dot com
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: powerpc64-suse-linux
  GCC host triplet: powerpc64-suse-linux
GCC target triplet: powerpc64-suse-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18403
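
To make the difference between the two dumps in the description concrete, here is a minimal sketch in C that mimics the two offset computations outside the compiler. It is only an illustration: the function names (offset_32bit, offset_64bit, matches_simple_evolution) are hypothetical, fixed-width types stand in for `unsigned int` and `long unsigned int` on the two targets, and nothing here reflects how the evolution analyzer is actually implemented. It checks that, for a non-negative counter, the widened 64bit offset still starts at 0 and advances by 4 bytes per iteration, which is the shape of the simple access function {pb_14, +, 4B}_1 once pb_14 is added.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset as in the 32bit dump: D.1588_7 = (unsigned int) i_1 * 4.
   uint32_t stands in for a 32-bit unsigned int (assumption).  */
uint32_t offset_32bit(int32_t i)
{
    return (uint32_t)i * 4u;
}

/* Byte offset as in the 64bit dump: D.1817_7 = (long unsigned int) i_1 * 4.
   uint64_t stands in for a 64-bit long unsigned int (assumption).  */
uint64_t offset_64bit(int32_t i)
{
    return (uint64_t)i * 4u;
}

/* Returns 1 if, for every i in [0, n), both offset computations start at 0
   and advance by exactly 4 bytes per iteration, i.e. behave like {0, +, 4};
   returns 0 otherwise.  */
int matches_simple_evolution(int32_t n)
{
    uint64_t expected = 0;
    for (int32_t i = 0; i < n; i++) {
        if (offset_64bit(i) != expected)
            return 0;
        if ((uint64_t)offset_32bit(i) != expected)
            return 0;
        expected += 4;
    }
    return 1;
}
```

For example, matches_simple_evolution(1000) returns 1, so over that range the cast to the wider type does not change the sequence of addresses, only the form of the expression the analyzer has to see through.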