------- Comment #5 from matz at gcc dot gnu dot org 2010-09-07 13:42 -------

Since Ira implemented unaligned support in SLP mode we get somewhat further,
but not much. If complete unrolling is active, then we can't disambiguate
between *s and *(s+stride). That is correct, because stride is unknown and
might be < 8. The problem is that the code generated by unrolling looks like so:
  b_1[0] = s1_2[0]...
  b_1[1] = s1_2[1]...
  ...
  b_3 = b_1 + 8;
  s1_4 = s1_2 + stride;
  b_3[0] = s1_4[0]...
  b_3[1] = s1_4[1]...

Now SLP checks for dependencies between the accesses in the first block and
those in the second block. Although these are really uninteresting for SLP,
they nevertheless prevent SLPing here, because the dependencies can't be
computed.

Deactivating loop unrolling reveals another problem, namely that SLP doesn't
support multiple types at all; see vect_build_slp_tree.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43434
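
For illustration, here is a minimal sketch in C of the kind of loop that unrolls into the shape shown above. It is not the testcase from this PR: the names copy_rows and blocks_overlap and the fixed 8-byte block width are assumptions made for this example only. The helper shows why an unknown stride defeats disambiguation: whenever |stride| < 8, the 8-byte block at s1 and the one at s1 + stride share bytes, so the two unrolled blocks cannot be proven independent without knowing stride.

```c
#include <stddef.h>

/* Hypothetical helper: do the byte ranges [0, width) and
   [stride, stride + width) share at least one byte?  */
static int blocks_overlap(ptrdiff_t stride, ptrdiff_t width)
{
    return stride > -width && stride < width;
}

/* Hypothetical kernel: copy `rows` blocks of 8 bytes.  The destination
   advances by a constant 8, the source by a stride that is only known at
   run time.  Completely unrolling the inner loop yields straight-line
   code of the b_1[k] = s1_2[k] form.  */
static void copy_rows(unsigned char *b, const unsigned char *s1,
                      ptrdiff_t stride, int rows)
{
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < 8; j++)
            b[j] = s1[j];
        b += 8;        /* constant step: blocks in b never overlap */
        s1 += stride;  /* unknown step: blocks in s1 may overlap */
    }
}
```

With a stride of 4, for example, the second source block starts in the middle of the first one, whereas a stride of 8 or more keeps the blocks disjoint.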