>> ...Ira would know best, but I don't think it would be used for this
>> kind of loop.  It would be more something like:
>>
>>   for (i=0; i<N; ++i)
>>     X[i] = Y[i].red + Y[i].blue + Y[i].green;
>>
>> (not a realistic example).  You'd then have:
>>
>>   compoundY = __builtin_load_lanes (Y);
>>   red = ARRAY_REF <compoundY, 0>
>>   green = ARRAY_REF <compoundY, 1>
>>   blue = ARRAY_REF <compoundY, 2>
>>   D1 = red + green
>>   D2 = D1 + blue
>>   MEM_REF <X> = D2;
>>
>> My understanding is that'd we never do any operations besides ARRAY_REFs
>> on the compound value, and that the individual vectors would be treated
>> pretty much like any other.
>
> Ok, I thought it might be used to have a larger vectorization factor for
> loads and stores, basically make further unrolling cheaper because you
> don't have to duplicate the loads and stores.

Right, we can do that using vld1/vst1 instructions (full load/store
with N=1) and operate on up to 4 doubleword vectors in parallel. But at
the moment we are concentrating on efficient support of strided memory
accesses.

Ira
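
For illustration only, here is a rough scalar model in plain C of the strided (stride-3) access pattern discussed above. It is a sketch under assumptions, not what the vectorizer emits: the names `struct pixel`, `struct compound3`, `load_lanes3`, `sum_channels` and the width `LANES` are all hypothetical, and the arrays stand in for vector registers. The point is only that the compound value is read through per-lane references and that each resulting vector is then used like any other.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* assumed vector width, in elements */

/* Hypothetical element type for Y; field order follows the lane
   numbering in the quoted example (red = 0, green = 1, blue = 2). */
struct pixel { int red, green, blue; };

/* Scalar stand-in for the compound value produced by a load-lanes
   operation: one LANES-wide "vector" per structure field. */
struct compound3 { int lane[3][LANES]; };

/* Model of a stride-3 de-interleaving load: read LANES consecutive
   structures and split their fields into three separate vectors. */
static struct compound3 load_lanes3(const struct pixel *y)
{
    struct compound3 c;
    for (size_t j = 0; j < LANES; ++j) {
        c.lane[0][j] = y[j].red;
        c.lane[1][j] = y[j].green;
        c.lane[2][j] = y[j].blue;
    }
    return c;
}

/* X[i] = Y[i].red + Y[i].green + Y[i].blue, LANES elements at a time. */
static void sum_channels(int *x, const struct pixel *y, size_t n)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        struct compound3 c = load_lanes3(&y[i]);
        for (size_t j = 0; j < LANES; ++j) {
            int d1 = c.lane[0][j] + c.lane[1][j];  /* D1 = red + green */
            x[i + j] = d1 + c.lane[2][j];          /* D2 = D1 + blue */
        }
    }
    for (; i < n; ++i)  /* scalar tail for leftover elements */
        x[i] = y[i].red + y[i].green + y[i].blue;
}
```

Calling `sum_channels (X, Y, N)` handles whole groups of `LANES` structures through the de-interleaving path and any remainder through the scalar tail. On NEON the body of `load_lanes3` would presumably correspond to a single structure load, but that mapping is an assumption here rather than something shown by this sketch.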