Thanks. Another question: is there any plan to vectorize loops like the following one?

for (i=127; i>=0; i--) {
  x[i] = y[i] + z[i];
}

I found that GCC trunk still cannot handle a negative step for stores. Even if
it could, the result would not be efficient, because it would introduce
redundant permutations on the loads and the store.

Cheers,
Bingfeng

> -----Original Message-----
> From: Ira Rosen [mailto:i...@il.ibm.com]
> Sent: 10 February 2011 17:22
> To: Bingfeng Mei
> Cc: gcc@gcc.gnu.org
> Subject: Re: Vector permutation only deals with # of vector elements
> same as mask?
>
>
> Hi,
>
> "Bingfeng Mei" <b...@broadcom.com> wrote on 10/02/2011 05:35:45 PM:
> >
> > Hi,
> > I noticed that vector permutation gets more use in GCC
> > 4.6, which is great. It is used to handle negative step
> > by reversing vector elements now.
> >
> > However, after reading the related code, I understood
> > that it only works when the # of vector elements is
> > the same as that of mask vector in the following code.
> >
> > perm_mask_for_reverse (tree-vect-stmts.c)
> >   ...
> >   mask_type = get_vectype_for_scalar_type (mask_element_type);
> >   nunits = TYPE_VECTOR_SUBPARTS (vectype);
> >   if (!mask_type
> >       || TYPE_VECTOR_SUBPARTS (vectype) != TYPE_VECTOR_SUBPARTS (mask_type))
> >     return NULL;
> >   ...
> >
> > For PowerPC altivec, the mask_type is V16QI. It means that
> > compiler can only permute V16QI type. But given the capability of
> > altivec vperm instruction, it can permute any 128-bit type
> > (V8HI, V4SI, etc). We just need convert in/out V16QI from
> > given types and a bit more extra work in producing mask.
> >
> > Do I understand correctly or miss something here?
>
> Yes, you are right. The support of reverse access is somewhat limited.
> Please see vect_transform_slp_perm_load() in tree-vect-slp.c for example of
> all type permutation support.
>
> But, anyway, reverse accesses are not supported for altivec's load
> realignment scheme.
>
> Ira
>
> >
> > Thanks,
> > Bingfeng Mei
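
For illustration only, here is a minimal sketch in plain C of why the
permutations in the negative-step loop above would be redundant. It is not GCC
code and not what the vectorizer emits. The vectorization factor VF, the helper
reverse_lanes and the three function names are hypothetical, vectors are
emulated with small arrays, and x is assumed not to overlap y or z.

```c
#include <string.h>

#define N  128
#define VF 4   /* assumed vectorization factor: four ints per 128-bit vector */

/* Reverse the VF lanes of an emulated vector (stands in for a permute). */
static void reverse_lanes(int v[VF]) {
    for (int k = 0; k < VF / 2; k++) {
        int t = v[k];
        v[k] = v[VF - 1 - k];
        v[VF - 1 - k] = t;
    }
}

/* The scalar loop from the mail: step -1 on both the loads and the store. */
void add_scalar(int *x, const int *y, const int *z) {
    for (int i = N - 1; i >= 0; i--)
        x[i] = y[i] + z[i];
}

/* Straightforward negative-step scheme: reverse each loaded vector so the
   lanes follow iteration order, add, then reverse again before the store. */
void add_with_permutes(int *x, const int *y, const int *z) {
    for (int i = N - VF; i >= 0; i -= VF) {
        int vy[VF], vz[VF], vx[VF];
        memcpy(vy, &y[i], sizeof vy);   /* vector load  */
        memcpy(vz, &z[i], sizeof vz);   /* vector load  */
        reverse_lanes(vy);              /* permute 1    */
        reverse_lanes(vz);              /* permute 2    */
        for (int k = 0; k < VF; k++)
            vx[k] = vy[k] + vz[k];      /* vector add   */
        reverse_lanes(vx);              /* permute 3    */
        memcpy(&x[i], vx, sizeof vx);   /* vector store */
    }
}

/* Same blocks, walked from high to low addresses, with no permutes at all.
   The add is element-wise, so the lane order inside a vector does not matter. */
void add_without_permutes(int *x, const int *y, const int *z) {
    for (int i = N - VF; i >= 0; i -= VF)
        for (int k = 0; k < VF; k++)
            x[i + k] = y[i + k] + z[i + k];
}
```

In this sketch the three reversals per iteration cancel out: the two on the
loads are undone by the one before the store, so all three functions write the
same values to x.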