Hi Tim,

Actually, I left out another very good reason why you may want to use vec_vsx_ld/st. Sorry for forgetting this.

As you saw, vec_ld translates into the lvx instruction. This instruction loads a sequence of 16 bytes into a vector register. For big endian, the first byte in memory is loaded into the high-order byte of the register. For little endian, the first byte in memory is loaded into the low-order byte of the register.

This is fine if the data you are loading are arrays of characters, but is not so fine if you are loading arrays of larger items. Suppose you are loading four integers {1, 2, 3, 4} into a register with lvx. In big endian you will see:

  00 00 00 01 00 00 00 02 00 00 00 03 00 00 00 04

In little endian you will see:

  04 00 00 00 03 00 00 00 02 00 00 00 01 00 00 00

But for this to be interpreted as a vector of integers ordered for little endian, what you really want is:

  00 00 00 04 00 00 00 03 00 00 00 02 00 00 00 01

If you use vec_vsx_ld, the compiler will generate an lxvd2x instruction followed by an xxpermdi that swaps the doublewords. After the lxvd2x you will have:

  00 00 00 02 00 00 00 01 00 00 00 04 00 00 00 03

because the two LE doublewords are loaded in BE (reversed) order. Swapping the two doublewords restores sanity:

  00 00 00 04 00 00 00 03 00 00 00 02 00 00 00 01

So, even if your data is properly aligned, using vec_ld (that is, lvx) is only correct if you are loading arrays of bytes. Arrays of anything larger must use vec_vsx_ld to avoid errors.

Again, sorry for my previous omission!

Thanks,

Bill Schmidt, Ph.D.
IBM Linux Technology Center

On Fri, 2015-03-13 at 15:42 +0000, Ewart Timothée wrote:
> thank you very much for this answer.
> I know my memory is aligned so I will use vec_ld/st only.
>
> best
>
> Tim
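
For anyone who wants to check the byte dumps by hand, the VSX doubleword load followed by the xxpermdi swap described above can be modelled in portable C. This is only a sketch under stated assumptions: it does not use the AltiVec/VSX intrinsics or touch real vector registers, the register is just a 16-byte array written high-order byte first (as in the dumps), and every name in it (vreg, store_le32_array, model_vsx_dw_load_le, model_swap_dw, format_vreg) is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Register image, written high-order byte first, as in the dumps above. */
typedef struct { uint8_t b[16]; } vreg;

/* Lay out four 32-bit integers in memory in little-endian byte order,
   independent of the host's own endianness. */
static void store_le32_array(const uint32_t vals[4], uint8_t mem[16])
{
    for (int e = 0; e < 4; e++)
        for (int i = 0; i < 4; i++)
            mem[4 * e + i] = (uint8_t)(vals[e] >> (8 * i));
}

/* Assumed model of the VSX doubleword load in little-endian mode:
   the doubleword at the lower address lands in the high-order half of
   the register, and each doubleword is read as a little-endian value,
   so its bytes appear reversed when written high-order byte first. */
static vreg model_vsx_dw_load_le(const uint8_t mem[16])
{
    vreg r;
    assert(mem != NULL);
    for (int dw = 0; dw < 2; dw++)
        for (int i = 0; i < 8; i++)
            r.b[8 * dw + i] = mem[8 * dw + (7 - i)];
    return r;
}

/* Model of the xxpermdi step: exchange the two doublewords. */
static vreg model_swap_dw(vreg in)
{
    vreg r;
    memcpy(r.b, in.b + 8, 8);
    memcpy(r.b + 8, in.b, 8);
    return r;
}

/* Render the register as "xx xx ... xx" (47 characters plus NUL). */
static void format_vreg(vreg r, char out[48])
{
    static const char hex[] = "0123456789abcdef";
    for (int i = 0; i < 16; i++) {
        out[3 * i]     = hex[r.b[i] >> 4];
        out[3 * i + 1] = hex[r.b[i] & 0x0f];
        out[3 * i + 2] = (i < 15) ? ' ' : '\0';
    }
}
```

Feeding {1, 2, 3, 4} through store_le32_array, then model_vsx_dw_load_le, then model_swap_dw and formatting each intermediate result reproduces the last two dumps above. The model deliberately covers only the load-plus-swap sequence; it makes no claim about what lvx does.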