Frédéric Bonnard wrote:
> I'm not an altivec expert but I was interested to look into this and
> maybe help.

Many thanks for taking the time and effort, this is exactly the
response I was hoping for.

> When I compile :
> __vector unsigned char Va = vec_pack(vec_pack(vec_ctu(Vperma, 0),
> vec_splat_u32(0)), vec_splat_u16(0));
> __vector unsigned char Vb = vec_pack(vec_pack(vec_ctu(Vpermb, 0),
> vec_splat_u32(0)), vec_splat_u16(0));
>
> and print those, I have :
> Va : 3 3 8 8 0 0 0 0 0 0 0 0 0 0 0 0
> Vb : 10 10 10 10 0 0 0 0 0 0 0 0 0 0 0 0

This looks bogus; either a bug in my code or the unfortunate result of
the type conversion.  Knowing me, I would definitely bet it's the
former :)

Out of curiosity, how do you print vectors?  Is there a specific
function, or do you write your own?

> So with this :
> vector unsigned char Va = { 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7 };
> vector unsigned char Vb = { 8, 9, 10, 11, 8, 9, 10, 11, 12, 13, 14, 15, 12,
> 13, 14, 15};
> we get the following indexes :
> Va : 0 1 2 3 0 1 2 3 4 5 6 7 4 5 6 7
> Vb : 8 9 a b 8 9 a b c d e f c d e f
>
> which extracts good looking floats.

Thanks, I will use your version of the patch.

> I also extracted part of the computation code to test the computation
> done with some random floats and check if the results make some sense which
> seems to be the case (in terms of addition, multiplication, load/store).
> I also did this on powerpc, ppc64 and ppc64el to see if I had some
> endianness issue, but I got the same results on the 3 archs.

Thanks for doing this, I was going to ask if it's possible to check for
endianness bugs as well but thought it would be too pushy.  As ppc64el
is a relatively new architecture, it's quite common to see code assuming
big endian (powerpc/ppc64).

If my grasp of GCC's configuration is correct, this particular snippet
is conditionally compiled only on ppc64el, because there -mvsx is the
default, which implies -maltivec.
This is not the case for the other PowerPC ports which (should) support
machines without AltiVec, so it would be a bug in the package if it
assumed AltiVec everywhere.  Anyway, it's good to know that it works as
expected on the other PowerPC ports.

> The best would be to test all this in real by running lynkeos.app's
> deconvulation, or at least compile part of the original code on Mac
> OS X and check if the indexes used here gives the same results
> compared to the original ones.

I don't have access to Mac OS X and wouldn't want to use it even if I
had.  It is natural to ask the upstream author to perform this test but
his email bounces.  And I never received a response from the person who
ported the 1.x series to GNUstep.

If you have some spare time and direct access to a GNU/Linux PowerPC
machine, perhaps you can compare the results (visually) with another
common architecture by processing the same image?  Or on powerpc/ppc64
with and without AltiVec.
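
For comparing against a non-AltiVec build, here is a rough sketch in
plain C of what the two index tables from your patch select.  It only
emulates the byte selection (each index picks one byte out of the 32
bytes formed by two 16-byte sources); it does not use the real vec_perm
intrinsic, and perm_bytes, print_bytes, dup_pairs, VA and VB are names
I made up for the illustration.  I am also assuming that both operands
of the permute are the same vector, which is all these indexes (0..15)
can reach anyway.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Index tables copied from the patch quoted above. */
const unsigned char VA[16] = { 0, 1, 2, 3, 0, 1, 2, 3,
                               4, 5, 6, 7, 4, 5, 6, 7 };
const unsigned char VB[16] = { 8, 9, 10, 11, 8, 9, 10, 11,
                               12, 13, 14, 15, 12, 13, 14, 15 };

/* Hypothetical stand-in for a byte permute: out[i] is the byte number
   (idx[i] & 0x1f) of the 32 bytes made of a followed by b. */
void perm_bytes(const unsigned char a[16], const unsigned char b[16],
                const unsigned char idx[16], unsigned char out[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned sel = idx[i] & 0x1f;
        out[i] = sel < 16 ? a[sel] : b[sel - 16];
    }
}

/* Print the 16 bytes in hex, in the same layout as the dumps above. */
void print_bytes(const char *name, const unsigned char v[16])
{
    printf("%s :", name);
    for (int i = 0; i < 16; i++)
        printf(" %x", v[i]);
    printf("\n");
}

/* Apply an index table to four packed floats.  Each group of four
   indexes copies one whole float, so the result does not depend on
   the byte order of the machine. */
void dup_pairs(const float in[4], const unsigned char idx[16],
               float out[4])
{
    unsigned char src[16], dst[16];

    assert(sizeof(float) * 4 == sizeof src); /* assumes 32-bit float */
    memcpy(src, in, sizeof src);
    perm_bytes(src, src, idx, dst);
    memcpy(out, dst, sizeof dst);
}
```

With in = { 1, 2, 3, 4 }, dup_pairs(in, VA, out) should give
{ 1, 1, 2, 2 } and dup_pairs(in, VB, out) should give { 3, 3, 4, 4 },
and print_bytes("Va", VA) reproduces the second dump quoted above.  If
the AltiVec build extracts the same floats, the indexes are most likely
fine.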