Hi Yavor,

On Thu, 22 Mar 2018 12:04:33 +0200, Yavor Doganov <ya...@gnu.org> wrote:
> Frédéric Bonnard wrote:
> > I'm not an altivec expert but I was interested to look into this and
> > maybe help.
>
> Many thanks for taking the time and effort, this is exactly the
> response I was hoping for.
>
> > When I compile :
> > __vector unsigned char Va = vec_pack(vec_pack(vec_ctu(Vperma, 0),
> > vec_splat_u32(0)), vec_splat_u16(0));
> > __vector unsigned char Vb = vec_pack(vec_pack(vec_ctu(Vpermb, 0),
> > vec_splat_u32(0)), vec_splat_u16(0));
> >
> > and print those, I have :
> > Va : 3 3 8 8 0 0 0 0 0 0 0 0 0 0 0 0
> > Vb : 10 10 10 10 0 0 0 0 0 0 0 0 0 0 0 0
>
> This looks bogus; either a bug in my code or the unfortunate result of
> the type conversion.  Knowing me, I would definitely bet it's the
> former :)
>
> Out of curiosity, how do you print vectors?  Is there a specific
> function or you write your own?

Good question: I linked against libvecpf. With it, %vf displays the
floats and %vlX the hexadecimal representation.

> > So with this :
> > vector unsigned char Va = { 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7
> > };
> > vector unsigned char Vb = { 8, 9, 10, 11, 8, 9, 10, 11, 12, 13, 14, 15, 12,
> > 13, 14, 15};
> > we get the following indexes :
> > Va : 0 1 2 3 0 1 2 3 4 5 6 7 4 5 6 7
> > Vb : 8 9 a b 8 9 a b c d e f c d e f
> >
> > which extracts good looking floats.
>
> Thanks, I will use your version of the patch.

Maybe a representation closer to the initial four hex values would be
nice too.

> > I also extracted part of the computation code to test the computation
> > done with some random floats and check if the results make some sense which
> > seems to be the case (in terms of addition, multiplication, load/store).
> > I also did this on powerpc, ppc64 and ppc64el to see if I had some
> > endianness issue, but I got the same results on the 3 archs.
>
> Thanks for doing this, I was going to ask if it's possible to check
> for endianness bugs as well but thought it would be too pushy.  As
> ppc64el is a relatively new architecture it's quite common to see code
> assuming big endian (powerpc/ppc64).
>
> If my grasp of GCC's configuration is correct, this particular snippet
> is conditionally compiled only on ppc64el because -mvsx is the default
> which implies -maltivec.  This is not the case for the other PowerPC
> ports which (should) support machines without AltiVec, so it would be
> a bug in the package if it assumed AltiVec everywhere.

Indeed, I had to pass -maltivec on powerpc and ppc64.

> Anyway, it's good to know that it works as expected on the other
> PowerPC ports.

Well, consistency seems preserved at least: either it's all correct on
the 3 archs or it's all wrong :) It needs more testing or some knowledge
of the computations done (see below).
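
For anyone without a PowerPC box at hand, the two index vectors quoted
above can be sanity-checked with a scalar stand-in for vec_perm. This is
only a minimal sketch, not the package's code: emu_vec_perm, IDX_A, IDX_B
and permute_floats are hypothetical names, and it assumes 4-byte floats
and bytes numbered in memory order. Under those assumptions the Va
indexes duplicate the first two floats of a vector and the Vb indexes
duplicate the last two.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar stand-in for vec_perm(a, b, idx): each result byte is selected
   from the 32-byte concatenation a||b by the low 5 bits of the matching
   index byte (assumption: bytes are numbered in memory order). */
static void emu_vec_perm(const uint8_t a[16], const uint8_t b[16],
                         const uint8_t idx[16], uint8_t out[16])
{
    assert(a && b && idx && out);
    for (int i = 0; i < 16; i++) {
        uint8_t sel = idx[i] & 0x1f;
        out[i] = (sel < 16) ? a[sel] : b[sel - 16];
    }
}

/* The index vectors from the message (Va and Vb). */
static const uint8_t IDX_A[16] = { 0, 1, 2, 3, 0, 1, 2, 3,
                                   4, 5, 6, 7, 4, 5, 6, 7 };
static const uint8_t IDX_B[16] = { 8, 9, 10, 11, 8, 9, 10, 11,
                                   12, 13, 14, 15, 12, 13, 14, 15 };

/* Apply an index vector to four floats, using the same vector for both
   vec_perm inputs (assumption: sizeof(float) == 4). */
static void permute_floats(const float in[4], const uint8_t idx[16],
                           float out[4])
{
    uint8_t src[16], dst[16];
    memcpy(src, in, sizeof src);
    emu_vec_perm(src, src, idx, dst);
    memcpy(out, dst, sizeof dst);
}
```

Calling permute_floats on {f0, f1, f2, f3} with IDX_A gives
{f0, f0, f1, f1}, and with IDX_B gives {f2, f2, f3, f3}. It says nothing
about whether these are the indexes the original Mac OS X code intended.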

> > The best would be to test all this for real by running lynkeos.app's
> > deconvolution, or at least compile part of the original code on Mac
> > OS X and check if the indexes used here give the same results
> > compared to the original ones.
>
> I don't have access to Mac OS X and wouldn't want to use it even if I
> had.

:D

> It is natural to ask the upstream author to perform this test
> but his email bounces.  And I never received a response from the
> person who ported the 1.x series to GNUstep.

Pity.

> If you have some spare time and direct access to a GNU/Linux PowerPC
> machine, perhaps you can compare the results (visually) with another
> common architecture by processing the same image?  Or on powerpc/ppc64
> with and without AltiVec.

Will try.
I feel somewhat unsatisfied to have no certainty at this point, but
wanted to give you some feedback early.

F.