Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode

Paul Brook Fri, 01 Mar 2013 06:35:32 -0800

> > Do I understand correctly that the "only" issue is memory vs. register
> > element ordering?  Thus a fixup could be as simple as extra shuffles
> > inserted after vector memory loads and before vector memory stores?
> > (with the hope of RTL optimizers optimizing those)?
> 
> It's not even necessary to use explicit shuffles -- NEON has perfectly
> good instructions for loading/storing vectors in the "right" order, in
> the form of vld1 & vst1. I'm afraid the solution to this problem might
> have been staring us in the face for years, which is simply to forbid
> vldr/vstr/vldm/vstm (the instructions which lead to weird element
> permutations in BE mode) for loading/storing NEON vectors altogether.
> That way the vectorizer gets what it wants, the intrinsics can continue
> to use __builtin_shuffle exactly as they are doing, and we get to
> remove all the bits which fiddle vector element numbering in BE mode in
> the ARM backend.
> 
> I can't exactly remember why we didn't do that to start with. I think
> the problem was ABI-related, or to do with transferring NEON vectors
> to/from ARM registers when it was necessary to do that... I'm planning
> to do some archaeology to try to see if I can figure out a definitive
> answer.


The ABI-defined vector types (uint32x4_t etc.) are specified to be in vldm/vstm
order.
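
To make the two orderings concrete, here is a minimal sketch in C that models where each 32-bit memory element ends up for a four-element vector in big-endian mode. It is an illustration under stated assumptions, not code from GCC or the ABI: the function names model_vld1_32 and model_vldm_be are hypothetical, lanes[] is indexed by architectural lane number (lane 0 being the least significant bits of the register), and the vldm model assumes each 64-bit D register is loaded as a single big-endian doubleword.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a vld1.32-style load: memory element i lands in
   lane i, regardless of endianness. */
void model_vld1_32(const uint32_t mem[4], uint32_t lanes[4])
{
    for (size_t i = 0; i < 4; i++)
        lanes[i] = mem[i];
}

/* Hypothetical model of a vldm-style load in big-endian mode: each 64-bit
   half is treated as one big-endian doubleword, so the element at the lower
   address occupies the more significant bits.  The two 32-bit elements
   within each half therefore swap lanes. */
void model_vldm_be(const uint32_t mem[4], uint32_t lanes[4])
{
    for (size_t i = 0; i < 4; i++)
        lanes[i ^ 1] = mem[i];
}
```

Under this model, memory contents {10, 11, 12, 13} come out as lanes {10, 11, 12, 13} from the vld1-style load and as lanes {11, 10, 13, 12} from the vldm-style load, which is the kind of element permutation discussed in the quoted text.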

Paul
