On Tue, 27 May 2014, Bin.Cheng wrote:

> There are some other similar cases in vectorizer and all of them look
> suspicious since intuitively, vectorizer should neither care about
> target endianess nor do such shuffle.  Anyway, this is how we do
> vectorization currently.

Agreed.  The semantics of GIMPLE and RTL operations (and
architecture-independent built-in functions / generic vector extensions
in GNU C) are generally meant to be architecture-independent and
endianness-independent (including that in GIMPLE and RTL, vector lane
numbers always use array ordering).  I don't see anything in the
definition of VEC_WIDEN_MULT_* in tree.def that would make those
definitions endian-dependent.

Fixes should be on the basis of:

* Ensure the architecture-independent, endianness-independent semantics
  of the relevant GIMPLE and RTL operations are well-defined.

* Fix any code, whether in architecture-independent or
  architecture-dependent parts of the compiler, that deviates from those
  architecture-independent, endianness-independent semantics.  It's the
  back end's responsibility to map from those semantics to the semantics
  of the actual machine instructions.

It may be the case that on some architectures endianness affects what
vector operations are available.  (Once you define how a particular
vector machine mode is represented in registers, it could be the case
that a load instruction means, in terms of GCC IR, "load" for one
endianness but "permuting load" for the other endianness, for example -
see the various past discussions of issues with big-endian NEON.)  But
this isn't a case for the vectorizer caring about endianness - rather, it
simply needs to ask the back end about the available operations, and
adapt to what's available.

-- 
Joseph S. Myers
jos...@codesourcery.com
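
As an illustration of the array-ordering point made above, here is a minimal sketch using the GNU C generic vector extension. The type name v4si and the helper names load_v4si and lanes_match_array_order are hypothetical, chosen only for this example; nothing here is taken from the GCC sources. It only shows that lane i of a vector corresponds to element i of the array it was loaded from, whatever the target endianness. It does not attempt to model VEC_WIDEN_MULT_*.

```c
#include <stdint.h>
#include <string.h>

/* Four 32-bit lanes, declared with the GNU C generic vector extension.
   The name v4si is an assumption made for this sketch.  */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: load four array elements into a vector.
   The vector has the same memory layout as the array, so lane i
   receives src[i].  */
static v4si
load_v4si (const int32_t *src)
{
  v4si v;
  memcpy (&v, src, sizeof v);
  return v;
}

/* Hypothetical check: return 1 if, after an element-wise add, lane i
   holds 2 * a[i] for every i.  Lane numbers follow array ordering, so
   the result should not depend on the target's byte order.  */
static int
lanes_match_array_order (void)
{
  const int32_t a[4] = { 10, 20, 30, 40 };
  v4si v = load_v4si (a);

  v = v + v;                    /* element-wise, lane by lane */

  for (int i = 0; i < 4; i++)
    if (v[i] != 2 * a[i])       /* subscript selects lane i */
      return 0;
  return 1;
}
```

How the back end realises the load and the add in machine instructions (a plain load, a permuting load, and so on) is its own business, provided the lane numbering seen at this level is preserved.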