On Mon, Nov 22, 2010 at 02:33:32PM -0800, David Mathog wrote:
> My software implementation of SSE2 now passes all the testsuite
> programs.  In case anybody else ever needs this, it is here:
>
> http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/soft_emmintrin.h
>
> I compiled that with a target program and gprof showed
> all the time in resulting binary in the inlined functions.  It ran about
> 4X slower than the SSE2 hardware version, which is about what I
> expected.  So, so far so good.  What I am worried about now is that
> since it was invoked with "-msse2" the compiler may still be generating
> SSE2 calls within the inlined functions.  Is there a way to definitively
> disable this but still retain -msse2 on the command line?
>
> For instance, here is one of the software version inline functions:
>
> /* vector subtract the two doubles in an __m128d */
> static __inline __m128d __attribute__((__always_inline__))
> _mm_sub_pd (__m128d __A, __m128d __B)
> {
>   return (__m128d)((__v2df)__A - (__v2df)__B);
> }

Use target attributes (or pragmas):

static __inline __m128d __attribute__((__always_inline__,__target__("no-sse2")))
_mm_sub_pd (__m128d __A, __m128d __B)
{
  return (__m128d)((__v2df)__A - (__v2df)__B);
}

or:

#pragma GCC push_options
#pragma GCC target ("no-sse2")

static __inline __m128d __attribute__((__always_inline__))
_mm_sub_pd (__m128d __A, __m128d __B)
{
  return (__m128d)((__v2df)__A - (__v2df)__B);
}

#pragma GCC pop_options

> In the original gcc emmintrin.h that called a builtin _explicitly_.  I
> also want to avoid having the compiler use the same builtin
> _implicitly_.  If it uses SSE, 3DNOW or MMX implicitly, in this example,
> that would be fine, it just cannot use any SSE2 hardware.
>
> Actually, one thing I was never very clear on, do -msse2 -m3dnow
> etc. only provide access to the corresponding machine operations through
> the _mm* (or whatever) definitions in the header file, or does the
> compiler also figure out vector operations by itself during the
> optimization phase of compilation?

If -msse2 is used on the command line or inside of a target attribute/pragma,
the compiler feels free to use the sse2 instructions in any fashion, including
when vectorizing.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com	fax +1 (978) 399-6899
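
To illustrate the point about vectorizing, here is a minimal sketch. The name sub_arrays is made up for this example and is not part of soft_emmintrin.h. The function is plain scalar C with no intrinsics and no vector types. When it is built with -msse2 and the vectorizer enabled (for example at -O3), the compiler may still choose SSE2 instructions for the loop; whether it actually does depends on the GCC version and the options, so inspecting the output of "gcc -O3 -msse2 -S" is the way to check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration only: subtract b[i] from a[i]
   for n elements using ordinary scalar code.  Nothing here asks for
   SSE2 explicitly; any SSE2 use would come from the compiler itself. */
void sub_arrays(double *dst, const double *a, const double *b, size_t n)
{
    size_t i;

    /* Precondition: all three pointers must be valid. */
    assert(dst != NULL && a != NULL && b != NULL);

    for (i = 0; i < n; i++)
        dst[i] = a[i] - b[i];
}
```

Placing a function like this inside a "#pragma GCC target" region, or giving it a target attribute as shown above, is how the instruction set would be restricted for it alone.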