On Mon, Nov 22, 2010 at 02:33:32PM -0800, David Mathog wrote:
> My software implementation of SSE2 now passes all the testsuite
> programs. In case anybody else ever needs this, it is here: 
> 
> http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/soft_emmintrin.h
> 
> I compiled that with a target program, and gprof showed all of the time
> in the resulting binary being spent in the inlined functions.  It ran about
> 4X slower than the SSE2 hardware version, which is about what I
> expected.  So, so far so good.  What I am worried about now is that
> since it was invoked with "-msse2" the compiler may still be generating
> SSE2 calls within the inlined functions.  Is there a way to definitively
> disable this but still retain -msse2 on the command line?  
> 
> For instance, here is one of the software version inline functions:
> 
> /*  vector subtract the two doubles in an __m128d  */
> static __inline __m128d __attribute__((__always_inline__))
> _mm_sub_pd (__m128d __A, __m128d __B)
> {
>   return (__m128d)((__v2df)__A - (__v2df)__B);
> }

Use target attributes (or pragmas):

static __inline __m128d __attribute__((__always_inline__,__target__("no-sse2")))
_mm_sub_pd (__m128d __A, __m128d __B)
{
  return (__m128d)((__v2df)__A - (__v2df)__B);
}

or:

#pragma GCC push_options
#pragma GCC target ("no-sse2")
static __inline __m128d __attribute__((__always_inline__))
_mm_sub_pd (__m128d __A, __m128d __B)
{
  return (__m128d)((__v2df)__A - (__v2df)__B);
}
#pragma GCC pop_options


> In the original gcc emmintrin.h that called a builtin _explicitly_.  I
> also want to avoid having the compiler use the same builtin
> _implicitly_.  If it uses SSE, 3DNOW or MMX implicitly, in this example,
> that would be fine, it just cannot use any SSE2 hardware.
> 
> Actually, one thing I was never very clear on, do -msse2 -m3dnow
> etc. only provide access to the corresponding machine operations through
> the _mm* (or whatever) definitions in the header file, or does the
> compiler also figure out vector operations by itself during the
> optimization phase of compilation?

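If -msse2 is given on the command line, or SSE2 is enabled through a target
attribute/pragma, the compiler is free to use SSE2 instructions anywhere it
sees fit, including when auto-vectorizing ordinary code.  It is not limited to
the _mm* intrinsics defined in the header files.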
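
As a minimal sketch of that last point: the loop below uses no intrinsics at
all, yet with vectorization enabled (for example -O3 together with -msse2) the
compiler may turn it into packed SSE2 instructions on its own.  The name
add_arrays is hypothetical and used only for illustration; it is not part of
soft_emmintrin.h.

```c
#include <stddef.h>

/* Hypothetical example function (not from soft_emmintrin.h).
 * It adds two arrays of doubles element by element.  No _mm_* intrinsic or
 * builtin is called explicitly, but when SSE2 is enabled the vectorizer is
 * allowed to process two doubles per iteration using SSE2 registers.
 * __restrict__ is the GCC spelling that promises the arrays do not overlap,
 * which makes the loop easier to vectorize. */
void add_arrays(double *__restrict__ dst,
                const double *__restrict__ a,
                const double *__restrict__ b,
                size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

One way to check what was actually generated is to compile with -S (something
like "gcc -O3 -msse2 -S file.c") and look through the assembly output for
packed double instructions such as addpd.  Whether they appear depends on the
compiler version and the optimization flags.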

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com     fax +1 (978) 399-6899
