My software implementation of SSE2 now passes all the testsuite programs. In case anybody else ever needs this, it is here:
http://saf.bio.caltech.edu/pub/software/linux_or_unix_tools/soft_emmintrin.h

I compiled that with a target program, and gprof showed all the time in the resulting binary being spent in the inlined functions. It ran about 4X slower than the SSE2 hardware version, which is about what I expected. So far, so good.

What I am worried about now is that, since it was invoked with "-msse2", the compiler may still be generating SSE2 instructions within the inlined functions. Is there a way to definitively disable this but still retain -msse2 on the command line? For instance, here is one of the software version inline functions:

    /* vector subtract the two doubles in an __m128d */
    static __inline __m128d __attribute__((__always_inline__))
    _mm_sub_pd (__m128d __A, __m128d __B)
    {
      return (__m128d)((__v2df)__A - (__v2df)__B);
    }

In the original gcc emmintrin.h that called a builtin _explicitly_. I also want to avoid having the compiler use the same builtin _implicitly_. If it uses SSE, 3DNOW or MMX implicitly in this example, that would be fine; it just cannot use any SSE2 hardware.

Actually, one thing I was never very clear on: do -msse2, -m3dnow, etc. only provide access to the corresponding machine operations through the _mm* (or whatever) definitions in the header file, or does the compiler also figure out vector operations by itself during the optimization phase of compilation?

Thank you,

David Mathog
mat...@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech