"David Mathog" <[email protected]> writes:
> Ian Lance Taylor <[email protected]>, wrote:
>
>> Tests that directly invoke __builtin functions are not appropriate for
>> your replacement for emmintrin.h.
>
> Clearly. However, I do not see why these are in the test routines in
> the first place. They seem not to be needed. I made the changes below
> my signature, eliminating all of the vector builtins, and the programs
> still worked with both -msse2 and -mno-sse2 plus my software SSE2. If
> anything the test programs are much easier to understand without the
> builtins.

Your changes rely on a gcc extension that is newer than those tests:
only recently did gcc acquire the ability to use [] to access elements
in a vector. I agree that your changes look good, but we rarely change
existing tests unless there is a very good reason, and avoiding
__builtin functions in the gcc testsuite is not in itself a good reason.
These tests were written for gcc; they were not written as general
purpose SSE tests.

> There is also a (big) problem with sse2-vec-2.c (and -2a, which is empty
> other than an #include sse2-vec-2.c). There are no explicit sse2
> operations within this test program. Moreover, the code within the
> tests does not work. Finally, if one puts a print statement anywhere in
> the test that is there, compiles it with:
>
> gcc -msse -msse2
>
> there will be no warnings, and the run will appear to show a valid test,
> but in actuality the test will never execute! This shows part of the
> problem:
>
> gcc -Wall -msse -msse2 -o foo sse2-vec-2.c
> sse-os-support.h:27: warning: 'sse_os_support' defined but not used
> sse2-check.h:10: warning: 'do_test' defined but not used
>
> (also for -m64) There must be some sort of main in there, but no test,
> it does nothing and returns a valid status.

The main function is in sse2-check.h. As you can see in that file, the
test is only run if the CPU includes SSE2 support. That is fine for
gcc's purposes, but I can see that it is problematic for yours.

> When stuffed with debug statements:
>
>   for (i = 0; i < 2; i++)
>     masks[i] = i;
>
>   printf("DEBUG res[0] %llX\n",res[0]);
>   printf("DEBUG res[1] %llX\n",res[1]);
>   printf("DEBUG val1.ll[0] %llX\n",val1.ll[0]);
>   printf("DEBUG val1.ll[1] %llX\n",val1.ll[1]);
>   for (i = 0; i < 2; i++)
>     if (res[i] != val1.ll [masks[i]]){
>       printf("DEBUG i %d\n",i);
>       printf("DEBUG masks[i] %d\n",masks[i]);
>       printf("DEBUG val1.ll [masks[i]] %llX\n", val1.ll [masks[i]]);
>       abort ();
>     }
>
> and compiled with my software SSE2
>
> gcc -Wall -msse -mno-sse2 -I. -O0 -m32 -lm -DSOFT_SSE2 -DEMMSOFTDBG -o foo sse2-vec-2.c
>
> It emits:
>
> DEBUG res[0] 3020100
> DEBUG res[1] 7060504
> DEBUG val1.ll[0] 706050403020100
> DEBUG val1.ll[1] F0E0D0C0B0A0908
> DEBUG i 0
> DEBUG masks[i] 0
> DEBUG val1.ll [masks[i]] 706050403020100
> Aborted
>
> True enough 3020100 != 706050403020100, but what kind of test
> is that???

When I run the unmodified test on my system, which has SSE2 support in
hardware, I see that

  res[0] == 0x706050403020100
  res[1] == 0xf0e0d0c0b0a0908

So I think you may have misinterpreted the __builtin_ia32_vec_ext_v2di
builtin function. That function treats the vector as containing two
8-byte integers and returns one or the other depending on the second
argument. Your dumps of res[0] and res[1] suggest that you are instead
treating the vector as four 4-byte integers and pulling out elements 0
and 1 of those.

Ian