Tested on a Core2 Q9550, using -march=native with and without the -mno-sse4.1 flag. It works perfectly :)

Also, David Heidelberg kindly tested the patch that permanently enables the optimized code paths when the target machine supports them, and it worked fine:
http://patchwork.freedesktop.org/patch/36488/

One small improvement to your patch: I think including <smmintrin.h>, or the all-in-one alternative <immintrin.h>, should be enough.
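
To illustrate the single-header point, here is a rough sketch (not the configure test itself). It assumes that <smmintrin.h> pulls in the older SSE/SSE2 headers on its own, which is what GCC and Clang do as far as I know. The helper name max_u32x4 is made up for this example, and the scalar branch is only there so the file still builds when __SSE4_1__ is not defined (old compiler or -mno-sse4.1):

```c
#include <stdint.h>

#ifdef __SSE4_1__
/* Assumption: this one header also provides the SSE2 load/store
 * intrinsics used below, so no <emmintrin.h> etc. is listed. */
#include <smmintrin.h>
#endif

/* Hypothetical helper: lane-wise unsigned max of two 4-element arrays. */
static void max_u32x4(const uint32_t a[4], const uint32_t b[4],
                      uint32_t out[4])
{
#ifdef __SSE4_1__
    /* SSE4.1 path: same intrinsic the configure test exercises. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_max_epu32(va, vb));
#else
    /* Fallback path: keeps the symbol defined without SSE4.1. */
    for (int i = 0; i < 4; i++)
        out[i] = a[i] > b[i] ? a[i] : b[i];
#endif
}
```

Building this once with -msse4.1 and once with -mno-sse4.1 should exercise both branches while the function stays defined either way.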

Best regards,
Siavash Eliasi.

On 11/15/2014 08:34 PM, Emil Velikov wrote:
So when checking/building sse code we have three possibilities:
  1 Old compiler, throws an error when using -msse*
  2 New compiler, user disables sse* (-mno-sse*)
  3 New compiler, user doesn't disable sse

The original code handled #1 but not #2. Later on we patched
around the lack of handling for #2 by wrapping the code in __SSE4_1__.
Yet that led to a missing/undefined symbol in case of #1 or #2, which
might cause an issue for #2 when using the i965 driver.

A bit later we "fixed" the undefined symbol by using #1, rather than
updating it to handle #2. With this commit we set things straight :)

To top it all off, conventions state that in case of conflicting
(-enable-foo -disable-foo) options, the latter one takes precedence.
Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test.

Cc: Siavash Eliasi <siavashser...@gmail.com>
Cc: Matt Turner <matts...@gmail.com>
Signed-off-by: Emil Velikov <emil.l.veli...@gmail.com>
---

Man this thing is _very_ messy.
Matt, from the last hunk it seems that pixman might need fixing. Should
we bother with that, or let people have fun when they hit it :P

-Emil

  configure.ac | 14 +++++++++++++-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 91e111b..9d1835e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -252,7 +252,19 @@ AC_SUBST([VISIBILITY_CXXFLAGS])
  dnl
  dnl Optional flags, check for compiler support
  dnl
-AX_CHECK_COMPILE_FLAG([-msse4.1], [SSE41_SUPPORTED=1], [SSE41_SUPPORTED=0])
+save_CFLAGS="$CFLAGS"
+CFLAGS="-msse4.1 $CFLAGS"
+AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
+#include <mmintrin.h>
+#include <xmmintrin.h>
+#include <emmintrin.h>
+#include <smmintrin.h>
+int main () {
+    __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
+    c = _mm_max_epu32(a, b);
+    return 0;
+}]])], SSE41_SUPPORTED=1)
+CFLAGS="$save_CFLAGS"
  if test "x$SSE41_SUPPORTED" = x1; then
      DEFINES="$DEFINES -DUSE_SSE41"
  fi

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
