On Fri, 2014-10-24 at 09:11 -0700, Matt Turner wrote:
> On Fri, Oct 24, 2014 at 5:47 AM, Timothy Arceri <t_arc...@yahoo.com.au> wrote:
> > Makes use of SSE to speed up compute of min and max elements
> >
> > Callgrind cpu usage results from pts benchmarks:
> >
> > Openarena 0.8.8: 3.67% -> 1.03%
> > UrbanTerror: 2.36% -> 0.81%
> >
> > Signed-off-by: Timothy Arceri <t_arc...@yahoo.com.au>
> > ---
> >  src/mesa/Android.libmesa_dricore.mk |  3 +-
> >  src/mesa/Makefile.am                |  3 +-
> >  src/mesa/Makefile.sources           |  1 +
> >  src/mesa/main/sse_minmax.c          | 81 +++++++++++++++++++++++++++++++++++++
> >  src/mesa/main/sse_minmax.h          | 30 ++++++++++++++
> >  src/mesa/vbo/vbo_exec_array.c       | 13 ++++--
> >  6 files changed, 126 insertions(+), 5 deletions(-)
> >  create mode 100644 src/mesa/main/sse_minmax.c
> >  create mode 100644 src/mesa/main/sse_minmax.h
> >
> > This version includes all the suggestions from Brian and Matt, thanks for
> > the review guys.
> >
> > I haven't been able to do Matt's suggestion and compare this to what OpenMP
> > would generate as I only have one machine that supports SSE4.1 with Fedora 20 and
> > I dont want to have to upgrade to Fedora 21 alpha (gcc 4.9) just to test this
> > (although I did consider it). If people are happy with this code I will revisit
> > OpenMP for Mesa 10.5 and will look at using OpenMP for the short and byte
> > support too.
> >
> > diff --git a/src/mesa/Android.libmesa_dricore.mk b/src/mesa/Android.libmesa_dricore.mk
> > index 1e6d948..52d626f 100644
> > --- a/src/mesa/Android.libmesa_dricore.mk
> > +++ b/src/mesa/Android.libmesa_dricore.mk
> > @@ -51,7 +51,8 @@ endif # MESA_ENABLE_ASM
> >
> >  ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
> >  LOCAL_SRC_FILES += \
> > -    $(SRCDIR)main/streaming-load-memcpy.c
> > +    $(SRCDIR)main/streaming-load-memcpy.c \
> > +    $(SRCDIR)main/sse_minmax.c
> >  LOCAL_CFLAGS := -msse4.1
> >  endif
> >
> > diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
> > index e71bccb..932db4f 100644
> > --- a/src/mesa/Makefile.am
> > +++ b/src/mesa/Makefile.am
> > @@ -151,7 +151,8 @@ libmesagallium_la_LIBADD = \
> >      $(ARCH_LIBS)
> >
> >  libmesa_sse41_la_SOURCES = \
> > -    main/streaming-load-memcpy.c
> > +    main/streaming-load-memcpy.c \
> > +    main/sse_minmax.c
> >  libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
> >
> >  pkgconfigdir = $(libdir)/pkgconfig
> > diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> > index 4755018..dd10574 100644
> > --- a/src/mesa/Makefile.sources
> > +++ b/src/mesa/Makefile.sources
> > @@ -93,6 +93,7 @@ MAIN_FILES = \
> >      $(SRCDIR)main/shaderobj.c \
> >      $(SRCDIR)main/shader_query.cpp \
> >      $(SRCDIR)main/shared.c \
> > +    $(SRCDIR)main/sse_minmax.c \
>
> We can't add this here. You've already added it to libmesa_sse41.la above.

I added this without thinking about it too much after Brian said it was
probably needed for SCons. Obviously we can't have both, so I'll remove it
from here. I don't know enough about SCons to know what it will require or
how to fix it.
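
For anyone reading along without the patch open, here is a rough sketch of the kind of SSE4.1 min/max loop the commit message describes. It is not the code from sse_minmax.c (that file's contents aren't quoted in this thread). The helper name `sketch_minmax_u32` is hypothetical, and the sketch assumes 32-bit unsigned elements and a GCC or Clang compiler. The `target("sse4.1")` attribute stands in for the `-msse4.1` flag that the separate libmesa_sse41 library gets in the build, which is also why the file has to be listed there rather than in MAIN_FILES.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <smmintrin.h> /* SSE4.1: _mm_min_epu32 / _mm_max_epu32 */

/* Hypothetical helper, for illustration only; not the function from the patch.
 * For count == 0 it returns min = UINT32_MAX and max = 0. */
__attribute__((target("sse4.1")))
static void
sketch_minmax_u32(const uint32_t *v, size_t count,
                  uint32_t *min_out, uint32_t *max_out)
{
   uint32_t lo = UINT32_MAX, hi = 0;
   size_t i = 0;

   assert(min_out != NULL && max_out != NULL);

   if (count >= 4) {
      __m128i vmin = _mm_set1_epi32(-1);   /* all bits set == UINT32_MAX per lane */
      __m128i vmax = _mm_setzero_si128();
      uint32_t tmp_min[4], tmp_max[4];

      /* Process four elements per iteration; unaligned loads keep it simple. */
      for (; i + 4 <= count; i += 4) {
         __m128i x = _mm_loadu_si128((const __m128i *)(v + i));
         vmin = _mm_min_epu32(vmin, x);
         vmax = _mm_max_epu32(vmax, x);
      }

      /* Reduce the four lanes to scalars. */
      _mm_storeu_si128((__m128i *)tmp_min, vmin);
      _mm_storeu_si128((__m128i *)tmp_max, vmax);
      for (int k = 0; k < 4; k++) {
         if (tmp_min[k] < lo) lo = tmp_min[k];
         if (tmp_max[k] > hi) hi = tmp_max[k];
      }
   }

   /* Scalar tail for the remaining 0..3 elements. */
   for (; i < count; i++) {
      if (v[i] < lo) lo = v[i];
      if (v[i] > hi) hi = v[i];
   }

   *min_out = lo;
   *max_out = hi;
}
```

A caller would still need a runtime CPU check before taking this path and a plain C fallback otherwise; both are left out of the sketch.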