On Thu, 2014-10-23 at 17:08 -0400, Ilia Mirkin wrote:
> On Thu, Oct 23, 2014 at 4:56 PM, Timothy Arceri <t_arc...@yahoo.com.au> wrote:
> > On Thu, 2014-10-23 at 09:20 -0600, Brian Paul wrote:
> >>
> >> Can something similar be done for 16-bit values?
> >>
> >
> > Yes there are _mm_max_epu16 and _mm_min_epu16 intrinsics too.
>
> And those only need SSE2 iirc. There are also _mm256_* intrinsics for
> avx2... so many options. The OpenMP thing starts looking more and more
> attractive... Or something based on the gcc-4.x alternatives thing.
>
> -ilia

Yeah, the only issue with OpenMP seems to be that it selects the
platform at build time rather than allowing some type of runtime
selection. So distro builds would only get SSE2 on 64-bit builds of
Mesa.

I tested OpenMP on my desktop (old AMD with SSE2), as I don't have gcc
4.9 on my laptop, and there was an improvement, but nowhere near what
SSE4.1 gives.

Openarena: 4.15% -> 2.99%

I'm sure avx2 would give a nice bump too, but I can't test that myself :(

If anyone is interested in playing with this, you can use pts to
install openarena and then go to the install directory:

~/.phoronix-test-suite/installed-tests/pts/openarena-1.5.2/openarena-0.8.8

You can run the benchmark in callgrind using a command like this:

valgrind --tool=callgrind ./openarena.x86_64 +exec pts-openarena-088 +set r_mode -1 +set r_fullscreen 1 +set com_speeds 1 +set r_customWidth 800 +set r_customHeight 600
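
On the runtime selection point above, here is a rough sketch of what a
runtime-dispatched min/max scan over 32-bit indices might look like. It
assumes GCC 4.9+ (or a recent clang) on x86, uses the target function
attribute so the file does not need to be built with -msse4.1, and
checks the CPU with __builtin_cpu_supports(). The minmax_u32* names are
made up for illustration; this is not the Mesa code or the patch under
discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <smmintrin.h> /* SSE4.1: _mm_min_epu32 / _mm_max_epu32 */
#define HAVE_SSE41_PATH 1
#endif

/* Portable fallback: plain scalar scan (hypothetical helper name). */
static void minmax_u32_scalar(const uint32_t *idx, size_t count,
                              uint32_t *min_out, uint32_t *max_out)
{
    uint32_t lo = UINT32_MAX, hi = 0;
    for (size_t i = 0; i < count; i++) {
        if (idx[i] < lo) lo = idx[i];
        if (idx[i] > hi) hi = idx[i];
    }
    *min_out = lo;
    *max_out = hi;
}

#ifdef HAVE_SSE41_PATH
/* SSE4.1 path: only this function is compiled with SSE4.1 enabled. */
static __attribute__((target("sse4.1")))
void minmax_u32_sse41(const uint32_t *idx, size_t count,
                      uint32_t *min_out, uint32_t *max_out)
{
    __m128i vlo = _mm_set1_epi32(-1);   /* every lane = UINT32_MAX */
    __m128i vhi = _mm_setzero_si128();  /* every lane = 0 */
    size_t i = 0;

    /* Four indices per iteration, unaligned loads. */
    for (; i + 4 <= count; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(idx + i));
        vlo = _mm_min_epu32(vlo, v);
        vhi = _mm_max_epu32(vhi, v);
    }

    /* Reduce the four lanes, then handle the leftover tail. */
    uint32_t lanes_lo[4], lanes_hi[4];
    _mm_storeu_si128((__m128i *)lanes_lo, vlo);
    _mm_storeu_si128((__m128i *)lanes_hi, vhi);

    uint32_t lo = UINT32_MAX, hi = 0;
    for (int k = 0; k < 4; k++) {
        if (lanes_lo[k] < lo) lo = lanes_lo[k];
        if (lanes_hi[k] > hi) hi = lanes_hi[k];
    }
    for (; i < count; i++) {
        if (idx[i] < lo) lo = idx[i];
        if (idx[i] > hi) hi = idx[i];
    }
    *min_out = lo;
    *max_out = hi;
}
#endif

/* Picks the implementation at run time rather than at build time. */
void minmax_u32(const uint32_t *idx, size_t count,
                uint32_t *min_out, uint32_t *max_out)
{
    assert(count > 0); /* min/max of an empty range is not meaningful */
#ifdef HAVE_SSE41_PATH
    if (__builtin_cpu_supports("sse4.1")) {
        minmax_u32_sse41(idx, count, min_out, max_out);
        return;
    }
#endif
    minmax_u32_scalar(idx, count, min_out, max_out);
}
```

A caller would just use minmax_u32(indices, count, &min, &max); a
single distro binary then takes the SSE4.1 path where the CPU has it
and falls back to the scalar loop elsewhere. A real implementation
would probably cache the CPU check instead of repeating it per call.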