On Sun, Jan 3, 2016 at 8:14 PM, Roland Scheidegger <srol...@vmware.com> wrote: > Am 03.01.2016 um 13:50 schrieb Oded Gabbay: >> On Thu, Dec 31, 2015 at 8:48 PM, Roland Scheidegger <srol...@vmware.com> >> wrote: >>> Am 31.12.2015 um 10:30 schrieb Oded Gabbay: >>>> On Wed, Dec 30, 2015 at 5:41 PM, Roland Scheidegger <srol...@vmware.com> >>>> wrote: >>>>> Am 30.12.2015 um 10:46 schrieb Oded Gabbay: >>>>>> On Wed, Dec 30, 2015 at 1:11 AM, Roland Scheidegger <srol...@vmware.com> >>>>>> wrote: >>>>>>> >>>>>>> So, if I see that right, you will automatically generate binaries using >>>>>>> power8 instructions if compiled on power8 capable box, which then won't >>>>>>> run on boxes not supporting power8? Is that really what you want? >>>>>>> Maybe some runtime detection would be a good idea (though I don't know >>>>>>> if anyone cares about power7)? >>>>>> >>>>>> The problem is I don't think I can eliminate the build time check >>>>>> (although I would very much like to) because I need: >>>>>> 1. To pass a special flag to the GCC compiler: -mpower8-vector >>>>>> 2. To define _ARCH_PWR8 so GCC will include the newer intrinsic >>>>>> >>>>>> Without those two things, I won't be able to use vec_vgbbd which I >>>>>> need to implement the _mm_movemask_epi8 efficiently, and without that, >>>>>> all this patch series can be thrown out the window. The emulation of >>>>>> _mm_movemask_epi8 using regular instructions is just horrible. >>>>>> >>>>>> You are correct that once you build a binary with this flag on power8 >>>>>> machine, that binary won't run on power7 machine. You get "cannot >>>>>> execute binary file" >>>>>> >>>>>> Unfortunately, I don't see a way around this because even if I >>>>>> condition the use of vec_vgbbd on a runtime check/define, the library >>>>>> still won't be executable because it was built with -mpower8-vector. >>>>>> >>>>>> Having said that, because I *assume* IBM right now mostly cares about >>>>>> Linux running on POWER8 with little-endian, I think it is a fair >>>>>> compromise. >>>>> >>>>> Note I don't have anything against a build time check. My concern here >>>>> is something along the lines of unsuspecting distros shipping binaries >>>>> which won't work, as it looks to me like this will get picked up >>>>> automatically. That is different to how for instance sse41 is handled. >>>>> That is I believe this should only get enabled if someone has specified >>>>> some -mcpu=power8 or whatever flag explicitly somewhere already. >>>>> >>>>> Roland >>>> >>>> I understand and I share your concern. Maybe we should add >>>> "--disable-pwr8-inst" to mesa's configure ? if that flag is given to >>>> configure, it would disable the optimization code (won't add >>>> _ARCH_PWR8 to defines and won't add -mpower8-vector to gcc flags). >>>> >>>> What do you think ? >>> If the generated code with all automatically picked up compile options >>> really doesn't run on power7 just because of this, I think it would be >>> nicer if this were an explicit enable. >>> >>> Roland >>> >> >> So the problem is a bit worse then that and requires a harsher solution. >> Apparently, when that compiler flag (power8-vector) is given to GCC, >> GCC uses POWER8-only instructions in other places as well! What I have >> seen so far, is that it uses such instructions in the implementation >> of exp2() and/or log2() (in f.cpp) and also saw it in >> __glXInitializeVisualConfigFromTags(). The instructions used are not >> vector instructions, but floating point instructions, which were added >> only in PowerISA 2.07 >> >> Therefore, I think that for now, I will limit the entire optimization >> code to POWER8 *and* Little-Endian. Because ppc64le packages can >> *only* run on POWER8 systems, and because you can't transfer binaries >> between LE and BE machines, this workaround eliminates the danger of >> crashing on "illegal instruction". In addition, there is no more need >> for runtime checks. >> >> I hope you agree that with this change, it is better to enable the >> power8-vector by default when building on POWER8 machine installed >> with Linux LE. For all other archs it will be disabled by default. > > Yes, that looks reasonable.
Thanks. > >> I will try to contact IBM GCC devs to see how we can overcome this >> problem (or if they even care) so I can expand these optimizations to >> BE as well. > IIRC this is problematic for things like sse41 etc. as well, it is just > how gcc works. I think the typical workaround is to move code using > intrinsics which need special compile flags to their own file (or rather > compile unit), which you then can compile with those flags. (This > implies of course separate functions, that is you can't have any > run-time check plus the assembly in the same file or function.) > This is how mesa handles _mesa_streaming_load_memcpy for the i965 driver. > > Roland > Yeah, I imagined as much, but I didn't know this technique was already in use in mesa. I hate this fragmentation of code, and I think a better solution, at the compiler level, is to have some kind of flag which tells the compiler he can accept power8/sse41 inline assembly and/or intrinsics but can't generate those on his own (I'm not a compiler expert but I think that's doable). As I said, I'll try to dig around and see what can be done. Oded > >> >> I will send the revised patches shortly. >> >> Oded >> >>> >>> >>> >>>> >>>> Oded >>>> >>>>> >>>>>> >>>>>> Oded >>>>>> >>>>>>> So far we didn't bother with that for SSE >>>>>>> but it has to be said SSE2 is a really low bar (and the manual assembly >>>>>>> stuff doesn't use anything more advanced, even though clearly things >>>>>>> like the emulated mm_mullo_epi32 are suboptimal if your cpu supports >>>>>>> sse41). And even then on non-x86 you actually might not get >>>>>>> PIPE_ARCH_SSE if you didn't set gcc's compile flags accordingly. >>>>>>> >>>>>>> Roland >>>>>>> >>>>>>> >>>>>>> Am 29.12.2015 um 17:12 schrieb Oded Gabbay: >>>>>>>> To determine if we could use special POWER8 assembly directives, we >>>>>>>> first >>>>>>>> need to detect whether we are running on POWER8 architecture. This >>>>>>>> patch >>>>>>>> adds this detection to configure.ac and adds the necessary compilation >>>>>>>> flags accordingly. >>>>>>>> >>>>>>>> Signed-off-by: Oded Gabbay <oded.gab...@gmail.com> >>>>>>>> --- >>>>>>>> configure.ac | 30 ++++++++++++++++++++++++++++++ >>>>>>>> 1 file changed, 30 insertions(+) >>>>>>>> >>>>>>>> diff --git a/configure.ac b/configure.ac >>>>>>>> index f8a70be..1acd47e 100644 >>>>>>>> --- a/configure.ac >>>>>>>> +++ b/configure.ac >>>>>>>> @@ -396,6 +396,36 @@ fi >>>>>>>> AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1]) >>>>>>>> AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS) >>>>>>>> >>>>>>>> +dnl Check for POWER8 Architecture >>>>>>>> +PWR8_CFLAGS="-mpower8-vector" >>>>>>>> +have_pwr8_intrinsics=no >>>>>>>> +AC_MSG_CHECKING(whether we are running on POWER8 Architecture) >>>>>>>> +save_CFLAGS=$CFLAGS >>>>>>>> +CFLAGS="$PWR8_CFLAGS $CFLAGS" >>>>>>>> +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[ >>>>>>>> +#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && >>>>>>>> __GNUC_MINOR__ < 8)) >>>>>>>> +#error "Need GCC >= 4.8 for sane POWER8 support" >>>>>>>> +#endif >>>>>>>> +#include <altivec.h> >>>>>>>> +int main () { >>>>>>>> + vector unsigned char r; >>>>>>>> + vector unsigned int v = vec_splat_u32 (1); >>>>>>>> + r = __builtin_vec_vgbbd ((vector unsigned char) v); >>>>>>>> + return 0; >>>>>>>> +}]])], have_pwr8_intrinsics=yes) >>>>>>>> +CFLAGS=$save_CFLAGS >>>>>>>> + >>>>>>>> +if test $have_pwr8_intrinsics = yes ; then >>>>>>>> + DEFINES="$DEFINES -D_ARCH_PWR8" >>>>>>>> + CXXFLAGS="$CXXFLAGS $PWR8_CFLAGS" >>>>>>>> + CFLAGS="$CFLAGS $PWR8_CFLAGS" >>>>>>>> +else >>>>>>>> + PWR8_CFLAGS= >>>>>>>> +fi >>>>>>>> + >>>>>>>> +AC_MSG_RESULT($have_pwr8_intrinsics) >>>>>>>> +AC_SUBST([PWR8_CFLAGS], $PWR8_CFLAGS) >>>>>>>> + >>>>>>>> dnl Can't have static and shared libraries, default to static if user >>>>>>>> dnl explicitly requested. If both disabled, set to static since shared >>>>>>>> dnl was explicitly requested. >>>>>>>> >>>>>>> >>>>> >>> > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev