On Thu, Dec 31, 2015 at 2:38 PM, Oded Gabbay <oded.gab...@gmail.com> wrote:
> On Thu, Dec 31, 2015 at 11:30 AM, Oded Gabbay <oded.gab...@gmail.com> wrote:
>> On Wed, Dec 30, 2015 at 5:41 PM, Roland Scheidegger <srol...@vmware.com>
>> wrote:
>>> On 30.12.2015 at 10:46, Oded Gabbay wrote:
>>>> On Wed, Dec 30, 2015 at 1:11 AM, Roland Scheidegger <srol...@vmware.com>
>>>> wrote:
>>>>>
>>>>> So, if I see that right, you will automatically generate binaries using
>>>>> power8 instructions if compiled on power8 capable box, which then won't
>>>>> run on boxes not supporting power8? Is that really what you want?
>>>>> Maybe some runtime detection would be a good idea (though I don't know
>>>>> if anyone cares about power7)?
>>>>
>>>> The problem is I don't think I can eliminate the build time check
>>>> (although I would very much like to) because I need:
>>>> 1. To pass a special flag to the GCC compiler: -mpower8-vector
>>>> 2. To define _ARCH_PWR8 so GCC will include the newer intrinsic
>>>>
>>>> Without those two things, I won't be able to use vec_vgbbd which I
>>>> need to implement the _mm_movemask_epi8 efficiently, and without that,
>>>> all this patch series can be thrown out the window. The emulation of
>>>> _mm_movemask_epi8 using regular instructions is just horrible.
>>>>
>>>> You are correct that once you build a binary with this flag on power8
>>>> machine, that binary won't run on power7 machine. You get "cannot
>>>> execute binary file"
>>>>
>>>> Unfortunately, I don't see a way around this because even if I
>>>> condition the use of vec_vgbbd on a runtime check/define, the library
>>>> still won't be executable because it was built with -mpower8-vector.
>>>>
>>>> Having said that, because I *assume* IBM right now mostly cares about
>>>> Linux running on POWER8 with little-endian, I think it is a fair
>>>> compromise.
>>>
>>> Note I don't have anything against a build time check.
>>> My concern here is something along the lines of unsuspecting distros
>>> shipping binaries which won't work, as it looks to me like this will get
>>> picked up automatically. That is different to how for instance sse41 is
>>> handled. That is, I believe this should only get enabled if someone has
>>> specified some -mcpu=power8 or whatever flag explicitly somewhere already.
>>>
>>> Roland
>>
>> I understand and I share your concern. Maybe we should add
>> "--disable-pwr8-inst" to mesa's configure? If that flag is given to
>> configure, it would disable the optimization code (won't add
>> _ARCH_PWR8 to defines and won't add -mpower8-vector to gcc flags).
>>
>> What do you think?
>>
>> Oded
>
> Actually, I made a mistake in checking this issue. I forgot my power8
> machine is LE and power7 is BE - that's why the binary couldn't be
> executed.
>
> I need to install a power8 BE and re-check this. If the binary can be
> executed, then I will add runtime checks.
>
> Oded
>

So indeed the problem was LE vs. BE. Once I built it on power8 BE, I
could run it on power7 (which is always BE). Of course, it crashed with
an illegal instruction, but I can fix that using runtime detection. I will
send a new patch series with all the fixes.
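
For illustration only, here is a rough sketch of what such runtime detection
could look like on Linux. This is not the code from the patch series. The
names cpu_has_power8() and movemask_epi8_scalar() are made up for the
example, and it assumes glibc's getauxval() is available and that the
kernel reports ISA 2.07 (POWER8) through the PPC_FEATURE2_ARCH_2_07 bit of
AT_HWCAP2. On anything that is not Linux/PowerPC it simply reports "no
POWER8", so the generic path gets used.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__linux__) && (defined(__powerpc__) || defined(__powerpc64__))
#include <sys/auxv.h>   /* getauxval() */
#ifndef AT_HWCAP2
#define AT_HWCAP2 26    /* fallback for old headers (assumption) */
#endif
#ifndef PPC_FEATURE2_ARCH_2_07
#define PPC_FEATURE2_ARCH_2_07 0x80000000   /* ISA 2.07, i.e. POWER8 */
#endif
#define SKETCH_CAN_QUERY_HWCAP 1
#endif

/* Hypothetical helper: true only if the CPU we are running on
 * advertises ISA 2.07 (POWER8). Always false on other platforms. */
bool
cpu_has_power8(void)
{
#ifdef SKETCH_CAN_QUERY_HWCAP
    return (getauxval(AT_HWCAP2) & PPC_FEATURE2_ARCH_2_07) != 0;
#else
    return false;
#endif
}

/* Hypothetical generic fallback with the semantics of SSE2's
 * _mm_movemask_epi8: bit i of the result is the top bit of byte i. */
unsigned
movemask_epi8_scalar(const uint8_t bytes[16])
{
    unsigned mask = 0;
    for (int i = 0; i < 16; i++)
        mask |= (unsigned)(bytes[i] >> 7) << i;
    return mask;
}
```

The vec_vgbbd version would still have to be compiled with
-mpower8-vector, probably in its own source file, and only be called when
cpu_has_power8() returns true. Everything else would take the generic path.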
Oded

>>
>>>
>>>>
>>>> Oded
>>>>
>>>>> So far we didn't bother with that for SSE
>>>>> but it has to be said SSE2 is a really low bar (and the manual assembly
>>>>> stuff doesn't use anything more advanced, even though clearly things
>>>>> like the emulated mm_mullo_epi32 are suboptimal if your cpu supports
>>>>> sse41). And even then on non-x86 you actually might not get
>>>>> PIPE_ARCH_SSE if you didn't set gcc's compile flags accordingly.
>>>>>
>>>>> Roland
>>>>>
>>>>>
>>>>> On 29.12.2015 at 17:12, Oded Gabbay wrote:
>>>>>> To determine if we could use special POWER8 assembly directives, we first
>>>>>> need to detect whether we are running on POWER8 architecture. This patch
>>>>>> adds this detection to configure.ac and adds the necessary compilation
>>>>>> flags accordingly.
>>>>>>
>>>>>> Signed-off-by: Oded Gabbay <oded.gab...@gmail.com>
>>>>>> ---
>>>>>>  configure.ac | 30 ++++++++++++++++++++++++++++++
>>>>>>  1 file changed, 30 insertions(+)
>>>>>>
>>>>>> diff --git a/configure.ac b/configure.ac
>>>>>> index f8a70be..1acd47e 100644
>>>>>> --- a/configure.ac
>>>>>> +++ b/configure.ac
>>>>>> @@ -396,6 +396,36 @@ fi
>>>>>>  AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
>>>>>>  AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
>>>>>>
>>>>>> +dnl Check for POWER8 Architecture
>>>>>> +PWR8_CFLAGS="-mpower8-vector"
>>>>>> +have_pwr8_intrinsics=no
>>>>>> +AC_MSG_CHECKING(whether we are running on POWER8 Architecture)
>>>>>> +save_CFLAGS=$CFLAGS
>>>>>> +CFLAGS="$PWR8_CFLAGS $CFLAGS"
>>>>>> +AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>>>>>> +#if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8))
>>>>>> +#error "Need GCC >= 4.8 for sane POWER8 support"
>>>>>> +#endif
>>>>>> +#include <altivec.h>
>>>>>> +int main () {
>>>>>> +    vector unsigned char r;
>>>>>> +    vector unsigned int v = vec_splat_u32 (1);
>>>>>> +    r = __builtin_vec_vgbbd ((vector unsigned char) v);
>>>>>> +    return 0;
>>>>>> +}]])], have_pwr8_intrinsics=yes)
>>>>>> +CFLAGS=$save_CFLAGS
>>>>>> +
>>>>>> +if test $have_pwr8_intrinsics = yes ; then
>>>>>> +    DEFINES="$DEFINES -D_ARCH_PWR8"
>>>>>> +    CXXFLAGS="$CXXFLAGS $PWR8_CFLAGS"
>>>>>> +    CFLAGS="$CFLAGS $PWR8_CFLAGS"
>>>>>> +else
>>>>>> +    PWR8_CFLAGS=
>>>>>> +fi
>>>>>> +
>>>>>> +AC_MSG_RESULT($have_pwr8_intrinsics)
>>>>>> +AC_SUBST([PWR8_CFLAGS], $PWR8_CFLAGS)
>>>>>> +
>>>>>>  dnl Can't have static and shared libraries, default to static if user
>>>>>>  dnl explicitly requested. If both disabled, set to static since shared
>>>>>>  dnl was explicitly requested.
>>>>>>
>>>>>
>>>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev