On Fri, Jul 31, 2015 at 6:37 PM, Roland Scheidegger <srol...@vmware.com> wrote:
> On 01.08.2015 at 03:02, Matt Turner wrote:
>> On Fri, Jul 31, 2015 at 5:50 PM, Roland Scheidegger <srol...@vmware.com>
>> wrote:
>>> On 01.08.2015 at 01:26, Matt Turner wrote:
>>>> gcc actually generates this for us now that we use -fno-math-errno
>>>> (which is weird, since lrintf()/lrint() don't set errno) but clang still
>>>> does not. Presumably helps MSVC as well.
>>>>
>>>> Reduced .text size by 8.5k with gcc before -fno-math-errno.
>>>>
>>>>    text    data     bss     dec     hex filename
>>>> 4935850  195136   26192 5157178  4eb13a i965_dri.so before
>>>> 4927225  195128   26192 5148545  4e8f81 i965_dri.so after
>>>> ---
>>>>  src/util/rounding.h | 13 +++++++++++++
>>>>  1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/src/util/rounding.h b/src/util/rounding.h
>>>> index 2d00760..e546c9f 100644
>>>> --- a/src/util/rounding.h
>>>> +++ b/src/util/rounding.h
>>>> @@ -26,6 +26,11 @@
>>>>
>>>>  #include <math.h>
>>>>
>>>> +#ifdef __x86_64__
>>>> +#include <xmmintrin.h>
>>>> +#include <emmintrin.h>
>>>> +#endif
>>>> +
>>>>  #ifdef __SSE4_1__
>>>>  #include <smmintrin.h>
>>>>  #endif
>>>> @@ -87,7 +92,11 @@ _mesa_roundeven(double x)
>>>>  static inline long
>>>>  _mesa_lroundevenf(float x)
>>>>  {
>>>> +#ifdef __x86_64__
>>>> +   return _mm_cvtss_si64(_mm_load_ss(&x));
>>> I think you really want _mm_cvtss_si32, not 64. Longs tend to be 32bit.
>>> _mm_cvtss_si64 would be the equivalent of llrintf.
>>
>> long is 64-bits on Linux/amd64. Looks like it's 32-bits on x32 and
>> Windows though.
> You are of course totally right.
> The actual assembly looks pretty much the same of course (cvtss2si
> %xmm0,%rax is the 64bit version...).
> Another solution would be to make this function return an int, as all
> callers (so far) expect ints anyway. (Those may get different results
> now for overflows (both negative and positive), as it's the lower 32bits
> now, whereas before this F_TO_I and IROUND actually produced the integer
> indefinite value at least with sse (0x80000000) - while undefined result
> certainly includes random numbers, it may make figuring out some bugs
> slightly harder due to the random numbers).
> But either way looks ok to me, can't say I like that datatype though, I
> like datatypes with known sizes and not those with surprising
> differences :-).

Yeah, I don't like "long" either. I haven't come up with a reason why
the float->int libc routines return it. It's (a) not always big
enough, and (b) the "long long" routines are often exactly the same.

I really just wanted to wrap lrintf and friends and then to match the
behavior with SSE intrinsics. I guess we could do that and always
return int as well...

>
>>
>> I guess I need to do
>>
>> #ifdef __x86_64__
>> #if LONG_BIT == 64
>>    return _mm_cvtss_si64(_mm_load_ss(&x));
>> #elif LONG_BIT == 32
>>    return _mm_cvtss_si32(_mm_load_ss(&x));
>> #endif
>> #endif
>>
>> I'll change it to that.
> Would x32 actually have __x86_64__ set?

Yes, much to the displeasure of many people.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
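
For reference, a minimal sketch of the width dispatch discussed in this thread. It is not the committed Mesa code: the name sketch_lroundevenf is hypothetical, and testing LONG_MAX against INT_MAX instead of LONG_BIT is an assumption made here because LONG_BIT is a POSIX/XSI macro rather than ISO C and may not be visible on every toolchain. Targets that do not define __x86_64__ fall back to lrintf().

```c
#include <limits.h>     /* LONG_MAX, INT_MAX */
#include <math.h>       /* lrintf */

#if defined(__x86_64__)
#include <xmmintrin.h>  /* _mm_load_ss, _mm_cvtss_si32, _mm_cvtss_si64 */
#endif

/* Hypothetical helper, for illustration only: convert a float to long
 * using the current rounding mode (round-to-nearest-even by default). */
static inline long
sketch_lroundevenf(float x)
{
#if defined(__x86_64__) && LONG_MAX > INT_MAX
   /* LP64, e.g. Linux/amd64: long is 64 bits, use the 64-bit convert. */
   return _mm_cvtss_si64(_mm_load_ss(&x));
#elif defined(__x86_64__)
   /* __x86_64__ is defined but long is 32 bits, e.g. x32. */
   return _mm_cvtss_si32(_mm_load_ss(&x));
#else
   /* Any other target: let libc do the conversion. */
   return lrintf(x);
#endif
}
```

With the default rounding mode, sketch_lroundevenf(2.5f) gives 2 and sketch_lroundevenf(3.5f) gives 4 on every path. Returning int everywhere, as suggested above, would reduce the x86 branches to the single _mm_cvtss_si32 case.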