Am 28.07.2014 20:38, schrieb Jason Ekstrand: > > > > On Mon, Jul 28, 2014 at 8:34 AM, Roland Scheidegger <srol...@vmware.com > <mailto:srol...@vmware.com>> wrote: > > Looking all good, though it got me thinking about these numerous > float-to-int rounding functions in mesa in various places a bit more. > There is, for instance, a _mesa_round_to_even() function, which claims > that the c99 lrintf() and friends functions can't be used because the > environment might use a different rounding mode. This is of course true > though I wonder why it would matter there and not here for instance, > since of course cvtss2si is affected by that as well. > > > Yes, it does matter for cvtss2si. I briefly considered replacing F_TO_I > with lrint and just trusting in the standard library to do it. However, > that turns out to have really bad performance (on my system) if you > don't enable -ffast-math (which we don't by default). The reason why > using cvtss2si is ok is that we only use F_TO_I for things where the GL > spec allows us to be sloppy on the rounding. In particular, we use it > for texture format conversion and for things like swrast where we care > more about fast than perfect. > > > Also, I'm wondering about the IROUND fallback. Richard Sandiford > discovered a problem in llvmpipe (using lp_build_iround()) which was > using the same method (that is, (int)(val + 0.5) for unsigned numbers, > similar but more complex for signed numbers). This works ok for nearly > all numbers (though it is definitely not round to nearest even) except > numbers between [-]2^23 and [-]2^24-1, in which case it will always > return the next higher even number for odd numbers. So > IROUND((float)val) != val even for numbers which can be represented as > floats exactly. I'd guess though mesa probably doesn't use IROUND for > cases where this would matter (most likely some conversion of z24 > numbers), worst case there would probably not just be the inaccuracy but > if you'd have clamped to z24 max (as float, 2^24-1) then done the IROUND > you'd get back 2^24. And FWIW some gallium util math function seem to > have this problem as well, though again I don't know if it would matter > (the gallium util code will only use them if c99 isn't available). > > But anyway I guess that's slightly off-topic... > > > That's interesting. I'll give that some thought, but at that point > you're right at the boundary of floating-point precision and things get > interesting. I'll think about it and if I come up with a better way to > do it, I'll send a patch. Yes it's right at the boundary but does not exceed it (there's a reason z24 unorm is very common and z32 unorm is paper spec only :-)). FWIW the easiest solution for the gallivm code was to use largest-float-smaller-than-0.5 for the add in iround though I'm not 100% sure yet of the consequences this could have (at first sight, this looks quite ok, the previous rounding when using +0.5f gives round up semantics for positive tied values (0.5, 1.5, 2.5 would become 1, 2, 3 whereas usually you'd possibly expect round-nearest-even hence 0,2,2) and round down semantics for negative tied values, whereas with this you get round-nearest-trunc). Oh and since you're working on it, it would be also nice if the rounding macro / function stuff could be shared between gallium/util and mesa.
Roland > > > > Roland > > > Am 23.07.2014 05:15, schrieb Jason Ekstrand: > > According to a quick micro-benchmark, this new version is 20% > faster on my > > Haswell laptop. > > > > v2: Removed the XXX note about x86_64 from the comment > > > > Signed-off-by: Jason Ekstrand <jason.ekstr...@intel.com > <mailto:jason.ekstr...@intel.com>> > > --- > > src/mesa/main/imports.h | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h > > index af780b2..c8ae7f2 100644 > > --- a/src/mesa/main/imports.h > > +++ b/src/mesa/main/imports.h > > @@ -277,7 +277,6 @@ static inline int IROUND_POS(float f) > > > > /** > > * Convert float to int using a fast method. The rounding mode > may vary. > > - * XXX We could use an x86-64/SSE2 version here. > > */ > > static inline int F_TO_I(float f) > > { > > @@ -285,6 +284,10 @@ static inline int F_TO_I(float f) > > int r; > > __asm__ ("fistpl %0" : "=m" (r) : "t" (f) : "st"); > > return r; > > +#elif defined(USE_X86_64_ASM) && defined(__GNUC__) > > + int r; > > + __asm__ ("cvtss2si %1, %0" : "=r" (r) : "xm" (f)); > > + return r; > > #elif defined(USE_X86_ASM) && defined(_MSC_VER) > > int r; > > _asm { > > > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev