Looking through the kernel radeon drm source, it looks like the i2f() functions in r600_blit.c and r600_blit_kms.c can be optimized a bit.
The following extends the range to all unsigned 32-bit integers, and avoids the slow loop by using the bsr instruction via __fls(). It provides an exact one-to-one correspondence up to 2^24. Above that, there is the inevitable rounding. This routine rounds towards zero (truncation).

/* 23 bits of float fractional data */
#define I2F_FRAC_BITS	23
#define I2F_MASK	((1 << I2F_FRAC_BITS) - 1)

/*
 * Converts an unsigned integer into 32-bit IEEE floating point representation.
 * Will be exact from 0 to 2^24.  Above that, we round towards zero
 * as the fractional bits will not fit in a float.  (It would be better to
 * round towards even as the fpu does, but that is slower.)
 * This routine depends on the mod(32) behaviour of the rotate instructions
 * on x86.
 */
uint32_t i2f(uint32_t x)
{
	uint32_t msb, exponent, fraction;

	/* Zero is special */
	if (!x)
		return 0;

	/* Get location of the most significant bit */
	msb = __fls(x);

	/*
	 * Use a rotate instead of a shift because that works both leftwards
	 * and rightwards due to the mod(32) behaviour.  This means we don't
	 * need to check to see if we are above 2^24 or not.
	 */
	fraction = ror32(x, msb - I2F_FRAC_BITS) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}

Steven Fuerst
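
For anyone who wants to check the routine outside the kernel, here is a minimal user-space sketch of the same idea. It is not the proposed patch: fls_index(), ror32_mod(), i2f_sketch() and float_bits() are hypothetical names made up for this example, the loop in fls_index() only stands in for __fls(), and the rotate count is masked with 31 explicitly, which is an assumption added here so the sketch does not rely on the x86 rotate behaviour.

```c
#include <stdint.h>
#include <string.h>

/* 23 bits of float fractional data */
#define I2F_FRAC_BITS	23
#define I2F_MASK	((1u << I2F_FRAC_BITS) - 1)

/*
 * Hypothetical stand-in for the kernel's __fls(): index of the most
 * significant set bit.  x must be non-zero.
 */
static unsigned int fls_index(uint32_t x)
{
	unsigned int msb = 0;

	while (x >>= 1)
		msb++;
	return msb;
}

/*
 * Hypothetical stand-in for ror32().  The count is reduced mod 32 here
 * (an assumption of this sketch) so that a "negative" count, which wraps
 * around as an unsigned value, becomes a left rotate on any architecture.
 */
static uint32_t ror32_mod(uint32_t word, unsigned int shift)
{
	shift &= 31;
	return (word >> shift) | (word << ((32 - shift) & 31));
}

/* Same steps as i2f() above, built on the two stand-ins. */
static uint32_t i2f_sketch(uint32_t x)
{
	uint32_t msb, exponent, fraction;

	/* Zero is special */
	if (!x)
		return 0;

	msb = fls_index(x);

	/* Move the bits below the msb into the 23-bit fraction field */
	fraction = ror32_mod(x, msb - I2F_FRAC_BITS) & I2F_MASK;
	exponent = (127 + msb) << I2F_FRAC_BITS;

	return fraction + exponent;
}

/* Reference: bit pattern of the compiler's own integer-to-float conversion. */
static uint32_t float_bits(uint32_t x)
{
	float f = (float)x;
	uint32_t bits;

	memcpy(&bits, &f, sizeof(bits));
	return bits;
}
```

Up to 2^24 i2f_sketch(x) and float_bits(x) should agree bit for bit. Above that they can differ by one unit in the last place: i2f_sketch(16777219) gives 0x4B800001 (truncated to 2^24 + 2), whereas round-to-nearest-even gives 0x4B800002 (2^24 + 4).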