On 06/21/2012 08:30 AM, Olivier Galibert wrote:
> On Wed, Jun 20, 2012 at 01:44:14PM +0100, Roland Scheidegger wrote:
>> I just glossed over a lot of the code, but it seems to look ok other than
>> the (performance) implications this might have.
> Actually, whether there's a performance implication is not obvious. In
> practice the code just kicks the 4-pixel loop one or two function
> calls higher. This unshares some tests, some function calls, and the
> mip-size computation shifts. For normal texturing and on x86 the
> tests are correctly predicted after the first one, and so are the
> function calls, giving all of them a near-zero cost. So I'm not sure
> the cost is really measurable.
Well, I'm not sure either; that's why I'd have liked to see some numbers,
even if they're just from multiarb :-) It's not just about prediction
though; e.g. your indirect address lookups might be simpler when
unrolled, due to using immediates instead of an index register etc.
I can live without those numbers, though...
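
To illustrate the immediates point, here is a minimal sketch in C. It is
not taken from softpipe or llvmpipe: texel_offset, quad_offsets_looped and
quad_offsets_unrolled are hypothetical names, and the row-major layout with
one byte per texel is an assumption made only for the example.

```c
#include <stddef.h>

/* Hypothetical helper, not Mesa code: offset of pixel j of a quad.
 * If this is called from a loop and not inlined, j is a runtime value,
 * so x[j] and y[j] need indexed addressing (base + j * sizeof(int)). */
static size_t texel_offset(const int x[4], const int y[4],
                           size_t stride, unsigned j)
{
    return (size_t)y[j] * stride + (size_t)x[j];
}

/* Looped form: the pixel index is carried in a register. */
static void quad_offsets_looped(const int x[4], const int y[4],
                                size_t stride, size_t out[4])
{
    for (unsigned j = 0; j < 4; j++)
        out[j] = texel_offset(x, y, stride, j);
}

/* Unrolled form: every index is a compile-time constant, so each load
 * can be encoded with a fixed displacement from the base pointer. */
static void quad_offsets_unrolled(const int x[4], const int y[4],
                                  size_t stride, size_t out[4])
{
    out[0] = (size_t)y[0] * stride + (size_t)x[0];
    out[1] = (size_t)y[1] * stride + (size_t)x[1];
    out[2] = (size_t)y[2] * stride + (size_t)x[2];
    out[3] = (size_t)y[3] * stride + (size_t)x[3];
}
```

Both functions compute the same offsets. An optimizing compiler may well
inline and unroll the looped form itself, in which case the two compile to
the same code, so any difference would have to be measured.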
> With the actual vectorization the llvmpipe situation may be different
> (not so sure with the aos texturing though).
Yes, the additional lookups for the mip level base addresses are going to
hurt (slightly), and the min/mag filter selection is going to be a big
mess (neither really depends on aos/soa texturing). This is really going
to need a different path.
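
As a rough sketch of where the extra lookups come from (again not actual
Mesa code: struct tex, level_width, clamp_coord and both quad_offsets_*
functions are hypothetical, and a 1D texture with one byte per texel and
clamped coordinates is assumed), compare one shared level per quad with one
level per pixel:

```c
#include <stddef.h>

#define QUAD_SIZE  4
#define MAX_LEVELS 16

/* Hypothetical texture description, not a Mesa struct. */
struct tex {
    int    base_width;               /* width of level 0 in texels */
    size_t level_offset[MAX_LEVELS]; /* base offset of each mip level */
};

/* Mip-size computation: shift the base size, never go below 1. */
static int level_width(const struct tex *t, unsigned level)
{
    int w = t->base_width >> level;
    return w > 0 ? w : 1;
}

static int clamp_coord(int x, int w)
{
    if (x < 0)
        return 0;
    return x >= w ? w - 1 : x;
}

/* Shared level: one shift and one base lookup for the whole quad. */
static void quad_offsets_shared(const struct tex *t, unsigned level,
                                const int x[QUAD_SIZE],
                                size_t out[QUAD_SIZE])
{
    int w = level_width(t, level);
    size_t base = t->level_offset[level];

    for (int i = 0; i < QUAD_SIZE; i++)
        out[i] = base + (size_t)clamp_coord(x[i], w);
}

/* Per-pixel level: the shift and the base lookup move inside the loop,
 * and the base lookup becomes an indexed load per pixel. */
static void quad_offsets_per_pixel(const struct tex *t,
                                   const unsigned level[QUAD_SIZE],
                                   const int x[QUAD_SIZE],
                                   size_t out[QUAD_SIZE])
{
    for (int i = 0; i < QUAD_SIZE; i++) {
        int w = level_width(t, level[i]);
        out[i] = t->level_offset[level[i]] + (size_t)clamp_coord(x[i], w);
    }
}
```

When all four pixels pick the same level the two give identical results;
the per-pixel form just does the size shift and base lookup four times
instead of once. In a vectorized version those per-pixel base lookups
would presumably turn into gathers, which is the part I'd expect to cost
something.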
Roland
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev