Marek Olšák <mar...@gmail.com> writes:

> On Wed, Dec 12, 2012 at 5:06 PM, Paul Berry <stereotype...@gmail.com> wrote:
>> On 11 December 2012 23:49, Aras Pranckevicius <a...@unity3d.com> wrote:
>>> Not sure if relevant for Mesa, but e.g. on PowerVR SGX it's really bad to
>>> pack two vec2 texture coordinates into a single vec4. That's because var.xy
>>> texture read can be "prefetched", whereas var.zw texture read is not
>>> prefetched (essentially treated as a dependent texture read), and often
>>> causes stalls in the shader execution.
>>
>>
>> Interesting--I had not thought of that possibility. On i965 all texture
>> reads have to be done explicitly by the fragment shader (there is no
>> prefetching IIRC), so this penalty doesn't apply. Does anyone know if a
>> penalty like this exists in any of Mesa's other back-ends? If so that might
>> suggest some good experiments to try. I'm open to revising my opinion if
>> someone measures a significant performance degradation, particularly with a
>> real-world app.
>
> R300 and R400 support 4 texture indirections (as defined by
> ARB_fragment_program). Adding ALU instructions before the first TEX
> instruction increases the number of texture indirections by 1, which
> might make some shaders not be executable on the hardware at all.
>
> I think this optimization should be disabled on drivers where the
> texture indirection limit is too low.

And are swizzles of texcoords required to be separate MOVs beforehand (like on i915)?
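
To make the indirection concern concrete, here is a rough sketch (not Mesa code) of how a counter in the spirit of ARB_fragment_program's texture indirection rule might behave. The instruction tuples, the "tmp" register naming, and count_tex_indirections() are all made up for illustration, and the rule is simplified: a TEX starts a new indirection when its coordinate is a temporary already written in the current node, or when its destination temporary was already read or written there.

```python
# Simplified model of ARB_fragment_program-style texture indirection counting.
# All names here are hypothetical; this is not how any Mesa driver is written.

TEX_OPCODES = {"TEX", "TXP", "TXB"}


def is_temp(reg):
    # Assumption for this sketch: temporaries are the registers named "tmp*".
    return reg.startswith("tmp")


def count_tex_indirections(program):
    """program is a list of (opcode, dst, srcs) tuples with string registers."""
    indirections = 1      # a program starts in its first node
    written = set()       # temporaries written in the current node
    read = set()          # temporaries read in the current node
    for opcode, dst, srcs in program:
        if opcode in TEX_OPCODES:
            coord_dependent = any(s in written for s in srcs)
            dst_in_use = dst in written or dst in read
            if coord_dependent or dst_in_use:
                # This TEX depends on the current node, so a new node begins.
                indirections += 1
                written.clear()
                read.clear()
        read.update(s for s in srcs if is_temp(s))
        if is_temp(dst):
            written.add(dst)
    return indirections


# Texcoord sampled straight from the interpolated input.
direct = [
    ("TEX", "tmp0", ["texcoord0"]),
    ("MOV", "out", ["tmp0"]),
]

# Packed varying: the .zw half is first moved into a temporary by an ALU op.
packed = [
    ("MOV", "tmp1", ["texcoord0"]),
    ("TEX", "tmp0", ["tmp1"]),
    ("MOV", "out", ["tmp0"]),
]

print(count_tex_indirections(direct), count_tex_indirections(packed))
```

Under this model the extra MOV in front of the TEX costs one indirection, which is the effect Marek describes for a limit of 4.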
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev