Marek Olšák <mar...@gmail.com> writes:

> On Wed, Dec 12, 2012 at 5:06 PM, Paul Berry <stereotype...@gmail.com> wrote:
>> On 11 December 2012 23:49, Aras Pranckevicius <a...@unity3d.com> wrote:
>>> Not sure if relevant for Mesa, but e.g. on PowerVR SGX it's really bad to
>>> pack two vec2 texture coordinates into a single vec4. That's because var.xy
>>> texture read can be "prefetched", whereas var.zw texture read is not
>>> prefetched (essentially treated as a dependent texture read), and often
>>> causes stalls in the shader execution.
>>
>>
>> Interesting--I had not thought of that possibility.  On i965 all texture
>> reads have to be done explicitly by the fragment shader (there is no
>> prefetching IIRC), so this penalty doesn't apply.  Does anyone know if a
>> penalty like this exists in any of Mesa's other back-ends?  If so that might
>> suggest some good experiments to try.  I'm open to revising my opinion if
>> someone measures a significant performance degradation, particularly with a
>> real-world app.
>
> R300 and R400 support 4 texture indirections (as defined by
> ARB_fragment_program). Adding ALU instructions before the first TEX
> instruction increases the number of texture indirections by 1, which
> might make some shaders impossible to run on the hardware at all.
>
> I think this optimization should be disabled on drivers where the
> texture indirection limit is too low.

And are swizzles of texcoords required to be separate MOVs beforehand
(like on i915)?
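
To make the indirection concern quoted above concrete, here is a rough
sketch in Python of how a texture-indirection count can grow when a packed
texcoord has to be unpacked by an ALU instruction before the TEX that uses
it. It is loosely modelled on the counting rule described in
ARB_fragment_program (a program starts at 1 indirection, and a TEX whose
coordinate is a temporary already written in the current node starts a new
one). The instruction encoding, the register naming ("r*" for temporaries,
"v*" for varyings), and the names Inst and count_tex_indirections are all
made up for illustration; this is not the code of any driver.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Inst:
    """Hypothetical, heavily simplified shader instruction."""
    op: str                  # "TEX" or any ALU opcode name such as "MOV"
    dst: Optional[str]       # destination register, or None
    srcs: Tuple[str, ...]    # source registers; for TEX, srcs[0] is the coord


def is_temp(reg: Optional[str]) -> bool:
    # Assumption for this sketch: temporaries are named "r0", "r1", ...
    return reg is not None and reg.startswith("r")


def count_tex_indirections(prog) -> int:
    indirections = 1         # a program with no dependent reads counts as 1
    written = set()          # temps written in the current node (ALU or TEX)
    alu_touched = set()      # temps read or written by ALU in the current node
    for inst in prog:
        if inst.op == "TEX":
            coord = inst.srcs[0]
            dependent = is_temp(coord) and coord in written
            clobbers = is_temp(inst.dst) and inst.dst in alu_touched
            if dependent or clobbers:
                # Start a new node and forget what the old one touched.
                indirections += 1
                written.clear()
                alu_touched.clear()
        else:
            alu_touched.update(s for s in inst.srcs if is_temp(s))
            if is_temp(inst.dst):
                alu_touched.add(inst.dst)
        if is_temp(inst.dst):
            written.add(inst.dst)
    return indirections


# Two separate vec2 varyings: both TEX read their coordinate directly.
unpacked = [
    Inst("TEX", "r0", ("v0",)),
    Inst("TEX", "r1", ("v1",)),
]

# Both texcoords packed into v0: the .zw half is first moved into a temp.
packed = [
    Inst("MOV", "r2", ("v0",)),    # stands in for MOV r2, v0.zwzw
    Inst("TEX", "r0", ("v0",)),
    Inst("TEX", "r1", ("r2",)),    # coordinate was written by the MOV
]

print(count_tex_indirections(unpacked))  # 1
print(count_tex_indirections(packed))    # 2
```

Under these assumptions the packed variant spends one of the four available
indirections on the unpacking MOV alone, which is the kind of cost that would
matter on hardware with a low limit.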

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev