Re: [Mesa-dev] [PATCH] i965/fs: Improve accuracy of dFdy() to match dFdx().

Eric Anholt Wed, 02 Oct 2013 12:20:13 -0700

Paul Berry <stereotype...@gmail.com> writes:

> Previously, we computed dFdy() using the following instruction:
>
>   add(8) dst<1>F src<4,4,0>F -src.2<4,4,0>F { align1 1Q }
>
> That had the disadvantage that it computed the same value for all 4
> pixels of a 2x2 subspan, which meant that it was less accurate than
> dFdx().  This patch changes it to the following instruction when
> c->key.high_quality_derivatives is set:
>
>   add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }
>
> This gives it comparable accuracy to dFdx().
>
> Unfortunately, for some reason the SIMD16 version of this instruction:
>
>   add(16) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1H }
>
> doesn't seem to work reliably (presumably the hardware designers never
> validated the combination of align16 mode with compressed
> instructions), so we unroll it to:


From the gen4 PRM vol4, page 340:

    "A compressed instruction must be in Align1 access mode. Align16
     mode instructions cannot be compressed."

Other than updating the comment about compressed instructions to cite
the PRM, this is:

Reviewed-by: Eric Anholt <e...@anholt.net>

Thanks for figuring this out.
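
For anyone reading along, here is a rough C model of what the two add(8)
forms quoted above compute per channel. It is only a sketch, not driver
code: the function names are made up for illustration, and it assumes
each group of 4 channels holds one 2x2 subspan laid out as top-left,
top-right, bottom-left, bottom-right. The sign follows the instructions
as written and ignores any y-flip handling.

```c
#define SIMD_WIDTH 8   /* one SIMD8 instruction covers two 2x2 subspans */
#define SUBSPAN    4   /* channels per subspan (assumed layout: TL, TR, BL, BR) */

/* Models: add(8) dst<1>F src<4,4,0>F -src.2<4,4,0>F { align1 1Q }
 * The <4,4,0> region replicates one channel across each group of 4, so
 * every pixel in a subspan gets (channel 0 - channel 2). */
static void dfdy_coarse(const float *src, float *dst)
{
    for (int i = 0; i < SIMD_WIDTH; i++) {
        int base = (i / SUBSPAN) * SUBSPAN;
        dst[i] = src[base + 0] - src[base + 2];
    }
}

/* Models: add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }
 * The swizzles select components {0,1,0,1} and {2,3,2,3} within each
 * group of 4, so each column of the subspan gets its own difference. */
static void dfdy_fine(const float *src, float *dst)
{
    for (int i = 0; i < SIMD_WIDTH; i++) {
        int base = (i / SUBSPAN) * SUBSPAN;
        int col = i % 2;   /* 0 = left column, 1 = right column */
        dst[i] = src[base + col] - src[base + 2 + col];
    }
}
```

With a subspan holding {1, 2, 4, 8}, the coarse form writes -3 to all
four channels, while the fine form writes {-3, -6, -3, -6}, which is the
per-column behaviour the patch is after.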

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
